-
Notifications
You must be signed in to change notification settings - Fork 7.8k
Promote unknown encoding warning to ValueError in Mbstring extension #5317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems sensible to convert this to an Error. I might have gone on a small refactoring spree after seeing some stuff in here...
0aafb9b
to
7e7e4fe
Compare
--EXPECT-- | ||
*** Testing INI mbstring.internal_encoding: invalid encoding specified in INI *** | ||
UTF-8 | ||
BAD | ||
UTF-8 | ||
BAD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nikic not sure I understood your comment about mb_internal_encoding with an INI setting so I added this tests which well gives something kinda surprising :|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the exception just ends up being ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe? looking at the mb_internal_encoding_ini_basic2.phpt
test I'm not even sure that using mb_internal_encoding()
actually changes the value of the INI setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, mbstring.internal_encoding
and mb_internal_encoding()
use separate state :) There's default_charset
, internal_encoding
, mbstring.internal_encoding
and mb_internal_encoding()
that form a hierarchy, but are all separate. Go figure ;)
Looking at the code it looks like we just silently ignore incorrect encodings and instead assign UTF-8. Ugh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Funzies, maybe we should document this somewhere? Should I add a comment somewhere in mbstring or add a new file in the docs/ folder?
Promotes only the warnings where the encoding comes only from a string. Functions which accept an array of encodings will be fixed at a later stage. Closes GH-
7e7e4fe
to
c801e7c
Compare
This promotes and standardizes the various invalid/unknown encoding warnings to ValueErrors.
I needed to change a couple of the internal API to allow the pass through of the argument position.
I didn't touch the
Unsupported encoding
warnings which appear inmb_ord()
andmb_chr()
functions yet as I'm not sure what to do with them because I think it would make more sense to integrate them in the main check and dispatch theUnsupported encoding
consistently.