Skip to content

Promote unknown encoding warning to ValueError in Mbstring extension #5317

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

Girgias
Copy link
Member

@Girgias Girgias commented Mar 28, 2020

This promotes and standardizes the various invalid/unknown encoding warnings to ValueErrors.

I needed to change a couple of the internal API to allow the pass through of the argument position.

I didn't touch the Unsupported encoding warnings which appear in mb_ord() and mb_chr() functions yet as I'm not sure what to do with them because I think it would make more sense to integrate them in the main check and dispatch the Unsupported encoding consistently.

Copy link
Member

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems sensible to convert this to an Error. I might have gone on a small refactoring spree after seeing some stuff in here...

@Girgias Girgias force-pushed the mbstring-error-incorrect-encoding branch from 0aafb9b to 7e7e4fe Compare March 30, 2020 13:47
Comment on lines 22 to 30
--EXPECT--
*** Testing INI mbstring.internal_encoding: invalid encoding specified in INI ***
UTF-8
BAD
UTF-8
BAD
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nikic not sure I understood your comment about mb_internal_encoding with an INI setting so I added this tests which well gives something kinda surprising :|

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the exception just ends up being ignored?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? looking at the mb_internal_encoding_ini_basic2.phpt test I'm not even sure that using mb_internal_encoding() actually changes the value of the INI setting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, mbstring.internal_encoding and mb_internal_encoding() use separate state :) There's default_charset, internal_encoding, mbstring.internal_encoding and mb_internal_encoding() that form a hierarchy, but are all separate. Go figure ;)

Looking at the code it looks like we just silently ignore incorrect encodings and instead assign UTF-8. Ugh.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Funzies, maybe we should document this somewhere? Should I add a comment somewhere in mbstring or add a new file in the docs/ folder?

Promotes only the warnings where the encoding comes only from a string.
Functions which accept an array of encodings will be fixed at a later stage.

Closes GH-
@Girgias Girgias force-pushed the mbstring-error-incorrect-encoding branch from 7e7e4fe to c801e7c Compare March 31, 2020 12:57
@php-pulls php-pulls closed this in 90eeca2 Mar 31, 2020
@Girgias Girgias deleted the mbstring-error-incorrect-encoding branch March 31, 2020 14:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants