Skip to content

Check parameters on compact() and throw TypeError if not string or array of strings #6921

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

dwgebler
Copy link
Contributor

@dwgebler dwgebler commented Apr 28, 2021

compact() is documented (https://www.php.net/manual/en/function.compact) as a variadic function accepting parameters which are strings or arrays of strings referencing defined symbols.

In actuality, passing nonsense parameters e.g. compact(true, 42) merely returns an empty array. I propose throwing a TypeError in these cases, to prevent silent bugs.

Copy link
Member

@Girgias Girgias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this can land as is in PHP 8.1, however an E_WARNING which will be promoted in PHP 9 will for sure be possible.

Might need to contact the internals list about this

@dwgebler
Copy link
Contributor Author

I don't think this can land as is in PHP 8.1, however an E_WARNING which will be promoted in PHP 9 will for sure be possible.

Might need to contact the internals list about this

I'd rather it was a TypeError straight up in 8.1 if no one has any significant objection to that. But of course if there's a set policy for change progression in PHP releases whereby it's as simple as we don't ever do this, I am more than happy to change it to a good old fashioned E_WARNING.

@guilliamxavier
Copy link
Contributor

guilliamxavier commented Apr 28, 2021

Shouldn't it error only with strict_types=1? like e.g. https://3v4l.org/AMlZX vs https://3v4l.org/K3cmC

Also I wonder about implicit coercion (with strict_types=0 i.e. by default), e.g. compact(true, 42) vs compact("1", "42") https://3v4l.org/5uQcK 🤔

@Girgias
Copy link
Member

Girgias commented Apr 28, 2021

Shouldn't it error only with strict_types=1? like e.g. https://3v4l.org/AMlZX vs https://3v4l.org/K3cmC

Also I wonder about implicit coercion (with strict_types=0 i.e. by default), e.g. compact(true, 42) vs compact("1", "42") https://3v4l.org/5uQcK thinking

We don't really have the internal tooling to do that, there is the possibility of using ZPP for a string|array union but the values of the array would be strict type checked all the time anyway.

@dwgebler
Copy link
Contributor Author

dwgebler commented Apr 28, 2021 via email

@dwgebler dwgebler requested a review from Girgias April 28, 2021 16:30
Copy link
Member

@Girgias Girgias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks fine, but once again I don't know if a TypeError can just be introduced like that. Ideally this would have been caught for PHP 8.0.

@dwgebler
Copy link
Contributor Author

@Girgias cheers for the pointers, it's always good to learn more about the engine from more seasoned contributors, re: TypeError vs E_WARNING, I wonder if it's enough to hope @nikic or @Crell or any other significant voices are happy to just chime in on the PR? Somehow I doubt the thread about this one on the mailing list is going to turn in to a hotbed of discussion :)

@guilliamxavier
Copy link
Contributor

Shouldn't it error only with strict_types=1? like e.g. https://3v4l.org/AMlZX vs https://3v4l.org/K3cmC
Also I wonder about implicit coercion (with strict_types=0 i.e. by default), e.g. compact(true, 42) vs compact("1", "42") https://3v4l.org/5uQcK thinking

We don't really have the internal tooling to do that, there is the possibility of using ZPP for a string|array union but the values of the array would be strict type checked all the time anyway.

The parameters are not coerced by the implementation of compact (as your last 34vl link demonstrates) - the nature of the problem is compact's documented parameter and behaviour is (...array<string>|string) but in reality it's (...mixed) - and it's the weirdness highlighted by your example that my change will prevent with a TypeError :)

So there's no way to make the "reality" match the "documented"? 😢

@dwgebler
Copy link
Contributor Author

So there's no way to make the "reality" match the "documented"? 😢

Not until we get typed arrays 😄 but at least throwing some explicit error prevents silent bugs - the reason I thought to do this one is because there was an occasion recently, it was one of those things you spend ages trying to hunt down a seemingly obscure bug in your code, only to eventually slap yourself when you realise it was something so obvious, yet so easy to do and so hard to spot. In my case, I had absent-mindedly specified the variables themselves as parameters to compact instead of their names.

@Girgias
Copy link
Member

Girgias commented Apr 28, 2021

Shouldn't it error only with strict_types=1? like e.g. https://3v4l.org/AMlZX vs https://3v4l.org/K3cmC
Also I wonder about implicit coercion (with strict_types=0 i.e. by default), e.g. compact(true, 42) vs compact("1", "42") https://3v4l.org/5uQcK thinking

We don't really have the internal tooling to do that, there is the possibility of using ZPP for a string|array union but the values of the array would be strict type checked all the time anyway.

The parameters are not coerced by the implementation of compact (as your last 34vl link demonstrates) - the nature of the problem is compact's documented parameter and behaviour is (...array<string>|string) but in reality it's (...mixed) - and it's the weirdness highlighted by your example that my change will prevent with a TypeError :)

So there's no way to make the "reality" match the "documented"? cry

There always is a way, it's just extremely clunky and last time I thought about doing that @nikic was far from convinced.

The main issue is that variadics don't get parsed individually by ZPP it's just an array of zvals, and for the rare cases where variadics are used with scalars might as well be a bit stricter.
(the trick is to check ZEND_ARG_USES_STRICT_TYPES() if in a PHP_FUNCTION or ZEND_CALL_USES_STRICT_TYPES(EG(current_execute_data)) more generally, buuuuut it's ugly and you would need to keep up with all the changes to coercive typing mode)

@guilliamxavier
Copy link
Contributor

I see, thanks for explaining. Anyway there are other issues like get_defined_vars() having inaccessible numeric string keys https://3v4l.org/a67Ka so whatever can be done here will be better than nothing ^^

@nikic
Copy link
Member

nikic commented Apr 30, 2021

There always is a way, it's just extremely clunky and last time I thought about doing that @nikic was far from convinced.

The main issue is that variadics don't get parsed individually by ZPP it's just an array of zvals, and for the rare cases where variadics are used with scalars might as well be a bit stricter.
(the trick is to check ZEND_ARG_USES_STRICT_TYPES() if in a PHP_FUNCTION or ZEND_CALL_USES_STRICT_TYPES(EG(current_execute_data)) more generally, buuuuut it's ugly and you would need to keep up with all the changes to coercive typing mode)

As these are actual function arguments (and not, say, the contents of an options array), it's okay to make them follow strict_types semantics. We just don't have ZPP support for this. You'd need to explicitly call zend_parse_arg_array_ht_or_str and then zend_wrong_parameter_error on error.

@dwgebler
Copy link
Contributor Author

As these are actual function arguments (and not, say, the contents of an options array), it's okay to make them follow strict_types semantics.

So @nikic are you suggesting we make that change in this PR, or happy with it as-is?

@nikic
Copy link
Member

nikic commented May 6, 2021

I'm not really sure to be honest. I think handling the arguments as array|string... would be the principled way to approach this -- PHP does allow you to do things like ${1} = 2, so it seems somewhat reasonable that compact() can do the same in weak typing mode. At the same time, it's very likely to be a programmer error, and doing it this way means that people using weak types won't get a nice error. Then again, they're using weak types...

@dwgebler
Copy link
Contributor Author

dwgebler commented May 6, 2021

I'm not really sure to be honest. I think handling the arguments as array|string... would be the principled way to approach this -- PHP does allow you to do things like ${1} = 2, so it seems somewhat reasonable that compact() can do the same in weak typing mode. At the same time, it's very likely to be a programmer error, and doing it this way means that people using weak types won't get a nice error. Then again, they're using weak types...

Hmm, yeah that would be an improvement, OTOH compact has never supported coercing its parameters anyway so it stands that right now, compact(1) is a programmer mistake or otherwise meaningless in all cases.

There is another argument, that something like compact("foo", "bar", [1, "baz"]) is legitimate, where the expectation of the programmer is that the returned array will contain keys ["foo", "bar", "baz"]. But I think this is very much an edge case and the probability of silent bugs from swallowing non-string values outweighs any BC edge cases in making the change worthwhile.

I'd push for the PR as it is to be accepted/merged and compact() to be revisited for PHP 9, but I defer to your expertise. @nikic

@ramsey ramsey added the Feature label May 7, 2021
@ramsey
Copy link
Member

ramsey commented May 7, 2021

Linking to mailing list thread for reference: https://externals.io/message/114229

@ramsey
Copy link
Member

ramsey commented May 8, 2021

Thanks, @dwgebler!

@Girgias, would you might re-reviewing?

@ramsey ramsey added this to the PHP 8.1 milestone May 8, 2021
@dwgebler
Copy link
Contributor Author

@ramsey are you able to retrigger the build? Updated the tests in accordance with the final comments and not sure why the build's failing, it's not related to these changes or tests as far as I can see?

} ZEND_HASH_FOREACH_END();
if (Z_REFCOUNTED_P(entry)) {
Z_UNPROTECT_RECURSION_P(entry);
}
} else {
php_error_docref(NULL, E_WARNING, "Argument #%d must be string or array of strings, %s given", pos, zend_zval_type_name(entry));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message is a bit imprecise when it comes to values in arrays, but I think it's good enough to go on...

@nikic nikic merged commit f513987 into php:master May 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants