Skip to content

random: Validate that the arrays do not contain extra elements when unserializing #9458

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 5, 2022

Conversation

TimWolla
Copy link
Member

This adds validation to the unserializing logic in ext/random to ensure the arrays contain exactly the expected members, not additional unknown members.

This ensures that an undetected serialization error is clear identifiable in the output.
@@ -203,6 +203,11 @@ static bool unserialize(php_random_status *status, HashTable *data)
php_random_status_state_mt19937 *s = status->state;
zval *t;

/* Verify the expected number of elements, this implicitly ensures that no additional elements are present. */
if (zend_hash_num_elements(data) != (MT_N + 2)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lenient validation logic was a life-saver in case of SplFixedArray, otherwise my changes would have caused breaking changes between PHP versions. If ext/random engines will have new state information in a future PHP version, then unserializing these classes will be unnecessary in a platform-agnostic way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, but on the contrary I believe that strict validation makes this kind of upgrade easier:

  1. I strongly doubt that the engines are ever going to change, as they are “tied” to some external specification. Mt19937 is Mt19937, the state is well-defined.
  2. If some additional fields are in fact necessary in a future version, then I can be fairly sure that any existing serialized data will not yet include those fields with garbage values, because it was rejected previously.
  3. Serialization compatibility would not really exist even when ignoring unknown fields: Older PHP versions are not able to make sense of those out of the box, thus possible interpreting the unserialized instance differently, introducing a behavioral change. This is especially important for the random engiens whose whole selling point is generating repeatable sequences

Instead I would treat incompatible serialization output between PHP x.y and PHP x.(y+1) as a regular bug. The solution would then be adding a patch to the older PHP version to accept and ignore the newly added fields, but only once it is clear what the semantics of the newly added fields are!

Copy link
Contributor

@zeriyoshi zeriyoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the detailed description!

@TimWolla TimWolla merged commit ddf7a5d into php:PHP-8.2 Sep 5, 2022
@TimWolla TimWolla deleted the random-unserialize-strict branch September 5, 2022 15:33
TimWolla added a commit that referenced this pull request Sep 5, 2022
* PHP-8.2:
  random: Validate that the arrays do not contain extra elements when unserializing (#9458)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants