Reduce memory usage when unserializing packed arrays of size 1 #7691

Open: 1 commit to merge into base branch master

TysonAndre
Contributor

Previously, PHP's unserialize() would unconditionally convert a packed array to
an associative (hashed) array. This avoided the problem of [0 => $v1, 'key' => $v2]
starting out as a packed array but becoming an associative array partway through,
which would change the raw zval pointer to the value for $v1.

  • This can only happen when there are at least two elements, though.
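
To illustrate why stable value pointers matter here, the sketch below serializes an array whose integer-keyed and string-keyed slots share one reference. The serialized form encodes the second slot as a back-reference (`R:`) to the first value, so the unserializer presumably has to remember where that first value lives while it is still filling the array. The variable names are illustrative only, and the snippet shows user-visible behavior rather than the engine internals.

```php
<?php
// Assumption for illustration: one value referenced from both an
// integer key and a string key of the same array.
$v = 1;
$original = [0 => &$v, 'key' => &$v];

// The second slot is written as a back-reference (R:) to the first value.
$serialized = serialize($original);
echo $serialized, "\n";

// After unserializing, both slots should still share one value.
$copy = unserialize($serialized);
$copy[0] = 2;            // write through the restored reference
var_dump($copy['key']);  // reads the same underlying value
```

With a single element there is no later key that could trigger the packed-to-hash conversion, which matches the observation above that the problem needs at least two elements.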

Additionally, reduce memory usage when calling __unserialize with arrays of
size 0 or 1
(e.g. for user-defined __unserialize implementations that store the passed-in array as a property)

I looked into repacking arrays of size 2+ after the unserializer no longer needed them, but ran into these issues:

  • (Probably) Worse cache locality of resulting mix of objects/arrays
  • Potential unsafety if unserialization handlers had access to the arrays
  • Longer time to unserialize overall, which would be worse for short-lived arrays

Benchmark

Before (PHP 8.2)

          bench_single_element(1000000): memory=417943120 bytes, create time 0.203, read time 0.043, free time 0.036, total time 0.282 result 499999500000
       bench_magic_unserialize(1000000): memory=482331728 bytes, create time 0.417, read time 0.114, free time 0.067, total time 0.598 result 499999500000

After (PHP 8.2)

          bench_single_element(1000000): memory=257943120 bytes, create time 0.156, read time 0.024, free time 0.027, total time 0.208 result 499999500000
       bench_magic_unserialize(1000000): memory=322331728 bytes, create time 0.368, read time 0.082, free time 0.047, total time 0.497 result 499999500000

Benchmark script:

```php
<?php
// Benchmark: unserialize $N single-element packed arrays.
function bench_single_element(int $N): void {
    $values = [];
    for ($i = 0; $i < $N; $i++) {
        $values[] = [$i];
    }
    $ser = serialize($values);
    $start_time = hrtime(true)/1e9;
    $start_memory = memory_get_usage();
    $copy = unserialize($ser);
    $end_time = hrtime(true)/1e9;
    $end_memory = memory_get_usage();

    // Read every element back to measure access time.
    $total = 0;
    foreach ($copy as [$v]) {
        $total += $v;
    }
    $read_time = hrtime(true)/1e9;
    unset($copy);

    $free_time = hrtime(true)/1e9;

    printf("%30s(%d): memory=%d bytes, create time %.3f, read time %.3f, free time %.3f, total time %.3f result %d\n", __FUNCTION__, $N, $end_memory - $start_memory, $end_time-$start_time, $read_time-$end_time, $free_time - $read_time, $free_time - $start_time, $total);
}

// Stores the array passed to __unserialize() directly as a property.
class ListWrapper {
    public function __construct(public array $list) { }
    public function __serialize(): array { return $this->list; }
    public function __unserialize(array $list): void { $this->list = $list; }
}

// Benchmark: unserialize $N objects whose __unserialize() receives a single-element array.
function bench_magic_unserialize(int $N): void {
    $values = [];
    for ($i = 0; $i < $N; $i++) {
        $values[] = new ListWrapper([$i]);
    }
    $ser = serialize($values);
    $start_time = hrtime(true)/1e9;
    $start_memory = memory_get_usage();
    $copy = unserialize($ser);
    $end_time = hrtime(true)/1e9;
    $end_memory = memory_get_usage();

    // Read every element back to measure access time.
    $total = 0;
    foreach ($copy as $v) {
        $total += $v->list[0];
    }
    $read_time = hrtime(true)/1e9;
    unset($copy);

    $free_time = hrtime(true)/1e9;

    printf("%30s(%d): memory=%d bytes, create time %.3f, read time %.3f, free time %.3f, total time %.3f result %d\n", __FUNCTION__, $N, $end_memory - $start_memory, $end_time-$start_time, $read_time-$end_time, $free_time - $read_time, $free_time - $start_time, $total);
}
ini_set('memory_limit', '2G');
$N = 1000000;
bench_single_element($N);
bench_magic_unserialize($N);
```
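
As a rough reading of the figures above, the following sketch derives the per-array memory saving from the reported Before/After numbers. The byte counts are copied from the benchmark output; the array layout and the variable names (`$memory`, `$savedPerArray`) are only illustrative.

```php
<?php
// Memory figures copied from the benchmark output above (bytes, N = 1000000).
$n = 1000000;
$memory = [
    'bench_single_element'    => ['before' => 417943120, 'after' => 257943120],
    'bench_magic_unserialize' => ['before' => 482331728, 'after' => 322331728],
];

$savedPerArray = [];
foreach ($memory as $name => $m) {
    $saved = $m['before'] - $m['after'];        // total bytes saved
    $savedPerArray[$name] = intdiv($saved, $n); // bytes saved per unserialized array
    printf("%s: %d bytes saved per array (%.1f%% less memory)\n",
        $name, $savedPerArray[$name], 100 * $saved / $m['before']);
}
```

Both benchmarks work out to 160 bytes saved per single-element array, roughly 38.3% and 33.2% of the original totals respectively.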
