@iluuu1994 iluuu1994 commented Dec 1, 2025

The aim of this PR is twofold:

  • Reduce the number of highly similar TMP|VAR handlers
  • Avoid ZVAL_DEREF in most of these cases

This is achieved by guaranteeing that all zend_compile_expr() calls, as well as all other compile calls with BP_VAR_R, will result in a TMP variable. This implies that the result will not contain an IS_INDIRECT or IS_REFERENCE value, which was mostly already the case, with two exceptions:

  • Calls to return-by-reference functions. Because return-by-reference functions are quite rare, this is solved by delegating the DEREF to the RETURN_BY_REF handler, which will examine the stack to check whether the caller expects a VAR or TMP to understand whether the DEREF is needed.

  • By-reference assignments, including both $a = &$b, as well as [&$a] = $b. When the result of these expressions is used in a BP_VAR_R context, it will be passed to a new ZEND_DEREF opcode beforehand. This is exceptionally rare.
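
To make the trade-off concrete, here is a minimal, self-contained sketch of the idea of unwrapping a reference once at the producer instead of checking for one in every consumer. It is a toy model, not engine code: `toy_val`, `toy_deref`, `toy_read_var`, `toy_read_tmp` and `toy_unwrap` are hypothetical names, and the struct does not reflect the real zval layout.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for engine types; NOT the real zval layout. */
typedef enum { TOY_LONG, TOY_REFERENCE } toy_type;

typedef struct toy_val {
    toy_type type;
    long lval;            /* valid when type == TOY_LONG */
    struct toy_val *ref;  /* valid when type == TOY_REFERENCE */
} toy_val;

/* Follow one level of reference wrapping, as a deref step would. */
const toy_val *toy_deref(const toy_val *v) {
    return v->type == TOY_REFERENCE ? v->ref : v;
}

/* Consumer for operands that may hold a reference (VAR-like):
 * it has to check and unwrap on every use. */
long toy_read_var(const toy_val *op) {
    op = toy_deref(op);
    assert(op->type == TOY_LONG);
    return op->lval;
}

/* Consumer for operands guaranteed to be plain values (TMP-like):
 * no reference check is needed, the producer upheld the invariant. */
long toy_read_tmp(const toy_val *op) {
    assert(op->type == TOY_LONG);
    return op->lval;
}

/* Producer-side unwrap: performed once, where a reference could originate. */
toy_val toy_unwrap(const toy_val *v) {
    return *toy_deref(v);
}
```

In this model, the two exceptions listed above correspond to the only producers that would need to call something like `toy_unwrap`; every other consumer can use the cheaper `toy_read_tmp` path.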

Preliminary testing shows a 1.1% wall-time improvement in Symfony Demo and roughly 0.5% in WordPress. Edit: Sadly, I can now only measure a 0.15% improvement for Symfony, but 0.8% for WordPress. Zend/bench.php improves by ~3% in my tests. There seems to be quite a bit of volatility involved, potentially related to binary layout. Regardless, I think this is unlikely to cause true slowdowns for code that doesn't use return-by-ref.

TODOs:

  • Verify this doesn't break important optimizations
  • Use zend_unwrap_reference(). I missed this function, I was looking for a macro.
  • Check why JIT i-count regresses. Edit: Symfony Demo now reduces jitted code by 0.35%, WordPress by 0.02%.
  • Try to replace DEREF with QM_ASSIGN.
  • Add checks to R/IS compile-paths, asserting no VARs are generated.

Works now, for some reason
github-actions bot commented Dec 3, 2025

AWS x86_64 (c7i.24xl)

| Attribute | Value |
|---|---|
| Environment | aws |
| Runner | host |
| Instance type | c7i.metal-24xl (dedicated) |
| Architecture | x86_64 |
| CPU | 48 cores |
| CPU settings | disabled deeper C-states, disabled turbo boost, disabled hyper-threading |
| RAM | 188 GB |
| Kernel | 6.1.158-178.288.amzn2023.x86_64 |
| OS | Amazon Linux 2023.9.20251117 |
| GCC | 14.2.1 |
| Time | 2025-12-02 23:39:44 UTC |

Laravel 12.2.0 demo app - 100 consecutive runs, 50 warmups, 100 requests (sec)

| PHP | Min | Max | Std dev | Rel std dev % | Mean | Mean diff % | Median | Median diff % | Skew | P-value | Memory |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PHP - baseline@64d8 | 0.45975 | 0.46849 | 0.00110 | 0.23% | 0.46671 | 0.00% | 0.46677 | 0.00% | -4.600 | 0.999 | 44.29 MB |
| PHP - vm-early-deref | 0.46401 | 0.46634 | 0.00045 | 0.10% | 0.46518 | -0.33% | 0.46514 | -0.35% | 0.387 | 0.000 | 44.20 MB |

Symfony 2.7.0 demo app - 100 consecutive runs, 50 warmups, 100 requests (sec)

| PHP | Min | Max | Std dev | Rel std dev % | Mean | Mean diff % | Median | Median diff % | Skew | P-value | Memory |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PHP - baseline@64d8 | 0.74017 | 0.74572 | 0.00125 | 0.17% | 0.74186 | 0.00% | 0.74164 | 0.00% | 1.222 | 0.999 | 40.49 MB |
| PHP - vm-early-deref | 0.73742 | 0.74991 | 0.00167 | 0.23% | 0.73960 | -0.30% | 0.73909 | -0.34% | 3.013 | 0.000 | 40.61 MB |

Wordpress 6.2 main page - 100 consecutive runs, 20 warmups, 20 requests (sec)

| PHP | Min | Max | Std dev | Rel std dev % | Mean | Mean diff % | Median | Median diff % | Skew | P-value | Memory |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PHP - baseline@64d8 | 0.57431 | 0.58907 | 0.00131 | 0.23% | 0.57939 | 0.00% | 0.57927 | 0.00% | 3.617 | 0.999 | 44.05 MB |
| PHP - vm-early-deref | 0.58042 | 0.59071 | 0.00140 | 0.24% | 0.58203 | 0.46% | 0.58177 | 0.43% | 4.732 | 0.000 | 44.04 MB |

bench.php - 100 consecutive runs, 10 warmups, 2 requests (sec)

| PHP | Min | Max | Std dev | Rel std dev % | Mean | Mean diff % | Median | Median diff % | Skew | P-value | Memory |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PHP - baseline@64d8 | 0.42478 | 0.43729 | 0.00296 | 0.69% | 0.42832 | 0.00% | 0.42732 | 0.00% | 1.726 | 0.999 | 26.98 MB |
| PHP - vm-early-deref | 0.42704 | 0.44168 | 0.00292 | 0.68% | 0.43045 | 0.50% | 0.42962 | 0.54% | 1.945 | 0.000 | 26.85 MB |

micro_bench.php - 50 consecutive runs, 5 warmups, 1 request (sec)

| PHP | Min | Max | Std dev | Rel std dev % | Mean | Mean diff % | Median | Median diff % | Skew | P-value | Memory |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PHP - baseline@64d8 | 1.29754 | 1.34894 | 0.00922 | 0.70% | 1.31365 | 0.00% | 1.31393 | 0.00% | 0.946 | 0.997 | 21.23 MB |
| PHP - vm-early-deref | 1.35848 | 1.38456 | 0.00620 | 0.45% | 1.36862 | 4.18% | 1.36759 | 4.08% | 0.682 | 0.000 | 21.10 MB |
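
For anyone cross-checking the "Mean diff %" column, the sketch below assumes it is the relative change of the branch mean against the baseline mean. That assumption reproduces the printed figures for all five benchmarks, but the benchmark tool's own formula is not shown in this thread. `rel_diff_percent` is a hypothetical helper name.

```c
#include <assert.h>

/* Relative change of `candidate` against `baseline`, in percent.
 * Positive values mean the candidate took longer (a slowdown). */
double rel_diff_percent(double baseline, double candidate) {
    assert(baseline > 0.0);
    return (candidate - baseline) / baseline * 100.0;
}
```

With the means reported above, `rel_diff_percent(0.46671, 0.46518)` comes out at roughly -0.33 for Laravel, and `rel_diff_percent(1.31365, 1.36862)` at roughly 4.18 for micro_bench.php. Because the values are timings in seconds, negative numbers are improvements and positive numbers are regressions.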

The op1=IS_VAR spec was pretty much identical, and there are most likely already
better optimizations for QM_ASSIGN in place.

    if ((t1 & (MAY_BE_ANY|MAY_BE_UNDEF)) == MAY_BE_ARRAY && MAY_BE_EMPTY_ONLY(t1)) {
        return false;
    }
    return true;
Member Author

YIELD_FROM from generators can throw if the generator is closed, even for empty arrays. So this optimization may be unsound.

Member

I would not mind doing the inverse - changing YIELD_FROM to not throw on empty array, though...

Member

Taking into account the "closed" generator, this is wrong.

@iluuu1994 iluuu1994 marked this pull request as ready for review December 4, 2025 21:14
@iluuu1994 iluuu1994 requested a review from dstogov as a code owner December 4, 2025 21:14
Member

@dstogov dstogov left a comment

I think this should work, but the effect is not great.
(I see a 30KB reduction in PHP code size and a very slight performance difference.)

The patch breaks Symfony demo with function JIT (probably because of missing changes in the FETCH_(DIM|OBJ)_FUNC_ARG handlers). This needs to be fixed, of course.

    MAKE_NOP(opline);
    ++(*opt_count);
    if (src->op1_type & (IS_VAR|IS_TMP_VAR)) {
        src->opcode = ZEND_FREE;
Member

Replacing QM_ASSIGN with FREE opens up the possibility of applying the same optimization more deeply.

Comment on lines +168 to +174
    case ZEND_YIELD_FROM: {
        uint32_t t1 = OP1_INFO();
        if ((t1 & (MAY_BE_ANY|MAY_BE_UNDEF)) == MAY_BE_ARRAY && MAY_BE_EMPTY_ONLY(t1)) {
            return false;
        }
        return true;
    }
Member

This allows the DCE pass to drop the YIELD_FROM instruction in some cases. I think it should never be dropped.

Member Author

I agree this is not a very useful optimization, though see ext/opcache/tests/opt/sccp_032.phpt where YIELD_FROM is already elided through SCCP. I tried to not break any of the existing optimizations.
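
For readers unfamiliar with the type-mask idiom in the snippet under discussion, here is a small standalone sketch of what the condition tests. The `TOY_*` bit values and the empty/non-empty array flags are illustrative assumptions, not the engine's real definitions, and `toy_is_empty_array_only` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy type-inference bits; names and values are illustrative only. */
#define TOY_MAY_BE_UNDEF          (1u << 0)
#define TOY_MAY_BE_NULL           (1u << 1)
#define TOY_MAY_BE_LONG           (1u << 2)
#define TOY_MAY_BE_STRING         (1u << 3)
#define TOY_MAY_BE_ARRAY          (1u << 4)
#define TOY_MAY_BE_OBJECT         (1u << 5)
#define TOY_MAY_BE_ANY            (TOY_MAY_BE_NULL | TOY_MAY_BE_LONG | \
                                   TOY_MAY_BE_STRING | TOY_MAY_BE_ARRAY | \
                                   TOY_MAY_BE_OBJECT)
/* Simplified refinement flags for arrays (assumption for this sketch). */
#define TOY_MAY_BE_ARRAY_EMPTY    (1u << 8)
#define TOY_MAY_BE_ARRAY_NONEMPTY (1u << 9)

/* True only when the inferred type set is exactly "array" (no other type,
 * not undefined) and the array can only be empty. */
bool toy_is_empty_array_only(uint32_t t1) {
    /* Masking with ANY|UNDEF and comparing to ARRAY rejects any extra type bit. */
    bool only_array =
        (t1 & (TOY_MAY_BE_ANY | TOY_MAY_BE_UNDEF)) == TOY_MAY_BE_ARRAY;
    bool only_empty =
        (t1 & (TOY_MAY_BE_ARRAY_EMPTY | TOY_MAY_BE_ARRAY_NONEMPTY))
            == TOY_MAY_BE_ARRAY_EMPTY;
    return only_array && only_empty;
}
```

The mask-and-compare form matters: testing `t1 & TOY_MAY_BE_ARRAY` alone would also accept "array or null". Note that this only models the type check itself. As raised in this thread, a passing check may still not make it safe to drop the instruction, since YIELD_FROM can throw when the generator is closed.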

Comment on lines +5958 to +5959
    if (!prev_ex || !prev_ex->func || !ZEND_USER_CODE(prev_ex->func->type)) {
        return;
Member

This means we won't unwrap references when called from internal functions.
This is probably OK, as it keeps the current behavior.

Member Author

The idea is that internal functions call zend_return_unwrap_ref() themselves. This is asserted in the VM, but should be documented accordingly.

Comment on lines +5967 to +5971
    if (do_opline->opcode != ZEND_DO_FCALL
     && do_opline->opcode != ZEND_DO_FCALL_BY_NAME
     && do_opline->opcode != ZEND_DO_ICALL
     && do_opline->opcode != ZEND_DO_UCALL) {
        return;
Member

We probably don't need ICALL here.
I'm not sure if we can perform calls through trampolines from some other handlers (I don't remember).

Member Author

It's needed when called from an internal function. This is untested, I'll add a test to zend_test.
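
To summarize the decision the two quoted snippets take part in, here is a rough standalone model of "look at the calling frame to decide whether the returned reference must be unwrapped". It is a sketch under stated assumptions, not the patch's code: `toy_frame`, its fields, the `TOY_*` enums and `toy_needs_unwrap` are all hypothetical, and the rule "unwrap when the caller expects a TMP result" is inferred from the PR description rather than taken from the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy opcodes and result kinds; NOT the engine's definitions. */
typedef enum {
    TOY_DO_FCALL, TOY_DO_FCALL_BY_NAME, TOY_DO_ICALL, TOY_DO_UCALL, TOY_OTHER
} toy_opcode;

typedef enum { TOY_RESULT_UNUSED, TOY_RESULT_TMP, TOY_RESULT_VAR } toy_result_type;

/* Simplified view of the calling frame (hypothetical layout). */
typedef struct toy_frame {
    bool is_user_code;           /* caller is user code, not an internal function */
    toy_opcode call_opcode;      /* opcode the caller used to perform the call */
    toy_result_type result_type; /* what the caller expects the result to be */
} toy_frame;

/* Decide whether a by-reference return value should be unwrapped. */
bool toy_needs_unwrap(const toy_frame *caller) {
    /* No caller, or an internal caller: leave the value alone. In the
     * discussion above, internal callers are expected to unwrap themselves. */
    if (caller == NULL || !caller->is_user_code) {
        return false;
    }
    /* Only plain call opcodes are considered. */
    switch (caller->call_opcode) {
        case TOY_DO_FCALL:
        case TOY_DO_FCALL_BY_NAME:
        case TOY_DO_ICALL:
        case TOY_DO_UCALL:
            break;
        default:
            return false;
    }
    /* A TMP result must not hold a reference; a VAR result may keep it. */
    return caller->result_type == TOY_RESULT_TMP;
}
```

The early returns mirror the shape of the quoted guards: anything the function cannot positively identify as a user-code call expecting a TMP is left untouched, which keeps the existing behavior for every other case.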
