VAR|TMP overhaul #20628
The aim of this PR is twofold:

- Reduce the number of highly similar TMP|VAR handlers
- Avoid ZVAL_DEREF in most of these cases

This is achieved by guaranteeing that all zend_compile_expr() calls, as well as all other compile calls with BP_VAR_R, will result in a TMP variable. This implies that the result will not contain an IS_INDIRECT or IS_REFERENCE value, which was mostly already the case, with two exceptions:

- Calls to return-by-reference functions. Because return-by-reference functions are quite rare, this is solved by delegating the DEREF to the RETURN_BY_REF handler, which will examine the stack to check whether the caller expects a VAR or a TMP, and therefore whether the DEREF is needed.
- By-reference assignments, including both $a = &$b and [&$a] = $b. When the result of these expressions is used in a BP_VAR_R context, it will be passed to a new ZEND_DEREF opcode beforehand. This is exceptionally rare.

Preliminary testing shows a 1.1% wall time improvement in Symfony Demo and roughly 0.5% in Wordpress.
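To make the return-by-reference case more concrete, here is a minimal sketch of the idea in plain C. It is a toy model, not php-src code: `toy_val`, `toy_deref`, `toy_return_by_ref` and the `TOY_RESULT_*` constants are all hypothetical names, and the real handler works on zvals and execute_data rather than on these structs. It only illustrates that the callee unwraps the reference when the caller expects a TMP and leaves it alone when the caller expects a VAR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy value: a plain integer or a reference to another value. */
typedef enum { TOY_INT, TOY_REF } toy_kind;

typedef struct toy_val {
    toy_kind kind;
    long num;               /* valid when kind == TOY_INT */
    struct toy_val *target; /* valid when kind == TOY_REF */
} toy_val;

/* Stand-ins for the caller's result operand type (IS_VAR / IS_TMP_VAR). */
typedef enum { TOY_RESULT_VAR, TOY_RESULT_TMP } toy_result_type;

/* Follow references until a non-reference value is reached,
 * similar in spirit to ZVAL_DEREF. */
const toy_val *toy_deref(const toy_val *v) {
    while (v->kind == TOY_REF) {
        assert(v->target != NULL);
        v = v->target;
    }
    return v;
}

/* Toy return-by-reference epilogue: the callee unwraps the reference only
 * when the caller's result operand is a TMP. A VAR result keeps it. */
toy_val toy_return_by_ref(const toy_val *retval, toy_result_type caller_wants) {
    if (caller_wants == TOY_RESULT_TMP) {
        return *toy_deref(retval);
    }
    return *retval;
}

/* Usage example: the same returned reference, seen by two kinds of caller. */
void toy_return_by_ref_demo(void) {
    toy_val x = { TOY_INT, 42, NULL };
    toy_val ref_to_x = { TOY_REF, 0, &x };

    toy_val as_tmp = toy_return_by_ref(&ref_to_x, TOY_RESULT_TMP);
    assert(as_tmp.kind == TOY_INT && as_tmp.num == 42);

    toy_val as_var = toy_return_by_ref(&ref_to_x, TOY_RESULT_VAR);
    assert(as_var.kind == TOY_REF && as_var.target == &x);
}
```

The point of putting the check on the callee side is that consumers of a TMP never need their own dereference step, which is what lets the TMP and VAR handler variants be merged.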
Works now, for some reason
Force-pushed from 2a3d9dd to 9ddff66.
AWS x86_64 (c7i.24xl)
Laravel 12.2.0 demo app - 100 consecutive runs, 50 warmups, 100 requests (sec)
Symfony 2.7.0 demo app - 100 consecutive runs, 50 warmups, 100 requests (sec)
Wordpress 6.2 main page - 100 consecutive runs, 20 warmups, 20 requests (sec)
bench.php - 100 consecutive runs, 10 warmups, 2 requests (sec)
micro_bench.php - 50 consecutive runs, 5 warmups, 1 request (sec)
The op1=IS_VAR spec was pretty much identical, and there are most likely already better optimizations for QM_ASSIGN in place.
Force-pushed from 0e52da0 to 4ad5861.
    if ((t1 & (MAY_BE_ANY|MAY_BE_UNDEF)) == MAY_BE_ARRAY && MAY_BE_EMPTY_ONLY(t1)) {
        return false;
    }
    return true;
YIELD_FROM from generators can throw if the generator is closed, even for empty arrays. So this optimization may be unsound.
I would not mind doing the inverse - changing YIELD_FROM to not throw on empty array, though...
Taking into account the "closed" generator, this is wrong.
dstogov left a comment
I think this should work, but the effect is not great.
(I see a 30KB reduction in PHP code size and a very slight performance difference.)
The patch breaks the Symfony demo with function JIT (probably because of missing changes in the FETCH_(DIM|OBJ)_FUNC_ARG handlers). This needs to be fixed, of course.
    MAKE_NOP(opline);
    ++(*opt_count);
    if (src->op1_type & (IS_VAR|IS_TMP_VAR)) {
        src->opcode = ZEND_FREE;
Replacing QM_ASSIGN with FREE opens the possibility of applying the same optimization more deeply.
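A rough sketch of why the rewrite can cascade, using a toy instruction list rather than real zend_op structures. Everything here is an assumption made for illustration: the `toy_*` names are hypothetical, the sketch assumes `opline` in the hunk is a FREE whose operand was produced by the QM_ASSIGN in `src`, and it assumes each temporary has exactly one consumer. Once a copy is turned into a FREE of its own source, that new FREE can be matched against the source's producer on the next pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified opcodes; not the real zend_op layout. */
typedef enum { TOY_NOP, TOY_QM_ASSIGN, TOY_FREE, TOY_OTHER } toy_opcode;

typedef struct {
    toy_opcode opcode;
    int op1;    /* temporary read by this op, or -1 if none */
    int result; /* temporary written by this op, or -1 if none */
} toy_op;

/* Index of the op that writes temporary `tmp` before `before`, or -1. */
int toy_find_producer(const toy_op *ops, int before, int tmp) {
    if (tmp < 0) {
        return -1;
    }
    for (int j = before - 1; j >= 0; j--) {
        if (ops[j].result == tmp) {
            return j;
        }
    }
    return -1;
}

/* Rewrite "T1 = QM_ASSIGN T0; FREE T1" into "FREE T0; NOP" and repeat
 * until nothing changes. Returns the number of rewrites performed. */
int toy_elide_freed_copies(toy_op *ops, int count) {
    assert(ops != NULL || count == 0);
    int rewrites = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        for (int i = 0; i < count; i++) {
            if (ops[i].opcode != TOY_FREE) {
                continue;
            }
            int j = toy_find_producer(ops, i, ops[i].op1);
            if (j < 0 || ops[j].opcode != TOY_QM_ASSIGN) {
                continue;
            }
            /* The FREE of the copy is no longer needed. */
            ops[i].opcode = TOY_NOP;
            ops[i].op1 = -1;
            /* Free the copy's source instead if it is a temporary;
             * otherwise the copy simply disappears. */
            ops[j].opcode = (ops[j].op1 >= 0) ? TOY_FREE : TOY_NOP;
            ops[j].result = -1;
            rewrites++;
            progress = true;
        }
    }
    return rewrites;
}
```

For a chain such as `T0 = OTHER; T1 = QM_ASSIGN T0; T2 = QM_ASSIGN T1; FREE T2`, the first pass rewrites the second copy into `FREE T1`, and the second pass then rewrites the first copy into `FREE T0`.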
    case ZEND_YIELD_FROM: {
        uint32_t t1 = OP1_INFO();
        if ((t1 & (MAY_BE_ANY|MAY_BE_UNDEF)) == MAY_BE_ARRAY && MAY_BE_EMPTY_ONLY(t1)) {
            return false;
        }
        return true;
    }
This allows the DCE pass to drop the YIELD_FROM instruction in some cases. I think it should never be dropped.
I agree this is not a very useful optimization, though see ext/opcache/tests/opt/sccp_032.phpt where YIELD_FROM is already elided through SCCP. I tried to not break any of the existing optimizations.
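For readers following the soundness discussion, the two positions can be put side by side in a small standalone sketch. The bit values and `toy_*` names below are hypothetical and do not match the real MAY_BE_* constants; the first function only mirrors the shape of the check in the hunk, and the second is the conservative alternative the review comments point toward, where YIELD_FROM is always treated as having side effects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type-mask bits; the real MAY_BE_* values in php-src differ. */
#define TOY_MAY_BE_UNDEF        (1u << 0)
#define TOY_MAY_BE_ARRAY        (1u << 1)
#define TOY_MAY_BE_OBJECT       (1u << 2)
#define TOY_MAY_BE_ANY          (TOY_MAY_BE_ARRAY | TOY_MAY_BE_OBJECT)
#define TOY_MAY_BE_ARRAY_EMPTY  (1u << 8)
#define TOY_MAY_BE_ARRAY_FILLED (1u << 9)

/* True when the operand, if it is an array, can only be an empty one. */
bool toy_empty_only(uint32_t t) {
    return (t & TOY_MAY_BE_ARRAY_EMPTY) && !(t & TOY_MAY_BE_ARRAY_FILLED);
}

/* Type-based rule, shaped like the hunk: an operand that is known to be
 * an empty array is reported as having no side effects. */
bool toy_yield_from_side_effects_by_type(uint32_t t1) {
    if ((t1 & (TOY_MAY_BE_ANY | TOY_MAY_BE_UNDEF)) == TOY_MAY_BE_ARRAY
            && toy_empty_only(t1)) {
        return false;
    }
    return true;
}

/* Conservative rule: the operand type says nothing about the state of the
 * enclosing generator, so the instruction is never considered removable. */
bool toy_yield_from_side_effects_conservative(uint32_t t1) {
    (void)t1;
    return true;
}

/* Usage example: the two rules disagree only for a known-empty array. */
void toy_yield_from_demo(void) {
    uint32_t empty_array = TOY_MAY_BE_ARRAY | TOY_MAY_BE_ARRAY_EMPTY;
    uint32_t generator   = TOY_MAY_BE_OBJECT;

    assert(!toy_yield_from_side_effects_by_type(empty_array));
    assert(toy_yield_from_side_effects_conservative(empty_array));
    assert(toy_yield_from_side_effects_by_type(generator));
    assert(toy_yield_from_side_effects_conservative(generator));
}
```

The only case where the two rules differ is the known-empty array, which is exactly the case the comments above dispute.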
    if (!prev_ex || !prev_ex->func || !ZEND_USER_CODE(prev_ex->func->type)) {
        return;
This means we won't unwrap references when called from internal functions.
This is probably OK, as it keeps the current behavior.
The idea is that internal functions call zend_return_unwrap_ref() themselves. This is asserted in the VM, but should be documented accordingly.
    if (do_opline->opcode != ZEND_DO_FCALL
     && do_opline->opcode != ZEND_DO_FCALL_BY_NAME
     && do_opline->opcode != ZEND_DO_ICALL
     && do_opline->opcode != ZEND_DO_UCALL) {
        return;
We probably don't need ICALL here.
I'm not sure if we can perform calls through trampolines from some other handlers (I don't remember).
It's needed when called from an internal function. This is untested, I'll add a test to zend_test.
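The two early returns discussed in these threads can be read as one guard chain. The sketch below is a standalone approximation with hypothetical `toy_*` types, not the actual helper from the patch. The first two checks follow the hunks quoted above; the final step, deciding based on whether the caller's result operand is a TMP, is an assumption taken from the PR description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the opcode found at the caller's opline. */
typedef enum {
    TOY_DO_FCALL,
    TOY_DO_FCALL_BY_NAME,
    TOY_DO_ICALL,
    TOY_DO_UCALL,
    TOY_OTHER_OPCODE
} toy_call_opcode;

/* Hypothetical, flattened view of the calling frame. */
typedef struct {
    bool has_func;               /* frame has a function attached */
    bool is_user_code;           /* that function is user (PHP) code */
    toy_call_opcode call_opcode; /* opcode that performed the call */
    bool result_is_tmp;          /* caller's result operand is a TMP */
} toy_caller_frame;

/* Decide whether a returned reference should be unwrapped for the caller. */
bool toy_should_unwrap_return(const toy_caller_frame *prev) {
    /* No caller frame, or an internal caller: leave the value alone.
     * Internal callers are expected to unwrap the reference themselves. */
    if (prev == NULL || !prev->has_func || !prev->is_user_code) {
        return false;
    }
    /* Only the plain call opcodes are handled. */
    switch (prev->call_opcode) {
        case TOY_DO_FCALL:
        case TOY_DO_FCALL_BY_NAME:
        case TOY_DO_ICALL:
        case TOY_DO_UCALL:
            break;
        default:
            return false;
    }
    /* Assumption: a TMP result means the caller wants a plain value. */
    return prev->result_is_tmp;
}

/* Usage example covering each guard. */
void toy_should_unwrap_demo(void) {
    toy_caller_frame user_tmp = { true, true, TOY_DO_FCALL, true };
    toy_caller_frame user_var = { true, true, TOY_DO_UCALL, false };
    toy_caller_frame internal = { true, false, TOY_DO_ICALL, true };
    toy_caller_frame other_op = { true, true, TOY_OTHER_OPCODE, true };

    assert(toy_should_unwrap_return(&user_tmp));
    assert(!toy_should_unwrap_return(&user_var));
    assert(!toy_should_unwrap_return(&internal));
    assert(!toy_should_unwrap_return(&other_op));
    assert(!toy_should_unwrap_return(NULL));
}
```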
Edit: Sadly, I can now only measure a 0.15% improvement for Symfony, but 0.8% for Wordpress. Zend/bench.php improves by ~3% in my tests. There seems to be quite a bit of volatility involved, potentially in relation to binary layout. Regardless, I think this is unlikely to cause true slowdowns for code that doesn't use return-by-ref.
TODOs:

- zend_unwrap_reference(). I missed this function, I was looking for a macro.
- DEREF with QM_ASSIGN.
- R/IS compile-paths, asserting no VARs are generated.