src: limit foreground tasks draining loop#19987
src: limit foreground tasks draining loop#19987ulan wants to merge 5 commits intonodejs:masterfrom ulan:foreground-task-flushing
Conversation
Foreground tasks that repost themselves can force the draining loop to run indefinitely long without giving other tasks chance to run. This limits the foreground task draining loop to run only the tasks that were in the tasks queue at the beginning of the loop. Fixes: #19937
src/node_platform.h
Outdated
| int unref(); | ||
|
|
||
| // Returns true iff work was dispatched or executed. | ||
| // New tasks that are posted during flushing of the queue are not run. |
There was a problem hiding this comment.
Tiny nit: Maybe be more specific here and say that they are not run as part of this iteration, rather than implying that they aren’t run at all?
|
hmmm... definitely +1 on this but should this be semver-major? |
|
There are some CI failures. it's not clear if they are unrelated. |
|
Not sure but there might be issues with Windows. |
|
Windows failures seem related. |
|
Thanks. I'll try to reproduce the windows failure and debug it. |
|
The windows failure should be fixed now. The problem was caused by the incorrect repost count in the new test. That resulted in one left-over task that tried to access Environment after its destruction. |
|
sequential/test-inspector-stop-profile-after-done is failing on ubuntu1710-x64 on all three CI jobs. Unfortunately it doesn't reproduce on my machine. I will have to find ubuntu 17.10. |
|
@ulan that is a known flake. It is independent from this PR and you can safely ignore it :-) |
|
Guess this is ready then? |
bnoordhuis
left a comment
There was a problem hiding this comment.
Thanks, LGTM. CI: https://ci.nodejs.org/job/node-test-pull-request/14369/
|
Ok, I know that |
|
@apapirovski yes, that seems strange. Could it be caused by a "unlucky" base revision? (I don't know if CI rebases the patch to the tip of tree for each run). |
|
After further inspection, I think it might be related because the test uses the sampling heap profiler, which internally relies on V8's allocation observer. Incremental marker also uses the allocation observer. There is probably some timing change that causing the test to fail more frequently. I will investigate. |
|
Sampling heap profiler theory was incorrect. The test is about CPU profiler. I found a bug in the initial patch: |
apapirovski
left a comment
There was a problem hiding this comment.
Thank you for following up on this.
src/node_platform.h
Outdated
| // Returns true iff work was dispatched or executed. | ||
| // New tasks that are posted during flushing of the queue are postponed until | ||
| // the next flushing. | ||
| // Returns true iff work was dispatched or executed. New tasks that are |
There was a problem hiding this comment.
Yeah but I know the meaning and I didn't even think of it until you mentioned it... but maybe that just reflects badly on me... 🤔 😆
There was a problem hiding this comment.
I guess what I'm saying: I don't think this is the place for iff. It doesn't make it clearer.
But I'm not going to object if someone prefers this. It's a low priority thing for me.
There was a problem hiding this comment.
Done. The iff was there before my change. I agree that if seems slightly better.
src/node_platform.h
Outdated
| v8::TracingController* GetTracingController() override; | ||
|
|
||
| void FlushForegroundTasks(v8::Isolate* isolate); | ||
| // Returns true iff work was dispatched or executed. New tasks that are |
|
There are two windows failures in the recent CI job: Both failures seem to me unrelated to the PR. |
|
CI https://ci.nodejs.org/job/node-test-pull-request/14440/ @ulan there are often some flakes. |
Foreground tasks that repost themselves can force the draining loop to run indefinitely long without giving other tasks chance to run. This limits the foreground task draining loop to run only the tasks that were in the tasks queue at the beginning of the loop. PR-URL: #19987 Fixes: #19937 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Yang Guo <yangguo@chromium.org> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com> Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Anatoli Papirovski <apapirovski@mac.com> Reviewed-By: Khaidi Chu <i@2333.moe>
|
@apapirovski thanks for landing! |
| running_nested_loop_ = true; | ||
| while (!terminated_ && channel_->waitForFrontendMessage()) { | ||
| platform_->FlushForegroundTasks(env_->isolate()); | ||
| while (platform_->FlushForegroundTasks(env_->isolate())) {} |
There was a problem hiding this comment.
Doesn't this line preserve the bug? Because, even though you swap the queue before draining it, you're still running this loop forever if new tasks are constantly added to the original queue
There was a problem hiding this comment.
The inspector posts foreground tasks and requires that they are all processed before going to the outer loop.
AFAIK this code runs only when the inspector is active and the program is paused. The normal libuv tasks are not processed here. Latency shouldn't be an issue.
There was a problem hiding this comment.
My understanding is that the original bug this is trying to fix (#19937) is that there are cases where foreground tasks can add additional tasks to the queue. The bug was fixed by freezing the queue inside of FlushForegroundTasks, but this line of code I'm commenting on appears to loop THAT CALL so that the freeze fix doesn't actually help anything. It still will run forever, if foreground tasks add themselves.
Unless there's more than one place where FlushForegroundTasks is being called, and that's not an issue for this line?
There was a problem hiding this comment.
There is certainly more than one place. This particular line addresses a very specific interaction.
Foreground tasks that repost themselves can force the draining loop to run indefinitely long without giving other tasks chance to run. This limits the foreground task draining loop to run only the tasks that were in the tasks queue at the beginning of the loop. PR-URL: #19987 Fixes: #19937 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Yang Guo <yangguo@chromium.org> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com> Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Anatoli Papirovski <apapirovski@mac.com> Reviewed-By: Khaidi Chu <i@2333.moe>
Foreground tasks that repost themselves can force the draining loop to run indefinitely long without giving other tasks chance to run. This limits the foreground task draining loop to run only the tasks that were in the tasks queue at the beginning of the loop. PR-URL: nodejs#19987 Fixes: nodejs#19937 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Yang Guo <yangguo@chromium.org> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com> Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Anatoli Papirovski <apapirovski@mac.com> Reviewed-By: Khaidi Chu <i@2333.moe>
Foreground tasks that repost themselves can force the draining loop
to run indefinitely long without giving other tasks chance to run.
This limits the foreground task draining loop to run only the tasks
that were in the tasks queue at the beginning of the loop.
Fixes: #19937
Checklist
make -j4 test(UNIX), orvcbuild test(Windows) passes