process,worker: fix process.exitCode handling for fatalException #21739
process,worker: fix process.exitCode handling for fatalException #21739lundibundi wants to merge 4 commits intonodejs:masterfrom
Conversation
lib/internal/bootstrap/node.js
Outdated
There was a problem hiding this comment.
If user code happens to set process.exitCode prior to this, this will override the user provided value. It should likely only set the value if it is not already set
There was a problem hiding this comment.
I do believe this is not the case, as uncaughtException has its own code 1, so it should be correct to override whatever was the code before. Though this is probably a semver-major because of this, but that shouldn't be a problem imo.
There was a problem hiding this comment.
hmm... I can definitely see the logic on it but I definitely don't like overriding user provided values unexpectedly. If we go with this, can I ask that a note be added to the documentation along with a code comment indicating why it's ok to silently override any user provided value here?
There was a problem hiding this comment.
That's fine with me. But where should I put a documentation?
Also, it is already noted that unhandledException has code 1, so this will basically enforce this, so that any previously set code will not be used, only the ones set in 'unhandledException' callback, 'exit' callback etc.
There was a problem hiding this comment.
But where should I put a documentation?
In the description for the uncaughtException event in docs/api/process.md. The documentation currently does not say anything about the process exit code. A note there that, should the event not be handled, the process will exit with exitCode = 1 even if the user had previously set a process.exitCode value.
There was a problem hiding this comment.
(btw, I see this as a limitation in the current documentation and not something that is introduced by this PR)
There was a problem hiding this comment.
even if the user had previously set a process.exitCode value.
Oh, I thought that's it's a given that if some new error happen exitCode will be replaced with the value of the error and just thought that it was undocumented. The fact that you wasn't able to change the code is indeed a strange thing. Anyway, I'll add the doc soon.
src/node.cc
Outdated
There was a problem hiding this comment.
nit: 4 space indent in C/C++ code
There was a problem hiding this comment.
Oops, thx. I thought linter would catch this 🤔.
There was a problem hiding this comment.
@lundibundi Sadly, our C++ linter is very outdated and not very actively maintained … we do have https://github.com/nodejs/node/blob/master/CPP_STYLE_GUIDE.md, if it helps?
src/node.cc
Outdated
There was a problem hiding this comment.
Hmm... this should likely only emit exit(1) if code is undefined. It should be legitimate for an uncaught exception handler to set process.exitCode = 0 and have that be the actual exit code.
|
Also, I kind of don't like code duplication in both tests (for process and worker), so will it be okay to extract test cases (child1,2,3 etc) into test/fixtures? |
9109a57 to
1acb425
Compare
src/node.cc
Outdated
There was a problem hiding this comment.
This crashes for weird edge cases like this:
process.on("exit", () => {
process.exitCode = { [Symbol.toPrimitive]() { throw new Error(); } };
})Maybe exit with 1 if code.IsEmpty() || !code->IsInt32() and use a direction conversion to v8::Int32 here, like this:
Line 2134 in 1f16758
Does that sound okay?
There was a problem hiding this comment.
Wow, that's pretty neat. Though, one question, is it safe to just Local<Value> code = process_object->Get(env->context(), exit_code).ToLocalChecked();? Or should I do it step by step (empty check, then convert and int32 check).
There was a problem hiding this comment.
@lundibundi That operation can still fail if userland code were to install a getter that throws … obviously not something it should do, but yes, I’d prefer to do it step-by-step.
(Generally, we’ll need to shift a lot of our error handling code away from using ToLocalChecked()/ToChecked(), because worker.terminate() can make just about anything throw that calls JS code…)
There was a problem hiding this comment.
Ok, thanks, I'll update it soon.
src/node.cc
Outdated
There was a problem hiding this comment.
I believe you when you say the test was failing before, but do you know why this was happening? Does it point to a larger issue?
There was a problem hiding this comment.
As I understand the issue here is that there is no synchronization between worker-exit and worker-fatalException-exit (links at the end of the first post) and as a result we have
- FatalException -> fatal callback called in js land -> emit exit from there
- worker undestands that it is finishing, exits its loop and calls
EmitExit
Therefore we get 2 calls to 'exit' listeners. I'm not sure if 'just muting' second exit event is a good solution, hence my note in the end.
I believe you when you say the test was failing before
I'm not sure what you mean? Also I forgot to add a description of a test case for the bug (it is here in the tests and I'll update the first post soon)
There was a problem hiding this comment.
to be honest ,you are really so konwledgable
There was a problem hiding this comment.
@lundibundi I see, that makes sense. I think I’d have a slight preference to handle this using a flag on the Environment object, since process._exiting can just be modified by userland code.
There was a problem hiding this comment.
Oh, that's surely better. I'll try to implement that, though there may be some problems as _exiting is actually set from _fatalException handler in js, so with _exiting it is set if no handler and **before** 'exit' callback, but with env it will be set if no handler and **after** all 'exit' callbacks due to the way fatalException is handled in js.
There was a problem hiding this comment.
@addaleax Well, as I was afraid of, worker finish coincides with FatalException (after FatalException calls js handler, worker understands that it has finished and starts its own sequence). So either we
- change the way they work with each other (no ideas here, add exitCode lock in env and block worker until FatalExeption finish?)
- change how _fatalException work (maybe split in pre that will check for uncaughtException handler and actually success and failure routes, though this looks kind of over the top and introduce additional cpp-js calls)
- maybe check for 'uncaughtException' listeners in the cpp-land beforehand if that's possible, but this one looks like a hack and not a solution
- or use
_exitingfor now
Obviously the latter looks okay, but this surely indicates the problem and may need additional research.
P.s: In order to investigate this I added a few logs in code (as Debug were not enough) and run a version on test-worker-uncaught-exception.js without asserts, here is the output
Fatal exception calling js handler
Worker exited main loop
Worker emit exit
EmitExit start
EmitExit emit 'exit'
on error received: foo
Exit callback called twice in worker
exited with 1
EmitExit start
EmitExit emit 'exit'
Aclually I added env->exiting and appropriate checks/sets in EmitExit, FatalException here but the order of execution prevents it from running correctly.
There was a problem hiding this comment.
@lundibundi I think there’s a bigger bug here – in test-worker-uncaught-exception.js, the worker does not exit because of the uncaught exception, but because there’s no work to do after it. :/
I think we want a process.exit() call at the end of the if (!caught) { block?
There was a problem hiding this comment.
@addaleax oh, that is indeed true, great catch thanks.
I think adding a Timeout? with assert.fail should be good enough to ensure that the worker exits. Though I'm not sure if timeout is a good idea (flakiness and such), is there any better way?
Yeah process.exit() seems to work just fine there.
There was a problem hiding this comment.
@lundibundi We could try to start some async operation before the uncaught throw and see whether it’s executed? That latter part might be tricky, but that would be the general idea, I think
@lundibundi You can do that if it makes the most sense to you, yes :) |
1acb425 to
6ce385b
Compare
src/node.cc
Outdated
There was a problem hiding this comment.
This is not really important, but I’m kinda hoping you’ll stick around after your currently open PRs, so as a tip: I personally find it slightly annoying to first have a MaybeLocal<Something> and then conditionally converting it to a Local<Value>.
What I tend to do then is something along these lines:
Local<Value> value;
if (!object->Get(context, key).ToLocal(&value) || !value->IsInt32())
exit(1);
exit(value.As<Int32>()->Value());I know it’s not an exact match to this situation but I hope it’s clear enough what I mean. :)
There was a problem hiding this comment.
Well, MaybeLocal looks kind of limited. Though I like this neat way of using sce.
src/node.cc
Outdated
There was a problem hiding this comment.
@lundibundi I see, that makes sense. I think I’d have a slight preference to handle this using a flag on the Environment object, since process._exiting can just be modified by userland code.
6ce385b to
773c9d8
Compare
773c9d8 to
5eb1a23
Compare
|
@addaleax I've cleaned up the commits. I've left |
addaleax
left a comment
There was a problem hiding this comment.
Really digging what you put together here. Thank you a lot!
Would you be interested in joining the @nodejs/workers team? It doesn’t come with any responsibilities, it’s just people who get notified for issues/PRs related to Worker code. :)
* set process.exitCode before calling 'exit' handlers so that there will not be a situation where process.exitCode !== code in 'exit' callback during uncaughtException handling * don't ignore process.exitCode set in 'exit' callback when failed with uncaughtException and there is no uncaughtException listener
Now we set it before the exit event, this allows to change the code inside the exit event (event with uncaughtException), therefore setting exitCode in worker is no longer needed.
Previously even after uncaught exception the worker would continue to execute until there is no more work to do.
5eb1a23 to
6a993f1
Compare
|
@addaleax Thanks for your support 😃. |
|
Landed in 998f9ff...7c2925e, thanks for the work! 🎉 |
* set process.exitCode before calling 'exit' handlers so that there will not be a situation where process.exitCode !== code in 'exit' callback during uncaughtException handling * don't ignore process.exitCode set in 'exit' callback when failed with uncaughtException and there is no uncaughtException listener PR-URL: #21739 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
PR-URL: #21739 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Previously even after uncaught exception the worker would continue to execute until there is no more work to do. PR-URL: #21739 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
* set process.exitCode before calling 'exit' handlers so that there will not be a situation where process.exitCode !== code in 'exit' callback during uncaughtException handling * don't ignore process.exitCode set in 'exit' callback when failed with uncaughtException and there is no uncaughtException listener PR-URL: #21739 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
PR-URL: #21739 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Previously even after uncaught exception the worker would continue to execute until there is no more work to do. PR-URL: #21739 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
set process.exitCode before calling 'exit' handlers so that there will
not be a situation where process.exitCode !== code in 'exit' callback
during uncaughtException handling
don't ignore process.exitCode set in 'exit' callback when failed with
uncaughtException and there is no uncaughtException listener
fix duplicate call of 'exit' callbacks in case of uncaught exception in worker
Checklist
make -j4 test(UNIX) passesThis PR depends on #21713 for workers test.I've found a bug in workers where in case on uncaughtException 'exit' event is actually called twice. I think this is due to both, having
_fatalExceptioncalled presumably viaFatalException()and usual exit from worker as I understand here which results in 2 calls to 'exit' callbacks.I have fixed it with second commit, but I'm not sure if it's a correct fix, so awaiting feedback.
Edit: The case for the bug is when worker exits with unhandled exception and there is an 'exit' event listener and no 'unhandledException' listeners. This way worker's local 'exit' callbacks will be called twice. (see test-worker-uncaught-exception.js). Current node 10.6.0 always fails on this.
/cc @addaleax