-
Notifications
You must be signed in to change notification settings - Fork 535
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Questions on API contract regarding onComplete/onError #52
Comments
Yes. The publisher can even call one of those methods instead of
This is a good question. The original idea was that |
Even if In the Rx guidelines, section 6.4 says "protect calls to user code from within an operator" - this means that if you pass callbacks triggering exceptions to one of those combinators, then the exception is piped to I have yet to understand the weird effects they are speaking about when protecting |
The current spec mandates that the Subscriber cancels the subscription if it fails, which works in a local (non-distributed) context, and it says that failures within the Subscriber shall not be thrown from callbacks (we will probably qualify this to exclude fatal errors that should terminate the VM). Your quote from the Rx docs is probably about catching exceptions thrown from We’ll keep this issue open until after the current changes are done, so that the end result will clarify this point. |
Whatever the "current spec" is I didn't found anything like this in all the versions I looked through :), so we really shouldn't forget about that in the next version. |
You are right, it was only mentioned in discussions so far, thanks for the correction! |
If an There are a few places in RxJava where these are caught and handled, and I imagine any reactive-stream implementation will have to do similar things, otherwise "weird effects" can occur, such as Here are places where error handling for "out of contract" errors are being handled if you're interested:
And there are some exceptions that we want to throw: https://github.com/Netflix/RxJava/blob/master/rxjava-core/src/main/java/rx/exceptions/Exceptions.java#L44 |
In the updated contract (proposed at #55) it has these about errors:
We do not however have anything telling Perhaps we should add something like:
An implementation however will still have to assume exceptions can and will be thrown, because bugs and JVM errors happen. |
The VM can of course throw Proposal:
|
Maybe it should read "fatal exceptions include" rather than "are"? There might be other |
The idea is that this list is exhaustive: in Scala and Akka we have been using (and refining) it for a couple of years now, the status quo is here. VM Errors are those you are referring to. We used to exclude StackOverflowError because intuitively it should be recoverable at a stack frame “higher up”, but it turns out that this error may lead to |
I understand the intent of making an exhaustive list but I go back to my original cross-language/cross-platform argument in #46. We can't know all the kinds of errors considered "fatal" in all possible implementations of Reactive Streams so it seems like we should be giving guidance on what we consider fatal on one specific platform. In JavaScript, an Error is an Error is an Error so it doesn't make a lot of sense to try and segregate errors by type. IMO it's sufficient to use the term "fatal" to mean different things to different implementors with the intent being made clear by the examples of JVM errors given. |
So, more accurately, it might read:
|
|
Ah, of course: what I meant was JVM-specific (and exhaustive in that case), specifications for other platforms will fill this hole in their own fashion. I was assuming JVM for everything we currently do, since we use Java interfaces and language semantics to describe it. |
Those 4 should not be considered exhaustive even for the JVM. StackOverflow is fatal, OOM generally is and different libraries may have others they need to throw. For example, RxJava has OnErrorNotImplemented if the user doesn't provide an error handler and in that case all that can be done is treat it as a fatal error (and user bug). |
SO and OOME are subclasses of VMError, so from that side we got it covered. And LinkageError covers all the usual class loading errors. If it is an error to not implement |
Yup, good point :-) I obviously don't look at the inheritance path for VMError very often!
It is required if someone is implementing the class directly, as it's a method on the interface. However, in Rx where there are multiple ways of handling errors, it's legit to write code like this: source.onErrorResumeNext(function).subscribe(onNextHandler); In that case, if they didn't include the // this will blow up with OnErrorNotImplemented if an onError occurs
source.subscribe(onNextHandler);
// this correctly handles the error
source.subscribe(onNextHandler, errorHandler); Thus, in RxJava it is possible to write code that results in a fatal Also, a library is very tedious and error-prone if every public void onNext(T t) {
try {
doStuff(t);
}catch(Throwable e) {
if(isFatal(e)) {
throw e;
} else {
try {
onError(e);
} catch(Throwable e) {
// all we can do is throw since onError blew up
throw e;
}
}
}
} Thus, for practical reasons libraries will typically want to handle errors when thrown from user code. Overall the library would generally comply with the spec, but any given If an What would you recommend a library implementation do if the |
"We can't specify away bugs like this." We can specify that the behavior is undefined for implementations that violate it. So, "You're free to call System.exit(1), throw new ThreadDeath() and , but you can't expect anything to function properly afterward" |
You didn't answer my question :-) What would you recommend a library implementation do if the |
@benjchristensen I like fail-fast approaches so I'd recommend doing that. What do you think about that? |
A user making a mistake should not cause an entire system to stop functioning properly if it's within the capability of the library to be resilient. I suppose however that a library implementation can handle that without it being part of the spec, but if the spec is draconian and not representing reality it would feel like a library is going against the spec to handle errors.
Agreed. That is why I'm recommending the spec allow use of a fatal exception defined by the library. @jbrisbin's wording keeps the door open for out-of-contract scenarios that need to fast-fail:
In practice, libraries will need to assume exceptions being thrown because users are not going to write bug-free code and we want resilient systems. We could incorporate that reality via additions like this:
|
@alexandru "If an error is thrown from an onNext, onComplete or onSubscribe the Publisher SHOULD propagate it to onError." If the Subscriber has violated the specification by throwing an exception where none is allowed, one cannot assume that the Subscriber is in a consistent state anymore, so the only course of action is to log the throwable (if non-fatal) and either A) exit the Publisher itself or if sensible B) dropping the Subscription to the Subscriber and continue or if fatal, exit the program. |
I disagree, as 'onError' is how users expect to receive errors and most issues are simple bugs like NPEs, but this is not a big deal to debate. Libraries will choose to do what they wish in out-of-contract scenarios. |
Calling onError (again) after onError fails would be a spec violation Thoughts?
|
And that single way is blowing up? What is the prescribed way of blowing up? |
Do we agree that calling onError after onError or onComplete throws a If logging it is the recommendation, and the logging lib throws an My 2c,
|
Another big benefit of this approach is that it is symmetric no matter if
|
Thank you for the clarifications.
I have this world view that exceptions are either return types waiting to be documented in a type (e.g. |
Calling The only time an exception should be thrown is when
I don't like logging as that means we are making a decision on behalf of the user that impacts IO. This kind of logging brings a system to its knees at the very worst time - when things are failing.
No, this is not desirable as we are causing IO on behalf of the user, and a synchronized IO at that. Also, stack traces are not always very helpful in an async system as it can point to one piece of a stream on a thread but the user may have no clue what the final subscriber was, this is why it's so helpful to catch errors and propagate to the final
@viktorklang Can you elaborate on what you mean here?
Exception throwing does not semantically behave the same in these two cases (from the perspective of the user/developer/application) so being symmetric in how errors are handled is not helpful. If it was symmetric in behavior we wouldn't have the
Agreed, but through 2 years of feedback from people filing bugs on Github and my own production experience in handling errors (and losing them) in async streams, leaving this to naturally occur in an async system leads to bad results like hung user requests, infinite loops, silent swallowing of errors etc. Letting exceptions throw in an async system is generally far worse than catching and routing them via the
Because of this, a library implementation will generally want to program defensively and do their best to get exceptions captured and propagated (including across async boundaries) using the For these reasons I recommend we adopt phrasing that states that the contract is to not throw exceptions from the I really think the Reactive Streams spec needs to allow libraries to choose how they will deal with "out of contract" error handling as it is the reality of code that exceptions will be thrown, even if we say they preferably shouldn't be. |
On Tue, May 20, 2014 at 6:07 PM, Ben Christensen
onError, to me, is used to signal when the -upstream- completes in an Also, when the Subscriber methods are async, their exceptions won't be
That assumes that the (production?) logging system that you're using isn't
If people want to, they'll shoot their feet and others too, if they decide
I don't understand, can you elaborate?
If I interpret you correctly; this means that out-of-contract throws do not
Cheers, |
It's not in the spec which is why we're discussing this :-) It does say this though:
In an async system, the
Depends on whether it's synchronous or async (as per agreement in #46). In the async case it would typically be whatever the async scheduler is, but it's up to the implementation. If the
Why would our spec get involved in defining a logging system? We should not dictate that errors require being logged, that is most definitely an implementation detail and something that will be configured differently by everyone.
If an implementation wants to do so, go ahead, just don't put it in the spec and require it.
Of course, all I'm saying is to NOT remove the right for an implementation to provide extra error protection. The specification should not state a finite list of exceptions considered fatal, or that exceptions from
Nope, not at all what I said :-) I said that real-world code will break the contract (sometimes intentionally, but generally due to bugs) and throw exceptions from |
Right after posting I remembered a use case we had in production. An If we had just thrown that NPE and done nothing to handle it, we would have killed our entire system eventually as the We do handle the errors though, and when The system continued behaving just fine and it was a trivial error affecting a small amount of traffic. No resources were tied up as a result of it. Libraries need the flexibility in the spec to do this, or they will be forced to break the spec. |
Answers inline! :) On Thu, May 22, 2014 at 7:49 AM, Ben Christensen
Ah, to me it is extremely important to know if we discuss to understand the
Yes, so lets split this thing up into two parts:
Regarding 1: I think we should spec that the only legal exceptions that a Regarding 2: I think that it is important that broken code gets fixed STAT,
A quick glance reveals that most of the involved parties are already using SLF4J: Reactor: https://github.com/reactor/reactor/blob/master/build.gradle#L28
Cheers, |
The way I read the spec is that if onNext throws an exception (i.e. something that can be handled instead of shutting down the JVM) then this should be treated as the Publisher failing, which mandates that onError is called—so your use-case is covered. If onError or onComplete throw, then the Publisher fails as well, but it cannot call anything further on that Subscriber since we have already called a terminal callback. That leaves onSubscribe, which I would treat exactly like onNext (i.e. Publisher fails, which mandates onError). So as far as I can see we have a definitive set of rules already in place, maybe we just need to clarify that these are the consequences that arise from them? |
@rkuhn What would the prescribed way of dealing with contract violations in On Thu, May 22, 2014 at 1:29 PM, Roland Kuhn notifications@github.comwrote:
Cheers, |
We know what it cannot do: call onError again. Failing (i.e. cleaning itself up by canceling possible upstream Publishers etc.) would certainly be an option, but so would treating this situation as if the Subscriber had canceled its Subscription. It should be up to the implementation to choose any non-violating strategy. |
Ah, now I see a case that we might want to clarify: should exceptions from onNext terminate the Publisher (by failing it) or should they only cancel the Subscription? |
Sorry, I didn't see the last comment when I started replying. Edited this comment. What I have to add now is: Publishers can support multicast, and 'hot' publishers especially want to remain alive for future consumers to subscribe to them. So exceptions from a single Consumer should cause only that one consumer's onError to be called, and the consumer to be unsubscribed. Therefore the spec should not be worded as:
|
On Thu, May 22, 2014 at 3:46 PM, Roland Kuhn notifications@github.comwrote:
If the Publisher is unicast, then canceling the Subscription would Alexandru Nedelcu |
On Sat, May 17, 2014 at 1:11 PM, Johannes Rudolph
I want to ask again this question for clarifications. I think that after an
The spec currently says this:
I think that should be changed to this:
I would go even further and require that for I apologize for redundancy in case this was already addressed or in case I misunderstood the spec. |
I don't think onError should have to be requested. If a subscriber doesn't request more items for some time, the publisher should still be able to propagate the error downstream, if only so the whole stream can fail fast. Could you give a concrete example of a subscriber whose onError handling requires back pressure handling? I think it would be the exception, not the norm. |
@alexandru the request for subscription, completion and/or error is implicit in the act of subscribing. if you want to defer subscription (i.e. hold off for all onX calls, then you just wait with calling subscribe on the publisher). Having to actively read/write in order to get a notification that the "connection" is broken or closed has been the source of many a WTFs for the Sockets library. Does that make sense? |
On Thu, May 22, 2014 at 6:04 PM, Viktor Klang (√) notifications@github.com @alexandru https://github.com/alexandru the request for subscription,
Unfortunately I don’t know about the WTFs you’re talking about, if you can For example, here’s my implementation of the Rx concat operator (i.e. This operator is asynchronous in nature. With back-pressure applied, my I don’t really know the implications right now, maybe I’m worrying about Alexandru Nedelcu |
On Thu, May 22, 2014 at 5:23 PM, Alexandru Nedelcu <notifications@github.com
Google for "Detect closed TCP connection" or "Detect broken TCP connection".
The total buffer needed is in the worst case: 1 + onSubscribe + request(
What do you mean by "acknowledgement"?
Thanks for raising it, I don't know if you have anything to worry about,
Cheers, |
On Thu, May 22, 2014 at 6:39 PM, Viktor Klang (√)
I meant that the publisher in my implementation needs to wait for For regular folks, an explicit mention in the spec that onComplete/onError Alexandru Nedelcu |
I agree it needs to be fixed, which is what we're discussing here. But it does use the semantics of SHOULD NOT. They may be wrong, but it's using them, so what do you mean?
The reality is that there is always broken code. The only way to not have any broken code is for literally every method implementation in the entire codebase to have
Yes that works.
Yes. All that can be done at that point is throw an exception.
Unfortunately when it comes to error handling I think it's more nuanced than this. Errors are generally by definition unexpected. They are exceptions, hence the name Thus, a I don't want to encourage throwing exceptions, as when things are async it's always possible to lose one, but it can't be a hard-and-fast MUST NOT rule that results in the system breaking if an exception is thrown.
Definitely not worth shutting down the JVM.
I agree with this and it solves most problems.
I would modify this slightly to allow calling
Agreed.
I think it should just result in
No, I think |
FYI. The WTF that arises from TCP half-open connections for most people results from a basic (mis)understanding of distributed systems and the history of responsibilities with TCP. Historically, TCP does not take application liveness into account from the spec perspective, but BCP is that implementations provide keepalive semantics to apps. This leads to most application protocols having to be specified without TCP keepalive. I mention this for clarification. I think @viktorklang is correct. And if this is assumed, has some implications on a control protocol design. |
On Thu, May 22, 2014 at 6:02 PM, Ben Christensen
As a user, you have a choice: a) You take responsibility for implementing the spec and create your own def safeSubscriber[T](us: Subscriber[T], logger: Logger): Subscriber[T] = override def onNext(element: T) = try us.onNext(element) catch { override def onError(throwable: Throwable) = try us.onError(throwable) … and so on and so forth
In any case, I don't think the exception should ever be put "raw" into The benefits are:
Cheers, |
Great elaboration, Todd! On Thu, May 22, 2014 at 6:55 PM, Todd L. Montgomery <
Cheers, |
Move to close. @reactive-streams/contributors |
👍 for closing. Error handling will be defined in the spec adequately I think with the change in #68 |
Unfortunately the specification is not up-to-date, I haven't been following the whole discussion and the examples given are incomplete, so in trying to implement simple publishers / subscribers, the following questions came to mind:
onComplete
/onError
without being asked ... as in, when the Subscriber doesSubscription.request(n)
, does thisn
count include the ensuingonComplete
/onError
?Example:
onNext(elem)
fails? Should the publisher's logic catch it and trigger anonError(ex)
event? In the Rx design guidelines, this is said to trigger weird effects.The text was updated successfully, but these errors were encountered: