Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cover the allowed call sequences on Subscriber in a spec rule #202

Closed
rkuhn opened this issue Jan 16, 2015 · 71 comments · Fixed by #212
Closed

cover the allowed call sequences on Subscriber in a spec rule #202

rkuhn opened this issue Jan 16, 2015 · 71 comments · Fixed by #212
Assignees
Milestone

Comments

@rkuhn
Copy link
Member

rkuhn commented Jan 16, 2015

Currently it is only implicit that legal sequences start with onError or onSubscribe and end with onComplete or onError (or not at all), and it is not regulated within the spec rules that onComplete is illegal before onSubscribe.

I see two ways to approach this:

  • we leave the protocol sequence definition outside of the spec and add the missing rule about onComplete
  • we make the sequence definition explicit within the rules

The second approach would lead to duplication, since legal sequence grammars are almost perfectly constrained already, so I am leaning towards adding a new rule that orders onComplete and onSubscribe.

@drewhk
Copy link
Contributor

drewhk commented Jan 16, 2015

onComplete is currently legal before onSubscribe. I don't know how much you would gain by ordering them, given that onError can come before onSubscribe anyway. In my experience the only real help to implementors would be if onSubscribe is always the first message, even if it is immediately followed by onComplete or onError. Less state variations to maintain. I am not sure we want that though.

@rkuhn
Copy link
Member Author

rkuhn commented Jan 16, 2015

According to the docs the legal status of “onComplete first” is at least dubious, and #199 introduces a test that fails in this case, so we should definitely fix this either way.

@ktoso
Copy link
Contributor

ktoso commented Jan 16, 2015

The spec (the rules) does not mandate if onComplete can come first or not currently, as mentioned in #199 and for more details see this comment

It is a bit accidental that current behaviour is "onComplete can not come first", as such is the diagram in section https://github.com/reactive-streams/reactive-streams#api-components which isn't really a spec rule.

@drewhk
Copy link
Contributor

drewhk commented Jan 16, 2015

In the Akka impl we definitely treated onComplete as valid response instead of onSubscribe, in fact we have testkit methods like: expectCompletedOrSubscriptionFollowedByComplete()

@drewhk
Copy link
Contributor

drewhk commented Jan 16, 2015

So the current language accepted by Akka Subscribers is:

(onSubscribe ~ (onNext)*)? ~ (onError | onComplete)

(Kleene star here allows infinite words)

@viktorklang
Copy link
Contributor

@drewhk onError definitely needs to be able to be sent before onSubscribe if the Publisher is unable to, for some reason, create a Subscription.

@drewhk
Copy link
Contributor

drewhk commented Jan 16, 2015

Well, you can always send a dummy subscription, but in general I agree. I don't see though then why cannot we send an onComplete instead of an onSubscribe. Since we need to handle the onError case the onComplete case is not that much to add (from an implementors perspective).

@viktorklang
Copy link
Contributor

@rkuhn I agree with "so I am leaning towards adding a new rule that orders onComplete and onSubscribe.".

The question is if the symmetry of being able to send either onError or onComplete before onSubscribe has merit and should be instated (I'd think not, since it is ultimately racy for hot publishers)

@drewhk
Copy link
Contributor

drewhk commented Jan 16, 2015

@viktorklang so you propose the following language?

(onSubscribe ~ (onNext)* ~ (onError | onComplete)) | onError

@viktorklang
Copy link
Contributor

@drewhk Since it is verboten by the spec to call any methods on the Subscription from within onError or onComplete there should not be any technical issues with allowing both onError and onComplete without a preceeding onSubscribe.

Current semantics as present in the README.md: onError | (onSubscribe onNext* (onError | onComplete)?)

@viktorklang
Copy link
Contributor

@reactive-streams/contributors Thoughts?

@drewhk
Copy link
Contributor

drewhk commented Jan 16, 2015

My point (experienced in Akka Streams) is that if you have to handle onError without onSubscribe then it is not that hard to add onComplete handling there, too. I personally prefer what we use now:

(onSubscribe ~ (onNext)*)? ~ (onError | onComplete)

Not because I don't want to rewrite it, but because it makes sense. The simplest implementation wise of course would be:

onSubscribe ~ (onNext)* ~ (onError | onComplete)

but that would mean that the Publisher must send dummy Subscriptions even when it knows that it is already empty or in error state.

Edited

@viktorklang
Copy link
Contributor

@drewhk Given 2.3 (Subscriber.onComplete() and Subscriber.onError(Throwable t) MUST NOT call any methods on the Subscription or the Publisher) I think changing to (onSubscribe ~ (onNext)*)? ~ (onError | onComplete) would be fine, but it IS a last-minute, non-trivial spec change for what we know right now so I definitely think we need to make sure we get a majority vote if we want/need to change it.

@drewhk
Copy link
Contributor

drewhk commented Jan 16, 2015

What rule does exclude now sending an onComplete instead of a Subscription? Because if there is such rule I am pretty sure we have some impls that violate it. i.e. I am not sure if that rule is tested then by the TCK, or we had classes that has been not verified.

@viktorklang
Copy link
Contributor

@drewhk This Issue is about making that clearer in the spec, and then of course make sure that the TCK properly verifies it, then we need to make sure that all Akka impls are properly TCKd :)

@drewhk
Copy link
Contributor

drewhk commented Jan 16, 2015

Maybe it helps, I created a regex like (non-complete) spec of the language that can be seen by a man-in-the-middle observer that orders concurrent events arbitrarily:

// ** means infinte long string allowed
// * means normal Kleene star

subscribe ~ earlyTermination | activeSubscription
earlyTermination := (onComplete | onError) // or onError only?
activeSubscription := (onSubscribe ~ conversation ~ (cancellation | termination))

conversation := (request* ~ onNext*)** 
cancellation := cancel ~ producerRunOver
termination := (onComplete | onError) ~ consumerRunOver

// due to concurrency some stray messages are allowed
producerRunOver := onNext* ~ (onComplete | onError)?
consumerRunOver := request*

// unfolded:
subscribe ~ (onComplete | onError) | (onSubscribe ~ (request* ~ onNext*)** ~ ((cancel ~ onNext* ~ (onComplete | onError)?) | ((onComplete | onError) ~ request*)))

(Maybe it would be more digestable using a drawing. Beware of bugs.)

@viktorklang
Copy link
Contributor

Status here?

@viktorklang
Copy link
Contributor

Ping @rkuhn,
if we want to address this before 1.0.0.final we need to get it into the next RC

@benjchristensen
Copy link
Contributor

I think I'm okay with onComplete being called immediately. If the data source is empty is there a reason to require this path -> onSubscribe/request/onComplete as opposed to just called onComplete directly the same way onError can be?

In other words, a Publisher should be able to call onSubscribe, onError or onComplete, but can only call onNext after onSubscribe/request.

@viktorklang
Copy link
Contributor

@benjchristensen I think that is right. Given 2.3:

Subscriber.onComplete() and Subscriber.onError(Throwable t) MUST NOT call any methods on the Subscription or the Publisher.

So it doesn't matter if onSubscribe has been called before onError or onComplete since they shouldn't mess around with the Subscription anyway.

The question though is how we make it clear in the spec that it is:

(onSubscribe ~ onNext*)? ~ (onError | onComplete)

@rkuhn & @drewhk @DougLea Wdyt?

@DougLea
Copy link
Contributor

DougLea commented Feb 5, 2015

My initial reading of this (and how I implemented) is that a subscription cannot be "complete" if it never began, so I force onSubscribe before onComplete when publisher is already closed. I still think this is a good policy, but I now don't see any wording forcing this.

@viktorklang
Copy link
Contributor

@DougLea Your intuition is right. The existing intent is to only allow onComplete after an onSubscribe.

So there are 2 questions here:

  1. If we keep the existing intent, how do we change the spec to make that clear
  2. If we should change so that onComplete is symmetric to onError in that it can be sent without an onSubscribe, how do we change the spec to make this clear

I'm all ears on solutions to this :)

@drewhk
Copy link
Contributor

drewhk commented Feb 5, 2015

I prefer onComplete and onError to be symmetric, they can either come without onSubscribe, or they must come after onSubscribe. (onError can be made to always come after onSubscribe, since it is always possible to send a dummy subscription)

@benjchristensen
Copy link
Contributor

I prefer onComplete and onError to be symmetric,

I like this, and prefer not having to send dummy subscriptions.

@viktorklang
Copy link
Contributor

@benjchristensen Great, so you're in the (onSubscribe ~ onNext*)? ~ (onError | onComplete)-camp with me then :)

@rkuhn
Copy link
Member Author

rkuhn commented Feb 6, 2015

Count me in that camp as well—with the small pedantic fix of adding a final ? because otherwise never-ending Publishers would strictly speaking not match the grammar that is expressed (not that it makes a huge difference).

@viktorklang
Copy link
Contributor

@rkuhn Never-ending Publishers would still send an onSubscribe, though, because otherwise there is no association happening?

@viktorklang
Copy link
Contributor

@DougLea I can understand this point (always requiring onSubscribe) and the cost of propagating a "dummy" subscription is small indeed. Perhaps this is a case where simplicity should win over performance.

@rkuhn
Copy link
Member Author

rkuhn commented Feb 6, 2015

If we go down the route of the simplest possible grammar (as Doug proposes) then we should also include a DummySubscription in the reactive streams artifact because that will then be needed in many cases:

final public class DummySubscription implements Subscription {
  public static final Subscription instance = new DummySubscription;

  @Override public void request(long n) {
    if (n <= 0) throw new IllegalArgumentException("...");
  }
  @Override public void cancel() {}
}

@viktorklang
Copy link
Contributor

if (n <= 0) throw new IllegalArgumentException("...");

is not legal though.

@DougLea
Copy link
Contributor

DougLea commented Feb 6, 2015

@viktorklang I'd be surprised if there is even a performance advantage -- in the simpler version, most clients need fewer special-case checks that would only rarely trigger.

@drewhk
Copy link
Contributor

drewhk commented Feb 6, 2015

I don't think I can add anything more to the discussion, so I summarize my opinion (a.k.a vote) and leave it to the others to decide:

  • I strongly prefer onError and onComplete to be symmetric
  • I slightly prefer onSubscribe required to be the first, but no strong preference here

@rkuhn
Copy link
Member Author

rkuhn commented Feb 6, 2015

d’oh, of course; signaling that exception correctly would mean allocating a DummySubscription for each such subscription :-(

@drewhk
Copy link
Contributor

drewhk commented Feb 6, 2015

You don't need to signal anything, the relation is already "terminated" it is just being in a race with the faulty request, but you can always arbitrarily define the order and say that the onComplete or onError that was already scheduled "won" (by definition, not by reality).

@rkuhn
Copy link
Member Author

rkuhn commented Feb 6, 2015

good point, thanks; so the corrected code is

final public class AlreadyCompletedSubscription implements Subscription {
  public static final Subscription instance = new AlreadyCompletedSubscription;

  @Override public void request(long n) {}
  @Override public void cancel() {}
}

@benjchristensen
Copy link
Contributor

we should also include a DummySubscription in the reactive streams artifact

I don't like adding things like this. I strongly prefer keeping it as just interface definitions.

prefer onError and onComplete to be symmetric

I also prefer this.

(onSubscribe ~ onNext*)? ~ (onError | onComplete)?

The simplicity of this for me is that onError/onComplete terminal events can be sent whenever, but onNext must always be preceded by onSubscribe since onNext must obey the request behavior of the Subscription.

@benjchristensen
Copy link
Contributor

Is there a link that describes the protocol syntax/grammar so we can link to it like we do to https://www.ietf.org/rfc/rfc2119.txt in the README?

@viktorklang
Copy link
Contributor

@benjchristensen

I, too, would like the simplicity along the lines of: (onSubscribe ~ onNext*)? ~ (onError | onComplete)?
To me the open question RE that definition is valid:

class P[T] extends Publisher[T] { override def subscribe(s: Subscriber[_ >: T]) = () }

(I'd guess we'd have to have a clause in the rules that'd prevent it from being legal)
And an open question is if it matters?

So, I guess my stance right now is:

I don't like the status quo, I think it should be symmetric w.r.t onError and onComplete. I'm OK with requiring to pass in a Subscription that is already cancelled into onSubscribe but it seems like a code smell to me so I tend to lean a bit towards making onComplete be signallable before onSubscribe.

I think I need to experiment with the example implementations to see what makes implementations more or less ugly.

@benjchristensen
Copy link
Contributor

I don't think that should be prevented. In fact, RxJava legitimately has a never() factory method that creates an Observable that never does anything. It actually does have some usecases, such as doing nothing between user event sequences in a switchLatest that switches between streams.

Most of the time a stream that never does anything is undesirable, but it's not illegal. And practically what's the difference between a stream that never emits and one that will emit in 4 hours if the consumer wanted it in 100ms? Async consumers needs to choose to protect themselves by stating their assumptions with timeouts and/or consumption limits (like take(n)) if they do not know what the source stream can provide to them.

@viktorklang
Copy link
Contributor

@benjchristensen Now that is a compelling argument!

@drewhk
Copy link
Contributor

drewhk commented Feb 6, 2015

Hm, I have to disagree here. I agree that a stream that never does anything is sometimes useful, but that has nothing to do with onSubscribe being required or not per se. As an analogy, a TCP connection is also completely fine doing nothing, but the three-way handshake is still required at the beginning. I don't say that sending onError/onComplete any time has no merit (this is what we implemented anyway, so it is even less work), all I want to say that you can have streams that do nothing while still establishing a proper subscribe-onSubscribe handshake. There is something satisfying about that a "never" like operation is not implicit but explicit by having a clear handshake that proves the linkage between the "never" element and its downstream. I don't have a stong opinion though.

@viktorklang
Copy link
Contributor

Alright, so I think we all agree that the current asymmetric definition should be changed.

I'll try to find time to experiment with the impact of either of the suggestions (onSubscribe always && onComplete without preceding onSubscribe) on the example Publisher, I think that would probably convince myself what direction I'll vote.

@viktorklang
Copy link
Contributor

So the choice is between:

(onSubscribe ~ onNext*)? ~ (onError | onComplete)?

and

onSubscribe ~ onNext* ~ (onError | onComplete)?

@benjchristensen
Copy link
Contributor

all I want to say that you can have streams that do nothing while still establishing a proper subscribe-onSubscribe handshake

I'm okay with that. We just shouldn't declare it illegal to never emit onNext/onError/onComplete. So we would say that one MUST either emit onSubscribe or onError/onComplete.

@rkuhn
Copy link
Member Author

rkuhn commented Feb 6, 2015

Agreed on allowing “silent” Publishers; the remaining question then is whether onSubscribe should be mandatory or whether “one of onSubscribe/onError/onComplete” should be mandatory. While implementing the spec as well as while writing our own tests we encountered extra effort due to the uncertainty of what the first invocation will be, requiring an initial onSubscribe would make the logic more regular. Just as Endre I have a preference for the second choice presented by Viktor an hour ago (read: Endre’s argument convinced me).

@benjchristensen
Copy link
Contributor

I like “one of onSubscribe/onError/onComplete should be mandatory” but both can work.

@rkuhn
Copy link
Member Author

rkuhn commented Feb 6, 2015

@tmontgomery are there other considerations that we have not yet included?

@smaldini
Copy link
Contributor

smaldini commented Feb 8, 2015

Agree with Roland, implementations are slightly confused by this. Mandatory OnSubscribe is slightly more verbose but we can deal more efficiently with this by providing wrappers in Publisher factories.
+1 on mandatory onSubscribe, but I'm biased having experimenting a few issues with this.

Sent from my iPhone

On 6 Feb 2015, at 9:05 pm, Roland Kuhn notifications@github.com wrote:

@tmontgomery are there other considerations that we have not yet included?


Reply to this email directly or view it on GitHub.

@benjchristensen
Copy link
Contributor

extra effort due to the uncertainty of what the first invocation will be, requiring an initial onSubscribe would make the logic more regular

I don't understand the extra effort for handling an onComplete terminal event as onError must already be handled.

@viktorklang
Copy link
Contributor

My take on it:

Having onSusbcribe always come first is an invariant which is simpler to
encode as onComplete and onError will then only ever be emitted after it
(already having extra code to deal with 'early errors' is extra code). Any
publisher that isn't permanently failed or complete will have to pass down
Subscriptions of its own, so it is only an optimization to pass a
nop-subscription if already known to be failed/complete.

For the permanently failed and completed publishers the cost of the
overhead is more significant (at most 50%) but given up to hundreds of
millions of signals per second the impact will be hard to notice.

Spec-wise I found it simpler and more straightforward to amend in the
proposed direction, and (I suspect) it will be easier for implementers to
follow.

I think it will be an improvement over what's currently in master, and any
implementation bridge doing early onError or onComplete can always make
sure that a nop-subscription is passed to onSubscribe before issuing the
early onError/onComplete.
On 10 Feb 2015 23:58, "Ben Christensen" notifications@github.com wrote:

extra effort due to the uncertainty of what the first invocation will be,
requiring an initial onSubscribe would make the logic more regular

I don't understand the extra effort for handling an onComplete terminal
event as onError must already be handled.


Reply to this email directly or view it on GitHub
#202 (comment)
.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants