Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add core coordination algorithm for cluster state publishing #32171

Merged
merged 21 commits into from
Jul 20, 2018

Conversation

ywelsch
Copy link
Contributor

@ywelsch ywelsch commented Jul 18, 2018

Adds the core coordination algorithm that guarantees safety for cluster state publishing. This is a straight-forward port of the transition rules from the [formal model](url
https://github.com/elastic/elasticsearch-formal-models/blob/master/ZenWithTerms/tla/ZenWithTerms.tla) and goes hand-in-hand with the PR elastic/elasticsearch-formal-models#37.

Relates to #32006

@ywelsch ywelsch added >enhancement v7.0.0 :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. labels Jul 18, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be useful to have javadoc comments on the classes introduced here to give a bit of intuition for what they mean. Particularly Join which is subtly different from simply attempting to join a cluster, and CoordinationStateException which is a lot less scary (and more common) than it looks.

I did some manual fuzzing of the logic and found a handful of cases that aren't covered by tests - see inline comments.


import java.io.IOException;

public class ApplyCommit extends TermVersionRequest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Classes derived from TransportRequest seem to be named with a ...Request suffix. ApplyCommitRequest here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, fixed in 34567f6


public class CoordinationStateTests extends ESTestCase {

DiscoveryNode node1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These fields can all be private. Additionally, initialStateNode{2,3} and cs3 are only used in setupNodes().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 6611e58 (cs3 is now used)

import static org.hamcrest.Matchers.containsString;
import static org.hamcrest.Matchers.equalTo;

public class CoordinationStateTests extends ESTestCase {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have a test that runs the state machines in parallel, passing messages back and forth, and asserting that the committed states remain consistent. This might want to wait until the PR on which I'm working which introduces a deterministic task queue, so maybe just a //TODO for now.

* @throws CoordinationStateRejectedException if the arguments were incompatible with the current state of this object.
*/
public Join handleStartJoin(StartJoinRequest startJoinRequest) {
if (startJoinRequest.getTerm() <= getCurrentTerm()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests all pass if this is < instead of <=.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you run the tests multiple times? Both testStartJoinBeforeBootstrap and testStartJoinAfterBootstrap fail approx. 30% of the times if this is changed from <= to <.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a couple of times, not enough clearly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed 3bec606 🗡

logger.debug("handleJoin: added join {} from [{}] for election, electionWon={} lastAcceptedTerm={} lastAcceptedVersion={}", join,
join.getSourceNode(), electionWon, lastAcceptedTerm, getLastAcceptedVersion());

if (electionWon && prevElectionWon == false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests all pass with if (electionWon) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 5119256 (which also found another small copy-paste error)

"incoming term " + join.getTerm() + " does not match current term " + getCurrentTerm());
}

if (startedJoinSinceLastReboot == false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests pass with if (false) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 8b7af72

}

if (clusterState.getLastAcceptedConfiguration().equals(getLastAcceptedConfiguration()) == false
&& getLastCommittedConfiguration().equals(getLastAcceptedConfiguration()) == false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests pass with if (getLastCommittedConfiguration().equals(getLastAcceptedConfiguration()) == false) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 708ae8f

throw new CoordinationStateRejectedException("incoming term " + clusterState.term() + " does not match current term " +
getCurrentTerm());
}
if (clusterState.term() == getLastAcceptedTerm() && clusterState.version() <= getLastAcceptedVersion()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests pass with:

  • if (clusterState.version() <= getLastAcceptedVersion()) {
  • if (clusterState.version() < getLastAcceptedVersion()) {
  • if (clusterState.term() == getLastAcceptedTerm()) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 0d84428

}

public boolean isElectionQuorum(VoteCollection votes) {
return votes.isQuorum(getLastCommittedConfiguration()) && votes.isQuorum(getLastAcceptedConfiguration());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests pass with

  • return votes.isQuorum(getLastCommittedConfiguration());
  • return votes.isQuorum(getLastAcceptedConfiguration());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in ab02587

}

public boolean isPublishQuorum(VoteCollection votes) {
return votes.isQuorum(getLastCommittedConfiguration()) && votes.isQuorum(lastPublishedConfiguration);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests pass with

  • return votes.isQuorum(getLastCommittedConfiguration());
  • return votes.isQuorum(lastPublishedConfiguration);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 8ee68b9

@ywelsch
Copy link
Contributor Author

ywelsch commented Jul 19, 2018

@DaveCTurner Thanks for the great review. I've run a code coverage tool on this which found one more missing test case, fixed in 9b2ad94. I have addressed all but one comment (the randomized testing), but I think this is ready for another review. I will check in the meanwhile how tricky it is to add these tests.

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, LGTM. I found a nit in a javadoc comment. Let's not add the extra tests in this PR.

/**
* Triggered by a {@link StartJoinRequest}, instances of this class represent join votes,
* and have a source and target node. The source node is the node that provides the vote,
* and the target node is the node for which this vote is casted. A node will only cast
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: casted should read cast.

@ywelsch ywelsch merged commit 384cc54 into elastic:zen2 Jul 20, 2018
ywelsch added a commit to elastic/elasticsearch-formal-models that referenced this pull request Jul 20, 2018
Moving code around and renaming things. Goes hand-in-hand with a PR in the Elasticsearch repository that introduces the corresponding Java classes: elastic/elasticsearch#32171
ywelsch added a commit that referenced this pull request Aug 7, 2018
Simulates a random run of a cluster with multiple CoordinationState instances (each representing
one node), passing messages back and forth, and asserting that the overall system satisfies a given
set of safety properties.

Follow-up to #32171
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >enhancement v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants