Skip to content

Conversation

@tolbertam
Copy link
Contributor

For JAVA-1388.

Based on #1064 for now until that is merged.

I followed the approaches in #1000 where we query the peers_v2 table first, and if that fails, downgrade to the peers table and never query v2 again. I just realized we should probably be more selective about downgrading based on the error returned by that (both here and on 4.x), as its reasonable to expect we can get other errors (like client timeouts) that don't necessitate downgrading to the peers table.

I updated ScassandraCluster to support multiple nodes per IP and added a test for this, and it seems to work well. It would be nice to have tests against real cassandra nodes, which will require CCM changes, which may take me a little bit. Also needs tests for the broadcast address changing / query of peers table for a single host that we currently have in ControlConnectionTest for the existing peers table.

@tolbertam tolbertam added this to the 3.6.0 milestone Jul 29, 2018
@tolbertam tolbertam requested review from GregBestland and olim7t July 29, 2018 02:16
* @return result of peers query.
*/
private ListenableFuture<ResultSet> selectPeersFuture(final Connection connection) {
if (isPeersV2) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For < C* 4.x, maybe we should just not even try the peers_v2 table. We could only allow this for protocol V5+. I think that's an ok limitation, and we can add some kind of way to bypass the V5 requirement for our Scassandra tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this more. I think for 3.x, this should be more of an opt-in thing via a builder option. For a minor version making an extra query on initialization seems kinda iffy, and I think it's ok to require use of this feature to be explicit, since the 3.x driver is more driven around ip addresses and global setting of port (via withPort), so we should probably make a separate option so users can opt-in to this capability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this more and decided that it's ok to try peers_v2 table first.

* @return result of peers query.
*/
private ListenableFuture<ResultSet> selectPeersFuture(final Connection connection) {
String query =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very explicit now, the peers_v2 is only ever queried if allowHostPortDiscovery is used, and if the query fails, cluster initialization fails.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to think about here is whether or not we should fall back to the old system peers query if we get an invalid query request back. I think in that specific case it might be sane to fall back on old peers query.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've wavered back and forth on this, my current line of thinking is if the user opts in using allowHostPortDiscovery, they are intending to depend on the presence of the peers_v2 table to tell them what ports to use. In absence of that table, I think failing is the right thing to do. Whereas with java driver 4.x, we always check this table first, so in that case it makes sense to downgrade to the peers table.


/**
* Enables host port discovery using the system.peers_v2 table added in Cassandra 4.0 (via
* CASSANDRA-7544). This enables running multiple Cassandra
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the last sentence is truncated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

woops, might have accidentally triggered a vim hotkey 😆 will fix.

*
* @return this builder.
*/
public Builder allowHostPortDiscovery() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why expose this? Can't we simply detect the presence of the new peers_v2 table? Or is it because if the cluster has some nodes with peers_v2 and others without, that might be a too complex situation to handle?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explain my rationale for this here: #1065 (comment). I originally tried peers_v2 first, then peers if the table doesn't exist (like 4.x does). But thinking about it more I wasn't sure if adding another query in a minor version was the right thing to do, so I thought maybe making this an Opt-in feature on 3.x would be the right way to go.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't sound that bad to add that extra query, since it's done only once at the beginning, and if that fails then we always go directly to the legacy table.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are we doing the fallback to system.peers at all anymore? The PR description says so...

Copy link
Contributor Author

@tolbertam tolbertam Aug 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was erring on the side of caution by making this an 'opt-in' feature. If you don't see that as too risky, I will revert it back to my initial implementation which mirrors the approach we took with 4.x. With regards to not falling back to system.peers, my thought process was that if you explicitly opted in using allowHostPortDiscovery(), it didn't seem the backing off to peers was the best idea, since you are expressing that you want to utilize the capability which is not being provided by the server. If i revert to not making this opt-in that point is moot though.

InetSocketAddress broadcastAddress = null;
if (row.getColumnDefinitions().contains("peer")) { // system.peers
broadcastAddress = row.getInet("peer");
int broadcastPort =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of zero, shouldn't we use the default port number (9042)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the broadcast/listen address which is typically 7000. I felt that since we can't know for sure what it's configured to, using 0 would be better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, of course. 👍

* href="https://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configCassandra_yaml_r.html">The
* cassandra.yaml configuration file</a>
*/
public InetAddress getBroadcastAddress() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is now used only in tests. Are you keeping it for binary compatibility? In this case we might want to mark it @Deprecated just as you did for getListenAddress().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally marked these methods deprecated, but then I backed off when I realized for most users the InetSocketAddress version will return port 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense but then for consistency maybe we should not mark getListenAddress() as deprecated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh! Didn't realize that, I'll make them consistent, thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should at least explain that it duplicates information already present in the other method.

* Returns the node listen address (that is, the IP the node uses to contact other peers in the
* cluster), if known.
* @return the node listen address, if known. Otherwise {@code null}.
* @deprecated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to explain why it is deprecated ("Use #getListenSocketAddress() instead").

Copy link
Contributor

@GregBestland GregBestland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 looks good one Just one small thing to consider. Merge at your discretion

* @return result of peers query.
*/
private ListenableFuture<ResultSet> selectPeersFuture(final Connection connection) {
String query =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to think about here is whether or not we should fall back to the old system peers query if we get an invalid query request back. I think in that specific case it might be sane to fall back on old peers query.

@tolbertam
Copy link
Contributor Author

Tests surfaced some very minor issues (log message contents, etc.) that i need to work out. will do.

@tolbertam
Copy link
Contributor Author

rebased and fixed/enhanced tests.

@tolbertam tolbertam changed the base branch from java1786 to 3.x August 6, 2018 18:19
@tolbertam
Copy link
Contributor Author

Rebased on 3.x

Copy link
Contributor

@olim7t olim7t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really convinced we should make this opt-in. If you connect to a Cassandra 4 cluster with the driver, things should just work.

*
* @return this builder.
*/
public Builder allowHostPortDiscovery() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't sound that bad to add that extra query, since it's done only once at the beginning, and if that fails then we always go directly to the legacy table.

+ "(this can happen if the broadcast address changed)",
host.getBroadcastAddress(),
address,
host);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit lost here: why do we read broadcastSocketAddress again?
addressToUse is never used.
address is only used in the log message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is an artifact of some refactoring, i'll fix this.

@@ -594,20 +614,26 @@ private static void updateInfo(
// After CASSANDRA-9603 (2.0.17, 2.1.8, 2.2.0 rc2) local row contains one more column:
// - listen_address
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe update this kind of comments to explain the new rows in peers_v2 (or remove them / have a single comment detailing the table formats in different versions?).

*
* @return this builder.
*/
public Builder allowHostPortDiscovery() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are we doing the fallback to system.peers at all anymore? The PR description says so...

// that's the 'peer' in the 'System.peers' table and avoids querying the full peers table in
// ControlConnection.refreshNodeInfo.
private volatile InetAddress broadcastAddress;
private volatile InetSocketAddress broadcastAddress;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we rename this and the corresponding setter with Socket?
Right now we have:

  • getBroadcastAddress returning an InetAddress
  • setBroadcastAddress taking an InetSocketAddress

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll rename the setters to *SocketAddress for consistency.

* href="https://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configCassandra_yaml_r.html">The
* cassandra.yaml configuration file</a>
*/
public InetAddress getBroadcastAddress() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should at least explain that it duplicates information already present in the other method.

…ry fails.

* Remove deprecated annotation from Host.getListenAddress
* Fix javadoc for allowHostPortDiscovery
* Throw a specialized error indicating that Cluster.Builder.allowHostPortDiscovery
  requires peers_v2.
* Switch back to the approach where peers_v2 is tried and fall back
  to system.peers if it doesn't exist instead of making opt-in
  via allowHostPortDiscovery.
* rename set*Address methods to set*SocketAddress.
* update comments as suggested.
@tolbertam
Copy link
Contributor Author

Rebased on 3.x and processed feedback, please let me know if I missed anything. The behavior is now 1:1 with 4.x. The peers_v2 table will be queried initially, and if we get an InvalidQueryException, we fall back to the peers table and logic.

: SELECT_PEERS
+ " WHERE peer='"
+ broadcastAddress.getAddress().getHostAddress()
+ "'";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't handle a missing peers_v2 here, but that's fine: we do a refreshNodeListAndTokenMap every time the control connection (re)connects, and it happens before the new connection is published in connectionRef, so we know it will never race with this method.
So isPeersV2 will always be set correctly at this point. Even in a mixed 3.x/4.0 cluster, if we initially connected to a 4.0 node, isPeersV2 will downgrade correctly if we reconnect to a legacy node.

@tolbertam tolbertam merged commit 598c5cb into 3.x Aug 15, 2018
@tolbertam tolbertam deleted the java1388 branch August 16, 2018 00:24
michaelsembwever added a commit to thelastpickle/cassandra that referenced this pull request Sep 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants