Commit 5d80a46

Document pooling

1 parent 2db67b8 commit 5d80a46

File tree: 2 files changed, +151 -1 lines changed


core/src/main/resources/reference.conf (1 addition & 1 deletion)

```diff
@@ -173,7 +173,7 @@ datastax-java-driver {
   #
   # This option can be changed at runtime, the new value will be used for new connections created
   # after the change.
-  max-requests-per-connection = 32768
+  max-requests-per-connection = 1024

   # The maximum number of "orphaned" requests before a connection gets closed automatically.
   #
```

manual/core/pooling/README.md (150 additions & 0 deletions)
## Connection pooling

### Basics

The driver communicates with Cassandra over TCP, using the Cassandra binary protocol. This protocol
is asynchronous, which allows each TCP connection to handle multiple simultaneous requests:

* when a query gets executed, a *stream id* gets assigned to it. It is a unique identifier on the
  current connection;
* the driver writes a request containing the stream id and the query on the connection, and then
  proceeds without waiting for the response (if you're using the asynchronous API, this is when the
  driver will send you back a `java.util.concurrent.CompletionStage`). Once the request has been
  written to the connection, we say that it is *in flight*;
* at some point, Cassandra will send back a response on the connection. This response also contains
  the stream id, which allows the driver to trigger a callback that will complete the corresponding
  query (this is the point where your `CompletionStage` will get completed).
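This lifecycle is what you observe through the asynchronous API. Here's a minimal sketch; the contact point, port, and datacenter name are placeholders for your own environment, not values from this document:

```java
import java.net.InetSocketAddress;
import java.util.concurrent.CompletionStage;

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.AsyncResultSet;

public class AsyncBasics {
  public static void main(String[] args) {
    try (CqlSession session =
        CqlSession.builder()
            .addContactPoint(new InetSocketAddress("127.0.0.1", 9042)) // placeholder
            .withLocalDatacenter("datacenter1") // placeholder datacenter name
            .build()) {

      // Returns as soon as the request is in flight: a stream id has been
      // assigned and the request written on one of the pooled connections.
      CompletionStage<AsyncResultSet> stage =
          session.executeAsync("SELECT release_version FROM system.local");

      // Completes when Cassandra sends back the response frame carrying the
      // same stream id.
      stage
          .thenAccept(rs -> System.out.println(rs.one().getString("release_version")))
          .toCompletableFuture()
          .join();
    }
  }
}
```

Note that the calling thread never blocks on the network: between `executeAsync` and the completion of the stage, the connection is free to carry other in-flight requests.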
You don't need to manage connections yourself. You simply interact with a [CqlSession] object, which
takes care of it.

**For a given session, there is one connection pool per connected node** (a node is connected when
it is up and not ignored by the [load balancing policy](../load_balancing/)).

The number of connections per pool is configurable (this will be described in the next section).
There are up to 32768 stream ids per connection.

```ditaa
+-------+1   n+----+1   n+----------+1  32K+-------+
|Session+-----+Pool+-----+Connection+------+Request|
+-------+     +----+     +----------+      +-------+
```
### Configuration

Pool sizes are defined in the `connection` section of the [configuration](../configuration/). Here
are the relevant options with their default values:

```
datastax-java-driver.connection {
  max-requests-per-connection = 1024
  pool {
    local.size = 1
    remote.size = 1
  }
}
```

Unlike previous versions of the driver, pools do not resize dynamically. However, you can adjust the
options at runtime; the driver will detect and apply the changes.
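The same options can also be set programmatically when building the session. A hedged sketch, assuming a driver version recent enough to provide `DriverConfigLoader.programmaticBuilder()`; the values shown are illustrative, not recommendations:

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.config.DefaultDriverOption;
import com.datastax.oss.driver.api.core.config.DriverConfigLoader;

public class PoolConfigExample {
  public static void main(String[] args) {
    // Override the pool options from code instead of application.conf.
    DriverConfigLoader loader =
        DriverConfigLoader.programmaticBuilder()
            .withInt(DefaultDriverOption.CONNECTION_MAX_REQUESTS, 1024)
            .withInt(DefaultDriverOption.CONNECTION_POOL_LOCAL_SIZE, 2)
            .withInt(DefaultDriverOption.CONNECTION_POOL_REMOTE_SIZE, 1)
            .build();

    try (CqlSession session =
        CqlSession.builder().withConfigLoader(loader).build()) {
      // use the session...
    }
  }
}
```

Keep in mind that values set this way are fixed at build time; the file-based configuration is what gives you the runtime hot-reload behavior described above.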
#### Heartbeat

If connections stay idle for too long, they might be dropped by intermediate network devices
(routers, firewalls...). Normally, TCP keepalive should take care of this; but tweaking low-level
keepalive settings might be impractical in some environments.

The driver provides application-side keepalive in the form of a connection heartbeat: when a
connection has not received any incoming reads for a given amount of time, the driver will simulate
activity by writing a dummy request to it. If that request fails, the connection is trashed and
replaced.

This feature is enabled by default. Here are the default values in the configuration:
```
datastax-java-driver.connection {
  heartbeat {
    interval = 30 seconds

    # How long the driver waits for the response to a heartbeat. If this timeout fires, the
    # heartbeat is considered failed.
    timeout = 500 milliseconds
  }
}
```

Both options can be changed at runtime; the new value will be used for new connections created after
the change.
### Monitoring

The driver exposes node-level [metrics](../metrics/) to monitor your pools (note that all metrics
are disabled by default; you'll need to change your configuration to enable them):

```
datastax-java-driver {
  metrics.node.enabled = [
    # The number of connections open to this node for regular requests (exposed as a
    # Gauge<Integer>).
    #
    # This includes the control connection (which uses at most one extra connection to a random
    # node in the cluster).
    pool.open-connections,

    # The number of stream ids available on the connections to this node (exposed as a
    # Gauge<Integer>).
    #
    # Stream ids are used to multiplex requests on each connection, so this is an indication of
    # how many more requests the node could handle concurrently before becoming saturated (note
    # that this is a driver-side only consideration, there might be other limitations on the
    # server that prevent reaching that theoretical limit).
    pool.available-streams,

    # The number of requests currently executing on the connections to this node (exposed as a
    # Gauge<Integer>). This includes orphaned streams.
    pool.in-flight,

    # The number of "orphaned" stream ids on the connections to this node (exposed as a
    # Gauge<Integer>).
    #
    # See the description of the connection.max-orphan-requests option for more details.
    pool.orphaned-streams,
  ]
}
```
In particular, it's a good idea to keep an eye on those two metrics:

* `pool.open-connections`: if this doesn't match your configured pool size, something is preventing
  connections from opening (either configuration or network issues, or a server-side limitation --
  see [CASSANDRA-8086]);
* `pool.available-streams`: if this is often close to 0, it's a sign that the pool is getting
  saturated. Maybe `max-requests-per-connection` is too low, or more connections should be added.
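If you'd rather poll these values from code than through a metrics reporter, the session exposes them via its metrics registry. A sketch, assuming the two pool metrics above are enabled and the driver's default Dropwizard metrics backend is in use:

```java
import com.codahale.metrics.Gauge;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.metadata.Node;
import com.datastax.oss.driver.api.core.metrics.DefaultNodeMetric;

public class PoolHealthLogger {
  // Prints the two pool health indicators for every known node. Metrics that
  // are disabled in the configuration simply won't be present.
  static void logPoolHealth(CqlSession session) {
    session.getMetrics().ifPresent(metrics -> {
      for (Node node : session.getMetadata().getNodes().values()) {
        metrics.getNodeMetric(node, DefaultNodeMetric.OPEN_CONNECTIONS)
            .ifPresent(m -> System.out.printf(
                "%s open-connections=%s%n", node, ((Gauge<?>) m).getValue()));
        metrics.getNodeMetric(node, DefaultNodeMetric.AVAILABLE_STREAMS)
            .ifPresent(m -> System.out.printf(
                "%s available-streams=%s%n", node, ((Gauge<?>) m).getValue()));
      }
    });
  }
}
```

Calling this periodically (or wiring the gauges into your own monitoring system) is enough to spot both of the warning signs above.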
### Tuning

The driver defaults should be good for most scenarios.

In our experience, raising `max-requests-per-connection` above 1024 does not bring any significant
improvement: the server is only going to service so many requests at a time anyway, so additional
requests are just going to pile up.

Similarly, 1 connection per node is generally sufficient. However, it might become a bottleneck in
very high performance scenarios: all I/O for a connection happens on the same thread, so it's
possible for that thread to max out its CPU core. In our benchmarks, this happened with a
single-node cluster and a high throughput (approximately 80K requests / second / connection).

It's unlikely that you'll run into this issue: in most real-world deployments, the driver connects
to more than one node, so the load will spread across more I/O threads. However, if you suspect that
you are experiencing the issue, here's what to look out for:

* the driver throughput plateaus but the process does not appear to max out any system resource (in
  particular, overall CPU usage is well below 100%);
* one of the driver's I/O threads maxes out its CPU core. You can see that with a profiler, or
  OS-level tools like `pidstat -tu` on Linux. With the default configuration, I/O threads are called
  `<session_name>-io-<n>`.
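The second symptom can also be checked from inside the JVM, without an external profiler, by sampling per-thread CPU time through the standard `ThreadMXBean`. A sketch; the session name `s0` in the pattern is an assumption, so substitute your own:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class IoThreadCpu {
  // Prints the CPU time (in nanoseconds) consumed so far by every thread whose
  // name matches the given regex. Sample twice over a known interval to turn
  // this into a utilization rate per thread.
  public static void printCpuTimes(String nameRegex) {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    for (long id : bean.getAllThreadIds()) {
      ThreadInfo info = bean.getThreadInfo(id);
      if (info != null && info.getThreadName().matches(nameRegex)) {
        // getThreadCpuTime returns -1 if CPU measurement is unsupported.
        System.out.printf("%s cpu=%d ns%n",
            info.getThreadName(), bean.getThreadCpuTime(id));
      }
    }
  }

  public static void main(String[] args) {
    // "s0" is an assumed session name; match the driver's default
    // "<session_name>-io-<n>" naming for your own session.
    printCpuTimes("s0-io-\\d+");
  }
}
```

If one I/O thread's CPU time grows at roughly one core's worth per wall-clock second while the others stay idle, that thread is the bottleneck.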
Try adding more connections per node. Thanks to the driver's hot-reload mechanism, you can do that
at runtime and see the effects immediately.
[CqlSession]: http://docs.datastax.com/en/drivers/java/4.0/com/datastax/oss/driver/api/core/CqlSession.html
[CASSANDRA-8086]: https://issues.apache.org/jira/browse/CASSANDRA-8086
