This repository was archived by the owner on Sep 21, 2021. It is now read-only.
500_Cluster_Admin/40_other_stats.asciidoc
+73 -73
@@ -1,18 +1,18 @@
1
1
2
2
=== Cluster Stats
3
3
4
-
The _Cluster Stats_ API provides very similar output to the Node Stats.((("clusters", "administration", "Cluster Stats API"))) There
5
-
is one crucial difference: Node Stats shows you statistics per-node, while
6
-
Cluster Stats will show you the sum total of all nodes in a single metric.
7
-
8
-
This provides some useful stats to glance at. You can see that your entire cluster
9
-
is using 50% available heap, filter cache is not evicting heavily, etc. It's
10
-
main use is to provide a quick summary which is more extensive than
11
-
the Cluster Health, but less detailed than Node Stats. It is also useful for
12
-
clusters which are very large, which makes Node Stats output difficult
4
+
The `cluster-stats` API provides similar output to the `node-stats`.((("clusters", "administration", "Cluster Stats API"))) There
5
+
is one crucial difference: `node-stats` shows you statistics per node, while
6
+
`cluster-stats` shows you the sum total of all nodes in a single metric.
7
+
8
+
This provides some useful stats to glance at. You can see, for example, that your entire cluster
9
+
is using 50% of the available heap or that filter cache is not evicting heavily. Its
10
+
main use is to provide a quick summary that is more extensive than
11
+
the `cluster-health`, but less detailed than `node-stats`. It is also useful for
12
+
clusters that are very large, which makes `node-stats` output difficult
13
13
to read.
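
As a rough illustration of the difference, the sketch below sums per-node heap figures into a single cluster-wide number, which is the kind of roll-up `cluster-stats` performs. The node names, field names, and values are invented for the example and are not real API output.

```python
# Hypothetical per-node heap figures in bytes; not real API output.
node_heap = {
    "node-1": {"heap_used": 2_000_000_000, "heap_max": 4_000_000_000},
    "node-2": {"heap_used": 1_000_000_000, "heap_max": 4_000_000_000},
    "node-3": {"heap_used": 3_000_000_000, "heap_max": 4_000_000_000},
}


def cluster_heap_percent(nodes):
    """Sum heap usage across all nodes and return one cluster-wide percentage."""
    used = sum(n["heap_used"] for n in nodes.values())
    maximum = sum(n["heap_max"] for n in nodes.values())
    return 100.0 * used / maximum


print(f"cluster heap used: {cluster_heap_percent(node_heap):.0f}%")
```

The per-node view would show three separate figures (50%, 25%, and 75% here), while the cluster-wide view collapses them into one.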
14
14
15
-
The API may be invoked with:
15
+
The API may be invoked as follows:
16
16
17
17
[source,js]
18
18
----
@@ -21,16 +21,16 @@ GET _cluster/stats
21
21
22
22
=== Index Stats
23
23
24
-
So far, we have been looking at _node-centric_ statistics.((("indexes", "index statistics")))((("clusters", "administration", "index stats"))) How much memory does
24
+
So far, we have been looking at _node-centric_ statistics:((("indexes", "index statistics")))((("clusters", "administration", "index stats"))) How much memory does
25
25
this node have? How much CPU is being used? How many searches is this node
26
-
servicing? Etc. etc.
26
+
servicing?
27
27
28
-
Sometimes it is useful to look at statistics from an _index-centric_ perspective.
28
+
Sometimes it is useful to look at statistics from an _index-centric_ perspective:
29
29
How many search requests is _this index_ receiving? How much time is spent fetching
30
-
docs in _that index_, etc.
30
+
docs in _that index_?
31
31
32
32
To do this, select the index (or indices) that you are interested in and
33
-
execute an Index Stats API:
33
+
execute an Index `stats` API:
34
34
35
35
[source,js]
36
36
----
@@ -40,21 +40,21 @@ GET my_index,another_index/_stats <2>
40
40
41
41
GET _all/_stats <3>
42
42
----
43
-
<1> Stats for `my_index`
44
-
<2> Stats for multiple indices can be requested by comma separating their names
45
-
<3> Stats indices can be requested using the special `_all` index name
43
+
<1> Stats for `my_index`.
44
+
<2> Stats for multiple indices can be requested by separating their names with a comma.
45
+
<3> Stats for all indices can be requested using the special `_all` index name.
46
46
47
-
The stats returned will be familar to the Node Stats output: search, fetch, get,
48
-
index, bulk, segment counts, etc
47
+
The stats returned will be familiar from the `node-stats` output: `search`, `fetch`, `get`,
48
+
`index`, `bulk`, `segment counts`, and so forth.
49
49
50
-
Index-centric stats can be useful for identifying or verifying "hot" indices
51
-
inside your cluster, or trying to determine while some indices are faster/slower
50
+
Index-centric stats can be useful for identifying or verifying _hot_ indices
51
+
inside your cluster, or trying to determine why some indices are faster/slower
52
52
than others.
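
One way to spot hot indices is to rank them by search count. The sketch below assumes the stats response has already been parsed into a dictionary, and that per-index counters sit under `indices.<name>.total.search.query_total`. That layout is an assumption to check against your own response, and the sample numbers are invented.

```python
def hottest_indices(stats, top=3):
    """Return (index_name, query_total) pairs, busiest first."""
    counts = [
        (name, body["total"]["search"]["query_total"])
        for name, body in stats["indices"].items()
    ]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)[:top]


# Hypothetical, heavily trimmed response; a real one has many more fields.
sample = {
    "indices": {
        "my_index": {"total": {"search": {"query_total": 120}}},
        "another_index": {"total": {"search": {"query_total": 9800}}},
    }
}

for name, queries in hottest_indices(sample):
    print(name, queries)
```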
53
53
54
54
In practice, however, node-centric statistics tend to be more useful. Entire
55
55
nodes tend to bottleneck, not individual indices. And because indices
56
56
are usually spread across multiple nodes, index-centric statistics
57
-
are usually not very helpful because it aggregates different physical machines
57
+
are usually not very helpful because they aggregate data from different physical machines
58
58
operating in different environments.
59
59
60
60
Index-centric stats are a useful tool to keep in your repertoire, but are not usually
@@ -63,16 +63,16 @@ the first tool to reach for.
63
63
=== Pending Tasks
64
64
65
65
There are certain tasks that only the master can perform, such as creating a new ((("clusters", "administration", "Pending Tasks API")))
66
-
index or moving shards around the cluster. Since a cluster can only have one
67
-
master, only one node can ever process cluster-level metadata changes. In
66
+
index or moving shards around the cluster. Since a cluster can have only one
67
+
master, only one node can ever process cluster-level metadata changes. For
68
68
99.9999% of the time, this is never a problem. The queue of metadata changes
69
69
remains essentially zero.
70
70
71
-
In some _very rare_ clusters, the number of metadata changes occurs faster than
72
-
the master can process them. This leads to a build up of pending actions which
71
+
In some _rare_ clusters, metadata changes occur faster than
72
+
the master can process them. This leads to a buildup of pending actions that
73
73
are queued.
74
74
75
-
The _Pending Tasks_ API ((("Pending Tasks API")))will show you what (if any) cluster-level metadata changes
75
+
The `pending-tasks` API ((("Pending Tasks API")))will show you what (if any) cluster-level metadata changes
76
76
are pending in the queue:
77
77
78
78
[source,js]
@@ -89,7 +89,7 @@ Usually, the response will look like this:
89
89
}
90
90
----
91
91
92
-
Meaning there are no pending tasks. If you have one of the rare clusters that
92
+
This means there are no pending tasks. If you have one of the rare clusters that
93
93
bottlenecks on the master node, your pending task list may look like this:
94
94
95
95
[source,js]
@@ -122,50 +122,50 @@ bottlenecks on the master node, your pending task list may look like this:
122
122
----
123
123
124
124
You can see that tasks are assigned a priority (`URGENT` is processed before `HIGH`,
125
-
etc), the order it was inserted, how long the action has been queued and
126
-
what the action is trying to perform. In the above list, there is a Create Index
127
-
action and two Shard Started actions pending.
125
+
for example), the order in which it was inserted, how long the action has been queued, and
126
+
what the action is trying to perform. In the preceding list, there is a `create-index`
127
+
action and two `shard-started` actions pending.
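
To make the ordering concrete, here is a minimal sketch that sorts a task list by priority and then by insertion order. It assumes each task carries `priority` and `insert_order` fields and that ties within a priority are broken by insertion order. The task values are hypothetical, and only the two priorities named above are ranked.

```python
# Only the two priorities named in the text; any other level sorts last here.
PRIORITY_RANK = {"URGENT": 0, "HIGH": 1}


def processing_order(tasks):
    """Order tasks by priority first, then by insertion order."""
    return sorted(
        tasks,
        key=lambda t: (
            PRIORITY_RANK.get(t["priority"], len(PRIORITY_RANK)),
            t["insert_order"],
        ),
    )


# Hypothetical tasks, loosely shaped like the list described above.
pending = [
    {"insert_order": 46, "priority": "HIGH", "source": "shard-started"},
    {"insert_order": 101, "priority": "URGENT", "source": "create-index"},
    {"insert_order": 45, "priority": "HIGH", "source": "shard-started"},
]

for task in processing_order(pending):
    print(task["priority"], task["insert_order"], task["source"])
```

With this sample, the `URGENT` task comes first even though it was inserted last, followed by the two `HIGH` tasks in insertion order.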
128
128
129
-
.When should I worry about Pending Tasks?
129
+
.When Should I Worry About Pending Tasks?
130
130
****
131
131
As mentioned, the master node is rarely the bottleneck for clusters. The only
132
-
time it can potentially bottleneck is if the cluster state is both very large
132
+
time it could bottleneck is if the cluster state is both very large
133
133
_and_ updated frequently.
134
134
135
135
For example, if you allow customers to create as many dynamic fields as they wish,
136
136
and have a unique index for each customer every day, your cluster state will grow
137
137
very large. The cluster state includes (among other things) a list of all indices,
138
138
their types, and the fields for each index.
139
139
140
-
So if you have 100,000 customers, and each customer averages 1000 fields and 90
141
-
days of retention....that's nine billion fields to keep in the cluster state.
140
+
So if you have 100,000 customers, and each customer averages 1,000 fields and 90
141
+
days of retention--that's nine billion fields to keep in the cluster state.
142
142
Whenever this changes, the nodes must be notified.
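
The arithmetic behind that figure, using the numbers above and assuming one index per customer per day:

```python
customers = 100_000
fields_per_customer = 1_000
retention_days = 90  # one index per customer per day, kept for 90 days

total_fields = customers * fields_per_customer * retention_days
print(f"{total_fields:,} fields")
```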
143
143
144
-
The master must process these changes which requires non-trivial CPU overhead,
144
+
The master must process these changes, which requires nontrivial CPU overhead,
145
145
plus the network overhead of pushing the updated cluster state to all nodes.
146
146
147
-
It is these clusters which may begin to see clusterstate actions queuing up.
147
+
It is these clusters that may begin to see cluster-state actions queuing up.
148
148
There is no easy solution to this problem, however. You have three options:
149
149
150
150
- Obtain a beefier master node. Vertical scaling just delays the inevitable,
151
-
unfortunately
151
+
unfortunately.
152
152
- Restrict the dynamic nature of the documents in some way, so as to limit the
153
-
clusterstate size.
154
-
- Spin up another cluster once a certain threshold has been crossed.
153
+
cluster-state size.
154
+
- Spin up another cluster after a certain threshold has been crossed.
155
155
****
156
156
157
-
=== Cat API
157
+
=== cat API
158
158
159
-
If you work from the command line often, the _Cat_ APIs will be very helpful
160
-
to you.((("Cat API")))((("clusters", "administration", "Cat API"))) Named after the linux `cat` command, these APIs are designed to be
161
-
work like *nix commandline tools.
159
+
If you work from the command line often, the `cat` APIs will be helpful
160
+
to you.((("Cat API")))((("clusters", "administration", "Cat API"))) Named after the Linux `cat` command, these APIs are designed to
161
+
work like *nix command-line tools.
162
162
163
163
They provide the same statistics as all the previously discussed APIs
164
-
(Health, Node Stats, etc), but present the output in tabular form instead of
165
-
JSON. This is _very_ convenient as a system administrator and you just want
166
-
to glance over your cluster, or find nodes with high memory usage, etc.
164
+
(Health, `node-stats`, and so forth), but present the output in tabular form instead of
165
+
JSON. This is _very_ convenient if you are a system administrator and just want
166
+
to glance over your cluster or find nodes with high memory usage.
167
167
168
-
Executing a plain GET against the Cat endpoint will show you all available
168
+
Executing a plain `GET` against the `cat` endpoint will show you all available
169
169
APIs:
170
170
171
171
[source,bash]
@@ -207,9 +207,9 @@ GET /_cat/health
207
207
----
208
208
209
209
The first thing you'll notice is that the response is plain text in tabular form,
210
-
not JSON. The second thing you'll notices is that there are no column headers
210
+
not JSON. The second thing you'll notice is that there are no column headers
211
211
enabled by default. This is designed to emulate *nix tools, since it is assumed
212
-
that once you become familiar with the output you no longer want to see
212
+
that once you become familiar with the output, you no longer want to see