[[heap-sizing]]
=== Heap: Sizing and Swapping

The default installation of Elasticsearch is configured with a 1GB heap. ((("deployment", "heap, sizing and swapping")))((("heap", "sizing and setting"))) For
just about every deployment, this number is far too small. If you are using the
default heap values, your cluster is probably configured incorrectly.

There are two ways to change the heap size in Elasticsearch. The easiest is to
set an environment variable called `ES_HEAP_SIZE`.((("ES_HEAP_SIZE environment variable"))) When the server process
starts, it will read this environment variable and set the heap accordingly.
As an example, you can set it via the command line as follows:

[source,bash]
----
export ES_HEAP_SIZE=10g
----

Alternatively, you can pass in the heap size via a command-line argument when starting
the process, if that is easier for your setup:

[source,bash]
----
./bin/elasticsearch -Xmx=10g -Xms=10g <1>
----
<1> Ensure that the min (`Xms`) and max (`Xmx`) sizes are the same to prevent
the heap from resizing at runtime, a very costly process.

Generally, setting the `ES_HEAP_SIZE` environment variable is preferred over setting
explicit `-Xmx` and `-Xms` values.
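
To confirm the heap a running node actually picked up, one quick check is to ask the nodes stats API. This is a sketch that assumes a node is listening locally on the default HTTP port 9200:

[source,bash]
----
# Show the maximum heap the running JVM was started with (heap_max_in_bytes)
curl -s 'localhost:9200/_nodes/stats/jvm?pretty' | grep heap_max
----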

==== Give Half Your Memory to Lucene

A common problem is configuring a heap that is _too_ large. ((("heap", "sizing and setting", "giving half your memory to Lucene"))) You have a 64GB
machine--and by golly, you want to give Elasticsearch all 64GB of memory. More
is better!

Heap is definitely important to Elasticsearch. It is used by many in-memory data
structures to provide fast operation. But with that said, there is another major
user of memory that is _off heap_: Lucene.

Lucene is designed to leverage the underlying OS for caching in-memory data structures.((("Lucene", "memory for")))
Lucene segments are stored in individual files. Because segments are immutable,
these files never change. This makes them very cache friendly, and the underlying
OS will happily keep hot segments resident in memory for faster access.

Lucene's performance relies on this interaction with the OS. But if you give all
available memory to Elasticsearch's heap, there won't be any left over for Lucene.
This can seriously impact the performance of full-text search.

The standard recommendation is to give 50% of the available memory to Elasticsearch
heap, while leaving the other 50% free. It won't go unused; Lucene will happily
gobble up whatever is left over.

[[compressed_oops]]
==== Don't Cross 32GB!
There is another reason to not allocate enormous heaps to Elasticsearch. As it turns((("heap", "sizing and setting", "32gb heap boundary")))((("32gb Heap boundary")))
out, the JVM uses a trick to compress object pointers when heaps are less than
~32GB.

In Java, all objects are allocated on the heap and referenced by a pointer.
Ordinary object pointers (OOP) point at these objects, and are traditionally
the size of the CPU's native _word_: either 32 bits or 64 bits, depending on the
processor. The pointer references the exact byte location of the value.

For 32-bit systems, this means the maximum heap size is 4GB. For 64-bit systems,
the heap size can get much larger, but the overhead of 64-bit pointers means there
is more wasted space simply because the pointer is larger. And worse than wasted
space, the larger pointers eat up more bandwidth when moving values between
main memory and various caches (LLC, L1, and so forth).

Java uses a trick called https://wikis.oracle.com/display/HotSpotInternals/CompressedOops[compressed oops]((("compressed oops")))
to get around this problem. Instead of pointing at exact byte locations in
memory, the pointers reference _object offsets_.((("object offsets"))) This means a 32-bit pointer can
reference four billion _objects_, rather than four billion bytes. Ultimately, this
means the heap can grow to around 32GB of physical size while still using a 32-bit
pointer.

Once you cross that magical ~30 - 32GB boundary, the pointers switch back to
ordinary object pointers. The size of each pointer grows, more CPU-memory
bandwidth is used, and you effectively lose memory. In fact, it takes until around
40 - 50GB of allocated heap before you have the same _effective_ memory of a 32GB
heap using compressed oops.
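
If you want to see where that cutoff lies for your particular JVM, one way is to ask the JVM itself whether compressed oops are in effect for a given heap size. This is a sketch that assumes an Oracle/OpenJDK HotSpot `java` binary on your path:

[source,bash]
----
# Shows whether UseCompressedOops is enabled for a 31GB heap;
# try larger -Xmx values to find where it flips to false
java -Xmx31g -XX:+PrintFlagsFinal -version 2>/dev/null | grep UseCompressedOops
----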

The moral of the story is this: even when you have memory to spare, try to avoid
crossing the 32GB heap boundary. It wastes memory, reduces CPU performance, and
makes the GC struggle with large heaps.

.I Have a Machine with 1TB RAM!
****
The 32GB line is fairly important. So what do you do when your machine has a lot
of memory? It is becoming increasingly common to see super-servers with 300 - 500GB
of RAM.

First, we would recommend avoiding such large machines (see <<hardware>>).

But if you already have the machines, you have two practical options:

- Are you doing mostly full-text search? Consider giving 32GB to Elasticsearch
and letting Lucene use the rest of memory via the OS filesystem cache. All that
memory will cache segments and lead to blisteringly fast full-text search.

- Are you doing a lot of sorting/aggregations? You'll likely want that memory
in the heap then. Instead of one node with 32GB+ of RAM, consider running two or
more nodes on a single machine. Still adhere to the 50% rule, though. So if your
machine has 128GB of RAM, run two nodes, each with 32GB. This means 64GB will be
used for heaps, and 64GB will be left over for Lucene.

If you choose this option, set `cluster.routing.allocation.same_shard.host: true`
in your config. This will prevent a primary and a replica shard from colocating
to the same physical machine (since this would remove the benefits of replica high availability).
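
As a concrete sketch of that two-node layout on a single 128GB machine: the node names and the 31GB heaps below are illustrative (31GB keeps each heap safely under the compressed-oops cutoff), and the `-Des.*` flags assume the 1.x-style command line; the same settings can equally live in each node's `elasticsearch.yml`.

[source,bash]
----
# Hypothetical example: start two nodes on one machine, each with its own heap
ES_HEAP_SIZE=31g ./bin/elasticsearch -Des.node.name=node-1 \
    -Des.cluster.routing.allocation.same_shard.host=true &
ES_HEAP_SIZE=31g ./bin/elasticsearch -Des.node.name=node-2 \
    -Des.cluster.routing.allocation.same_shard.host=true &
----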
****

==== Swapping Is the Death of Performance

It should be obvious,((("heap", "sizing and setting", "swapping, death of performance")))((("memory", "swapping as the death of performance")))((("swapping, the death of performance"))) but it bears spelling out clearly: swapping main memory
to disk will _crush_ server performance. Think about it: an in-memory operation
is one that needs to execute quickly.

If memory swaps to disk, a 100-microsecond operation becomes one that takes 10
milliseconds. Now repeat that increase in latency for all other 100-microsecond operations.
It isn't difficult to see why swapping is terrible for performance.

The best thing to do is disable swap completely on your system. This can be done
temporarily:

[source,bash]
----
sudo swapoff -a
----

To disable it permanently, you'll likely need to edit your `/etc/fstab`. Consult
the documentation for your OS.

If disabling swap completely is not an option, you can try to lower `swappiness`.
This value controls how aggressively the OS tries to swap memory.
A low value prevents swapping under normal circumstances, but still allows the OS to swap
under emergency memory situations.

For most Linux systems, this is configured using the `sysctl` value:

[source,bash]
----
vm.swappiness = 1 <1>
----
<1> A `swappiness` of `1` is better than `0`, since on some kernel versions a `swappiness`
of `0` can invoke the OOM-killer.
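
To apply the value right away and persist it across reboots, here is a sketch that assumes a typical Linux distribution reading `/etc/sysctl.conf` at boot (and that you have `sudo` rights):

[source,bash]
----
# Change the running kernel's swappiness immediately
sudo sysctl -w vm.swappiness=1

# Persist the setting so it survives a reboot
echo 'vm.swappiness = 1' | sudo tee -a /etc/sysctl.conf
----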

Finally, if neither approach is possible, you should enable `mlockall`.
This allows the JVM to lock its memory and prevent
it from being swapped by the OS. In your `elasticsearch.yml`, set this:
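
[source,yaml]
----
bootstrap.mlockall: true <1>
----
<1> Newer Elasticsearch releases renamed this setting to `bootstrap.memory_lock`.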