This repository was archived by the owner on Sep 21, 2021. It is now read-only.

Commit 763b31f

fd, dont touch section

1 parent 9493d5f commit 763b31f

4 files changed: +150 -30 lines

510_Deployment.asciidoc (+2)
@@ -11,6 +11,8 @@ include::510_Deployment/40_config.asciidoc[]
 
 include::510_Deployment/50_heap.asciidoc[]
 
+include::510_Deployment/60_file_descriptors.asciidoc[]
+
 === Post-Deployment
 
 -Prereqs

510_Deployment/30_other.asciidoc (-30)
@@ -35,36 +35,6 @@ clusters, the first step is often to remove all custom configurations. About
 half the time this alone restores stability and performance.
 ****
 
-=== Garbage Collector
-
-As briefly introduced in <<garbage_collector_primer>>, the JVM uses a garbage
-collector to free unused memory. This tip is really an extension of the last tip,
-but deserves its own section for emphasis:
-
-Do not change the default garbage collector!
-
-The default GC for Elasticsearch is Concurrent-Mark-Sweep (CMS). This GC
-runs concurrently with the execution of the application so that it can minimize
-pauses. It does, however, have two stop-the-world phases. It also has trouble
-collecting large heaps.
-
-Despite these downsides, it is currently the best GC for low-latency server software
-like Elasticsearch. The official recommendation is to use CMS.
-
-There is a newer GC called the Garbage First GC (G1GC). This newer GC is designed
-to minimize pausing even more than CMS and to operate on large heaps. It works
-by dividing the heap into regions and predicting which regions contain the most
-reclaimable space. By collecting those regions first ("garbage first"), it can
-minimize pauses and operate on very large heaps.
-
-Sounds great! Unfortunately, G1GC is still new, and fresh bugs are found routinely.
-These bugs are usually of the segfault variety and will cause hard crashes.
-The Lucene test suite is brutal on GC algorithms, and it seems that G1GC hasn't
-had the kinks worked out yet.
-
-We would like to recommend G1GC someday, but for now, it is simply not stable
-enough to meet the demands of Elasticsearch and Lucene.
-
 === TransportClient vs NodeClient
 
 If you are using Java, you may wonder when to use the TransportClient vs the
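The section removed above (and re-added in the new file below) warns against changing the default garbage collector. As a quick sanity check, a reader can confirm which collector a running JVM was actually started with. This is an illustrative sketch, not an official procedure; it assumes a Unix-like machine where the JVM GC flag (such as `-XX:+UseConcMarkSweepGC`, which the Elasticsearch startup script passes by default) appears on the process command line:

```shell
# Print the GC-selection flags of any running JVMs, to confirm nobody has
# overridden the default collector. Prints nothing if no JVM is running.
ps aux | grep -o '\-XX:+Use[A-Za-z]*GC' | sort -u
```

Seeing only `-XX:+UseConcMarkSweepGC` here means the node is on the recommended default.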

510_Deployment/45_dont_touch.asciidoc (+87)
@@ -0,0 +1,87 @@
+
+=== Don't touch these settings!
+
+There are a few hotspots in Elasticsearch that people just can't seem to avoid
+tweaking. We understand: knobs just beg to be turned.
+
+But of all the knobs to turn, these you should _really_ leave alone. They are
+often abused and will contribute to terrible stability or terrible performance.
+Or both.
+
+==== Garbage Collector
+
+As briefly introduced in <<garbage_collector_primer>>, the JVM uses a garbage
+collector to free unused memory. This tip is really an extension of the last tip,
+but deserves its own section for emphasis:
+
+Do not change the default garbage collector!
+
+The default GC for Elasticsearch is Concurrent-Mark-Sweep (CMS). This GC
+runs concurrently with the execution of the application so that it can minimize
+pauses. It does, however, have two stop-the-world phases. It also has trouble
+collecting large heaps.
+
+Despite these downsides, it is currently the best GC for low-latency server software
+like Elasticsearch. The official recommendation is to use CMS.
+
+There is a newer GC called the Garbage First GC (G1GC). This newer GC is designed
+to minimize pausing even more than CMS and to operate on large heaps. It works
+by dividing the heap into regions and predicting which regions contain the most
+reclaimable space. By collecting those regions first ("garbage first"), it can
+minimize pauses and operate on very large heaps.
+
+Sounds great! Unfortunately, G1GC is still new, and fresh bugs are found routinely.
+These bugs are usually of the segfault variety and will cause hard crashes.
+The Lucene test suite is brutal on GC algorithms, and it seems that G1GC hasn't
+had the kinks worked out yet.
+
+We would like to recommend G1GC someday, but for now, it is simply not stable
+enough to meet the demands of Elasticsearch and Lucene.
+
+==== Threadpools
+
+Everyone _loves_ to tweak threadpools. For whatever reason, it seems people
+cannot resist increasing thread counts. Indexing a lot? More threads! Searching
+a lot? More threads! Node idling 95% of the time? More threads!
+
+The default threadpool settings in Elasticsearch are very sensible. For all
+threadpools (except `search`), the thread count is set to the number of CPU cores.
+If you have 8 cores, you can only run 8 threads simultaneously, so it makes
+sense to assign only 8 threads to any particular threadpool.
+
+Search gets a larger threadpool and is configured to `# cores * 3`.
+
+You might argue that some threads can block (such as on a disk I/O operation),
+which is why you need more threads. This is actually not a problem in Elasticsearch:
+much of the disk I/O is handled by threads managed by Lucene, not Elasticsearch.
+
+Furthermore, threadpools cooperate by passing work between each other. You don't
+need to worry about a networking thread blocking because it is waiting on a disk
+write. The networking thread will have long since handed off that work unit to
+another threadpool and gotten back to networking.
+
+Finally, the compute capacity of your process is finite. More threads just force
+the processor to switch thread contexts. A processor can run only one thread
+at a time, so when it needs to switch to a different thread, it stores the current
+state (registers, etc.) and loads another thread. If you are lucky, the switch
+will happen on the same core. If you are unlucky, the switch may migrate to a
+different core and require a trip across the inter-core communication bus.
+
+This context switching eats up cycles simply doing administrative housekeeping
+-- estimates peg it as high as 30us on modern CPUs. So unless the thread
+will be blocked for longer than 30us, it is highly likely that the time would
+have been better spent just processing and finishing early.
+
+People routinely set threadpools to silly values. On 8-core machines, we have
+run across configs with 60, 100, or even 1000 threads. These settings simply
+thrash the CPU more than getting real work done.
+
+So. Next time you want to tweak a threadpool... please don't. And if you
+_absolutely cannot resist_, please keep your core count in mind and perhaps set
+the count to double it. More than that is just a waste.
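The threadpool guidance in the new file boils down to simple arithmetic on the core count. A back-of-the-envelope sketch; the variable names are illustrative only and are not Elasticsearch settings:

```shell
# Derive sensible threadpool sizes from the machine's core count,
# following the rules of thumb described above.
cores=$(getconf _NPROCESSORS_ONLN)   # CPU cores visible to the OS
default_pool=$cores                  # most pools: one thread per core
search_pool=$((cores * 3))           # search pool default: # cores * 3
ceiling=$((cores * 2))               # "absolutely cannot resist" ceiling: double
echo "cores=$cores default=$default_pool search=$search_pool ceiling=$ceiling"
```

On an 8-core machine this yields a default pool of 8, a search pool of 24, and a tweaking ceiling of 16; anything like 60, 100, or 1000 threads is far past the waste line.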
@@ -0,0 +1,61 @@
+
+=== File Descriptors and MMap
+
+Lucene uses a _very_ large number of files. At the same time, Elasticsearch
+uses a large number of sockets to communicate between nodes and HTTP clients.
+All of this requires available file descriptors.
+
+Sadly, many modern Linux distributions ship with a paltry 1024 file descriptors
+allowed per process. This is _far_ too low for even a small Elasticsearch
+node, let alone one that is handling hundreds of indices.
+
+You should increase your file descriptor count to something very large, such as
+64,000. This process is irritatingly difficult and highly dependent on your
+particular OS and distribution. Consult the documentation for your OS to determine
+how best to change the allowed file descriptor count.
+
+Once you think you've changed it, check Elasticsearch to make sure it really does
+have enough file descriptors. You can check with:
+
+[source,js]
+----
+GET /_nodes/process
+
+{
+   "cluster_name": "elasticsearch__zach",
+   "nodes": {
+      "TGn9iO2_QQKb0kavcLbnDw": {
+         "name": "Zach",
+         "transport_address": "inet[/192.168.1.131:9300]",
+         "host": "zacharys-air",
+         "ip": "192.168.1.131",
+         "version": "2.0.0-SNAPSHOT",
+         "build": "612f461",
+         "http_address": "inet[/192.168.1.131:9200]",
+         "process": {
+            "refresh_interval_in_millis": 1000,
+            "id": 19808,
+            "max_file_descriptors": 64000, <1>
+            "mlockall": true
+         }
+      }
+   }
+}
+----
+<1> The `max_file_descriptors` field shows the number of available descriptors
+that the Elasticsearch process can access.
+
+Elasticsearch also uses a mix of NioFS and MMapFS for the various files. Ensure
+that the maximum map count is high enough that there is ample virtual memory
+available for mmapped files. This can be set temporarily with:
+
+[source,sh]
+----
+sysctl -w vm.max_map_count=262144
+----
+
+Or permanently, by modifying the `vm.max_map_count` setting in your `/etc/sysctl.conf`.
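Both limits discussed in the new file can be inspected before starting a node. A minimal pre-flight sketch, assuming a typical Linux machine; exact commands vary by OS and distribution:

```shell
# Check the per-process file descriptor limit and the mmap region limit
# before starting Elasticsearch.
echo "file descriptor limit: $(ulimit -n)"   # should be large, e.g. 64000
echo "vm.max_map_count: $(sysctl -n vm.max_map_count 2>/dev/null || echo unknown)"   # e.g. 262144
```

If either number is at its stock default (1024 descriptors, or a low map count), raise it before the node goes into production rather than after the first "too many open files" error.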
