Comments
We would need far more data...
Memory allocation in elasticsearch is a very complicated thing.
Memory in Elasticsearch is split into several segments, some of which may not belong to Elasticsearch itself.
5.6 is ancient... If you can, I highly recommend upgrading to 6 so that you have an upgrade path to 7. 8 is already in the works and will become reality soon.
JDK 8 with G1GC isn't a silver bullet. Far from it... G1GC in JDK 8 is an entirely different thing than G1GC in JDK 11. People tend to treat them the same - they're not.
Your JDK 8 is hopefully up to date? 8u292 if it comes from the repository, 8u302 otherwise?
https://wiki.openjdk.java.net/displ...
If not, stop immediately and update first.
Lots of stuff was backported to JDK 8 which could help.
The JVM settings for Elasticsearch I use are backported from the Elasticsearch main repository.
I would advise against fiddling with them - as Elasticsearch allocates and uses the JDK heap for several entirely different things, you are more likely to make things worse than better....
https://pastebin.com/12xFgaM5
You should check the environment - I wrote systemd units to start Elasticsearch that ensure the limits / sysctl configuration is sane.
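To make "sane limits" a bit more concrete, here is a minimal Python sketch of such a check. It is not the systemd unit mentioned above. The thresholds are assumptions based on the values the Elasticsearch bootstrap checks usually ask for (`vm.max_map_count` of 262144, 65536 open files), and the function names `find_limit_problems` and `read_current_limits` are made up for this example. It reads the limits of the process running the script, so it only says something about Elasticsearch if it is started from the same environment.

```python
import resource
from pathlib import Path

# Assumed minimums; verify them against the docs of your Elasticsearch version.
MIN_MAX_MAP_COUNT = 262144
MIN_NOFILE = 65536


def find_limit_problems(max_map_count, nofile_soft, memlock_soft):
    """Return a list of human-readable problems for the given limit values."""
    problems = []
    if max_map_count < MIN_MAX_MAP_COUNT:
        problems.append(
            f"vm.max_map_count is {max_map_count}, expected >= {MIN_MAX_MAP_COUNT}"
        )
    if nofile_soft != resource.RLIM_INFINITY and nofile_soft < MIN_NOFILE:
        problems.append(f"nofile is {nofile_soft}, expected >= {MIN_NOFILE}")
    # Only relevant if bootstrap.memory_lock is enabled in elasticsearch.yml.
    if memlock_soft != resource.RLIM_INFINITY:
        problems.append("memlock is not unlimited; memory locking may fail")
    return problems


def read_current_limits():
    """Read the limits of the current process (Linux only)."""
    max_map_count = int(Path("/proc/sys/vm/max_map_count").read_text())
    nofile_soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
    memlock_soft, _ = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return max_map_count, nofile_soft, memlock_soft
```

On a Linux box, `find_limit_problems(*read_current_limits())` returns an empty list when all three values look fine.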
Now to the interesting parts: Disable xpack monitoring. Check if problem persists.
The monitoring via Marvel / XPack sucks in ES. In 7 they fixed several things, especially regarding its resource usage.
I'd recommend Prometheus with a plugin for that.
Then take a look at the cache statistics:
https://elastic.co/guide/en/...
You'll find most of the interesting caches in the Modules - Indices category of the docs....
It comes down to analyzing node statistics, cache usage and query behaviour.
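As a rough illustration of what looking at cache statistics can mean, the sketch below summarizes the `query_cache`, `request_cache` and `fielddata` sections of a node stats response (the JSON you get from `GET _nodes/stats/indices`). The function name `summarize_caches` and the sample numbers are invented for this example; only the field names come from the node stats API.

```python
def summarize_caches(node_stats):
    """Compute hit ratio, evictions and memory per cache for every node."""
    summary = {}
    for node_id, node in node_stats["nodes"].items():
        indices = node["indices"]
        row = {}
        for cache in ("query_cache", "request_cache"):
            stats = indices[cache]
            lookups = stats["hit_count"] + stats["miss_count"]
            row[cache] = {
                "hit_ratio": stats["hit_count"] / lookups if lookups else None,
                "evictions": stats["evictions"],
                "memory_mb": stats["memory_size_in_bytes"] / 1024 ** 2,
            }
        # Fielddata has no hit/miss counters, only size and evictions.
        fielddata = indices["fielddata"]
        row["fielddata"] = {
            "evictions": fielddata["evictions"],
            "memory_mb": fielddata["memory_size_in_bytes"] / 1024 ** 2,
        }
        summary[node.get("name", node_id)] = row
    return summary


# Made-up sample in the shape of a node stats response.
sample = {
    "nodes": {
        "abc123": {
            "name": "node-1",
            "indices": {
                "query_cache": {"hit_count": 750, "miss_count": 250,
                                "evictions": 40, "memory_size_in_bytes": 104857600},
                "request_cache": {"hit_count": 0, "miss_count": 0,
                                  "evictions": 0, "memory_size_in_bytes": 0},
                "fielddata": {"evictions": 3, "memory_size_in_bytes": 52428800},
            },
        }
    }
}
print(summarize_caches(sample))
```

A low hit ratio combined with many evictions hints that a cache is thrashing, which is the kind of pattern worth graphing over time.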
There is seldom a silver bullet here...
Monitoring (e.g. with Grafana) is the key to finding out why GCing is necessary.
GCs aren't necessarily a bad thing though - except when (as you said) they run for dozens of seconds.
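If you don't have Grafana set up yet, a quick way to quantify GC behaviour is to take two samples of the `jvm.gc.collectors` section from `GET _nodes/stats/jvm` and diff them. The sketch below does exactly that for one node. The function name `gc_deltas` and the sample numbers are made up for illustration (loosely modelled on the numbers in the question), only the field names come from the node stats API.

```python
def gc_deltas(before, after):
    """Diff two samples of a node's jvm.gc.collectors section."""
    result = {}
    for name, new in after.items():
        old = before[name]
        count = new["collection_count"] - old["collection_count"]
        millis = new["collection_time_in_millis"] - old["collection_time_in_millis"]
        result[name] = {
            "collections": count,
            "total_ms": millis,
            # Average pause per collection in the sampled interval.
            "avg_ms": millis / count if count else 0.0,
        }
    return result


# Two made-up samples, taken about a minute apart.
before = {
    "young": {"collection_count": 1000, "collection_time_in_millis": 50000},
    "old": {"collection_count": 4, "collection_time_in_millis": 240000},
}
after = {
    "young": {"collection_count": 1025, "collection_time_in_millis": 51250},
    "old": {"collection_count": 5, "collection_time_in_millis": 300000},
}
print(gc_deltas(before, after))
```

Frequent but short young collections are usually harmless. It's the average pause of the old collector that tells you whether the node stalls.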
Regarding shards: Shard size might play a role.
If the shard size is extremely diverse in the cluster, e.g.
Index 1 - 6 Shards, each 5 GB
Index 2 - 5 Shards, each 30 GB
Index 3 - 8 Shards, each 50 GB
You can have the funny problem of load on the smaller shards "exiling" the larger shards from memory.
It makes sense to manually partition the cluster by node attributes / index settings so that each node hosts indexes of nearly equal shard size.
E.g.
Nodes 1 - 2 host indexes with a max. shard size of 10 GB
Nodes 3 - 4 host indexes with a max. shard size of 30 GB
....
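The partitioning above can be sketched with shard allocation filtering. This assumes each node is tagged with a custom attribute in elasticsearch.yml, e.g. `node.attr.shard_tier: small`. The attribute name `shard_tier`, the tier names and the size boundaries are hypothetical and only mirror the example above. The Python snippet just builds the settings body that you would send to `PUT <index>/_settings` for each index.

```python
# Hypothetical tiers: (name, max shard size in GB), smallest first.
TIERS = [("small", 10), ("medium", 30), ("large", float("inf"))]


def tier_for(shard_size_gb):
    """Return the first tier whose limit fits the given shard size."""
    for name, max_gb in TIERS:
        if shard_size_gb <= max_gb:
            return name


def allocation_settings(index_shard_sizes_gb):
    """Map each index to the settings body that pins it to its tier."""
    return {
        index: {"index.routing.allocation.require.shard_tier": tier_for(size)}
        for index, size in index_shard_sizes_gb.items()
    }


# Shard sizes from the example above.
settings = allocation_settings({"index1": 5, "index2": 30, "index3": 50})
for index, body in settings.items():
    print(index, body)
```

With `require`, an index is only allocated to nodes carrying the matching attribute value, so make sure every tier has enough nodes to hold primaries and replicas.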
Any Elasticsearch gurus here? I have a box with too many young gen GCs (one every 2 or 3 seconds), and irregular, very long old gen GCs (one every several hours, taking around a minute and freeing about 2/3 of the old gen space) -- I was thinking of changing the new gen ratio from 2/3 to something like 3/4 or 4/5.
However, after reading an elastic article about settings to never touch... I'm no longer so sure...
Only other option I was considering is going from CMS to G1GC to cut back on the old gen GC time... A minute long downtime for Elastic is rather problematic.
Any thoughts? The box is rather old - running Elastic 5.6 with 20 GB of heap, 207 shards and 306k docs.
question
heap
elasticsearch
jvm