Just got back from configuring some log storage for 10 TB so let’s talk sharding 😀
Main source: The definitive guide to elasticsearch
HEAP: 32 GB at most:
If the heap is less than 32 GB, the JVM can use compressed pointers, which saves a lot of memory: 4 bytes per pointer instead of 8 bytes.
HEAP: 50% of the server memory at most. The rest is left to filesystem caches (thus 64 GB servers are a common sweet spot):
Lucene makes good use of the filesystem caches, which are managed by the kernel. Without enough filesystem cache space, performance will suffer. Furthermore, the more memory dedicated to the heap means less available for all your other fields using doc values.
[An index split in] N shards can spread the load over N servers:
1 shard can use all the processing power from 1 node (it’s like an independent index). Operations on sharded indices are run concurrently on all shards and the result is aggregated.
Less shards is better (the ideal is 1 shard):
The overhead of sharding is significant. See this benchmark for numbers https://blog.trifork.com/2014/01/07/elasticsearch-how-many-shards/
Less servers is better (the ideal is 1 server (with 1 shard)]):
The load on an index can only be split across nodes by sharding (A shard is enough to use all resources on a node). More shards allow to use more servers but more servers bring more overhead for data aggregation… There is no free lunch.
Usage: A single big index
We put everything in a single big index and let elasticsearch do all the hard work relating to sharding data. There is no logic whatsoever in the application so it’s easier to dev and maintain.
Let’s suppose that we plan for the index to be at most 111 GB in the future and we’ve got 50 GB servers (25 GB heap) from our cloud provider.
That means we should have 5 shards.
Note: Most people tend to overestimate their growth, try to be realistic. For instance, this 111GB example is already a BIG index. For comparison the stackoverflow index is 430 GB (2016) and it’s a top 50 site worldwide, made entirely of written texts by millions of people.
Usage: Index by time
When there’re too much data for a single index or it’s getting too annoying to manage, the next thing is to split the index by time period.
The most extreme example is logging applications (logstach and graylog) which are using a new index every day.
The ideal configuration of 1-single-shard-per-index makes perfect sense in scenario. The index rotation period can be adjusted, if necessary, to keep the index smaller than the heap.
Special case: Let’s imagine a popular internet forum with monthly indices. 99% of requests are hitting the last index. We have to set multiple shards (e.g. 3) to spread the load over multiple nodes. (Note: It’s probably unnecessary optimization. A 99% hitrate is unlikely in the real world and the shard replica could distribute part of the read-only load anyway).
Usage: Going Exascale (just for the record)
ElasticSearch is magic. It’s the easiest database to setup in cluster and it’s one of the very few able to scale to many nodes (excluding Spanner ).
It’s possible to go exascale with hundreds of elasticsearch nodes. There must be many indices and shards to spread the load on that many machines and that takes an appropriate sharding configuration (eventually adjusted per index).
The final bit of magic is to tune elasticsearch routing to target specific nodes for specific operations.