Saturday 21 November 2020

Elasticsearch shard allocation failure

Recently, I ran into a shard allocation failure in an Elasticsearch cluster. The reason given for the failure was the following:

"Validation Failed: 1: this action would add [4] total shards, but this cluster currently has [4000]/[4000] maximum shards open;"

It is not obvious from the error message why the cluster treats 4000 as its maximum number of shards, especially since a different cluster reports 16000 as its maximum.

Looking at my cluster settings (including defaults), I found a setting called "max_shards_per_node" (full name "cluster.max_shards_per_node"), and it was set to "1000". Since I had 4 nodes, it added up perfectly: 4 × 1000 = 4000.
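The arithmetic behind the error can be sketched in a few lines of Python. This is only an illustration of the check, not Elasticsearch's own code: the function names are made up for this post, and it assumes the cluster-wide limit is the per-node setting multiplied by the number of data nodes.

```python
def cluster_shard_limit(max_shards_per_node: int, data_nodes: int) -> int:
    """Cluster-wide cap on open shards (assumed: per-node limit x data nodes)."""
    return max_shards_per_node * data_nodes


def would_exceed_limit(open_shards: int, new_shards: int, limit: int) -> bool:
    """True if adding new_shards would push the cluster past its limit."""
    return open_shards + new_shards > limit


# Values taken from the error message and settings above.
limit = cluster_shard_limit(max_shards_per_node=1000, data_nodes=4)
print(limit)                               # 4000
print(would_exceed_limit(4000, 4, limit))  # True
```

By the same formula, a cluster reporting 16000 could be, for example, 16 data nodes at 1000 each. That is just one combination that fits the number.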

Generally, this limit is there to keep individual nodes from being overloaded with too many shards. However, we knew that our need for more shards was temporary, so raising the value to 1500 worked as a short-term fix.
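For reference, a change like this goes through the cluster settings API (PUT _cluster/settings). Below is a minimal Python sketch that only builds the request. The helper name, the http://localhost:9200 address, the lack of authentication, and the choice of a "persistent" rather than "transient" setting are all assumptions for illustration, so adjust them for your cluster.

```python
import json
import urllib.request


def build_settings_request(base_url: str, max_shards_per_node: int,
                           scope: str = "persistent") -> urllib.request.Request:
    """Build (but do not send) a PUT _cluster/settings request.

    Hypothetical helper; scope is "persistent" or "transient".
    """
    body = {scope: {"cluster.max_shards_per_node": max_shards_per_node}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed address of a local, unsecured node.
req = build_settings_request("http://localhost:9200", 1500)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually apply the change, send the request:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

With 4 nodes, a value of 1500 raises the cluster-wide limit from 4000 to 6000. Since the increase was meant to be temporary, it is worth setting the value back once the extra shards are gone.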
