Which Elasticsearch Indices Need Optimisation?


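Indices in ES are, basically, files on disk: each shard of an index is a Lucene index made up of segment files. Segments are not modified once written. Newly indexed documents are buffered in memory and written out as a new segment on each refresh, so an index that receives writes accumulates many small segments over time. The optimization process (the _optimize API, renamed _forcemerge in later versions) merges smaller Lucene segments into larger ones.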

When a delete or an update is performed on an index (an update deletes the old version of the document and indexes the new version), the document isn't actually removed, only marked as deleted. The documents marked as deleted are physically removed only when the segments that contain them are merged.
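The behaviour described above can be illustrated with a toy model. This is only a sketch of the idea, not how Lucene is implemented: the `Segment` class and the `delete`, `update` and `merge` helpers are made-up names for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    # doc_id -> source; a segment is written once and never edited in place
    docs: dict
    # ids marked as deleted; these documents still occupy space in the segment
    deleted: set = field(default_factory=set)

    def live_docs(self):
        return {k: v for k, v in self.docs.items() if k not in self.deleted}


def delete(segments, doc_id):
    """Mark doc_id as deleted in every segment that holds a live copy."""
    for seg in segments:
        if doc_id in seg.docs and doc_id not in seg.deleted:
            seg.deleted.add(doc_id)


def update(segments, doc_id, source):
    """An update is a delete of the old version plus an index of the new one."""
    delete(segments, doc_id)
    segments.append(Segment(docs={doc_id: source}))


def merge(segments):
    """Copy only live documents into one new segment; marked ones are dropped."""
    merged = {}
    for seg in segments:
        merged.update(seg.live_docs())
    return [Segment(docs=merged)]


segments = [Segment(docs={"1": "a", "2": "b"})]
update(segments, "1", "a2")   # old "1" is marked, new "1" lands in a new segment
delete(segments, "2")         # "2" is marked but still stored
stored_before = sum(len(s.docs) for s in segments)
segments = merge(segments)
stored_after = sum(len(s.docs) for s in segments)
```

Before the merge three documents are stored although only one is live; after the merge only the live one remains, which is where the reclaimed disk space comes from.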

This is why looking at the number of deleted documents and then merging can reclaim disk space. Usually an explicit optimize is not needed, because ES merges segments automatically in the background. If you really want to run it, beware that it consumes I/O and CPU cycles. One scenario where it can be useful is for indices that are unlikely to change in the future (logs from the past, for example). Doing this manually in other scenarios is not recommended.
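To see how many deleted documents each index carries, one option is to look at the docs.count and docs.deleted columns of the _cat/indices API. The sketch below works on rows shaped like the output of `GET _cat/indices?format=json` and assumes docs.count reports live documents only. The function names, the sample rows and the 0.2 threshold are arbitrary choices for illustration, not recommendations from ES.

```python
def deleted_ratio(row):
    """Fraction of stored documents in an index that are marked as deleted."""
    # Values arrive as strings; closed indices may report null for both fields.
    live = int(row.get("docs.count") or 0)
    deleted = int(row.get("docs.deleted") or 0)
    total = live + deleted
    return deleted / total if total else 0.0


def rank_by_deleted(rows, min_ratio=0.2):
    """Return (index, ratio) pairs at or above min_ratio, highest ratio first."""
    ranked = [(row["index"], deleted_ratio(row)) for row in rows]
    ranked = [item for item in ranked if item[1] >= min_ratio]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical sample rows, shaped like the JSON output of _cat/indices.
rows = [
    {"index": "logs-2015.01.01", "docs.count": "900", "docs.deleted": "100"},
    {"index": "users", "docs.count": "600", "docs.deleted": "400"},
    {"index": "closed-index", "docs.count": None, "docs.deleted": None},
]
candidates = rank_by_deleted(rows)
```

A high ratio only tells you how much space a merge could reclaim. Whether a manual merge is a good idea still depends on whether the index is still being written to.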

"Which indices need optimisation?" Those that you know are unlikely to ever change (no more writes to them). Ideally, such an index has only one segment per shard, since searching a single segment performs better than searching many segments.
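For time-based log indices, picking the ones that are no longer written to can be as simple as checking the date in the index name. The sketch below assumes a hypothetical naming scheme of `logs-YYYY.MM.DD` and a 7-day cut-off, so adjust both to your setup. It only builds the request path and does not send anything. The max_num_segments parameter exists on both the older _optimize endpoint and the newer _forcemerge endpoint.

```python
from datetime import date, datetime, timedelta


def old_daily_indices(names, today, prefix="logs-", min_age_days=7):
    """Return daily indices whose date suffix is at least min_age_days old."""
    cutoff = today - timedelta(days=min_age_days)
    selected = []
    for name in names:
        if not name.startswith(prefix):
            continue
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date, so skip this index
        if day <= cutoff:
            selected.append(name)
    return sorted(selected)


def force_merge_path(index, max_num_segments=1, legacy=False):
    """Build the path for a POST request that merges an index down to N segments."""
    # legacy=True targets the older _optimize endpoint name.
    endpoint = "_optimize" if legacy else "_forcemerge"
    return f"/{index}/{endpoint}?max_num_segments={max_num_segments}"


names = ["logs-2015.01.01", "logs-2015.01.09", "users", "logs-misc"]
to_merge = old_daily_indices(names, today=date(2015, 1, 10))
paths = [force_merge_path(name) for name in to_merge]
```

Here only logs-2015.01.01 is selected. Sending the resulting POST requests one index at a time, during a quiet period, keeps the extra I/O and CPU load manageable.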

Also, I suggest this reading about optimization.