
Shard unassigned after failing to flush in Elasticsearch


So first of all, you appear to be running Elasticsearch with the indexes being created on the root partition:

Filesystem      Size  Used Avail Use% Mounted on
/dev/vda         20G   13G  6.6G  65% /
udev            237M   12K  237M   1% /dev
tmpfs            50M  216K   49M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            246M     0  246M   0% /run/shm

Generally not the best idea; if you can afford to, mount a separate drive for the data.
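
Before anything else, it can help to see how much room Elasticsearch itself reports on its data path. Below is a minimal sketch using the nodes stats API, assuming Python with the requests library and a node on localhost:9200; the exact URL form and field names vary between Elasticsearch versions, so treat them as examples.

    import requests

    resp = requests.get("http://localhost:9200/_nodes/stats/fs")
    resp.raise_for_status()

    # Print the free space Elasticsearch sees on each data path of each node.
    for node_id, node in resp.json()["nodes"].items():
        for data_path in node.get("fs", {}).get("data", []):
            print(node_id, data_path.get("path"),
                  data_path.get("available_in_bytes"), "bytes available")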

The failure happened during a Lucene index segment merge, which generally requires significant free disk space. With disk usage already at 65% on a very small partition of only 20 GB, you can easily run out of space, particularly since you are competing with the disk needs of all other processes at the same time. There is more detail here on managing and configuring the Elasticsearch merge policy:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-merge.html
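
As a hedged illustration of what tuning the merge policy can look like: the index name "myindex" is hypothetical, the values are illustrative, and the exact setting names depend on your Elasticsearch version (some may only be settable at index creation), so verify them against the page above.

    import requests

    # Tiered merge policy knobs that trade a higher segment count for smaller,
    # cheaper merges and less temporary disk usage while merging.
    settings = {
        "index.merge.policy.max_merged_segment": "1gb",  # cap the size a single merge can produce
        "index.merge.policy.segments_per_tier": 20,      # tolerate more segments before merging
    }

    resp = requests.put("http://localhost:9200/myindex/_settings", json=settings)
    print(resp.status_code, resp.json())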

You are probably not going to be able to reliably index and manage 9 GB of data on a 20 GB partition that is also the root partition, particularly if the data changes a lot. You can try to configure Elasticsearch to avoid or reduce segment merges, which can help with disk space, but even that may not be enough.

Regarding why it takes up as much space as it does: this is a function of how you are mapping your data, but by default Elasticsearch stores a copy of every document in its original form (the _source field) in addition to the indexes for each individual field.

If you really, really need to fit into a 20 GB system, I'd take a close look at your mappings and see which fields you can either not index or not store:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-source-field.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-all-field.html
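
For example, a space-conscious setup might disable _source and _all and leave a bulky field unindexed. This is only a sketch with made-up index, type, and field names; the exact mapping syntax depends on your Elasticsearch version, and turning off _source has real costs (no update API, no reindexing out of Elasticsearch itself), so read the pages above before copying it.

    import requests

    mapping = {
        "docs": {
            "_source": {"enabled": False},  # no stored copy of the original JSON
            "_all": {"enabled": False},     # no catch-all _all field
            "properties": {
                "title": {"type": "string"},
                # Not indexed and, with _source disabled, not retrievable either;
                # only do this for data you can afford to lose from the index.
                "raw_body": {"type": "string", "index": "no"},
            },
        }
    }

    resp = requests.put("http://localhost:9200/myindex", json={"mappings": mapping})
    print(resp.status_code, resp.json())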


The problem really was disk space. For some unknown reason, ES took all of the free disk space. Here's what happened:

  1. I added about 75,000 documents to the index via the bulk API (all successful); a sketch of that kind of bulk request follows this list.

  2. Then I did not touch ES at all and just monitored disk space.

  3. Within 5 minutes, all the space was taken by a few files in /var/lib/elasticsearch/elasticsearch/nodes/0/indeces/cvk/0/index/. The most space was taken by the file _3ya.fdt (about 3 GB). Right before the shard was lost there were files named _3ya_es090_0 with extensions like .tim, .pos, and .doc, about 400 MB each. After the shard was lost, all of those files were gone.
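
For context, a bulk request of the kind mentioned in step 1 looks roughly like this; the index name, type name, and documents are hypothetical, and this is a sketch in Python with the requests library rather than the exact script that was used.

    import json
    import requests

    # Hypothetical documents standing in for the ~75,000 that were indexed.
    docs = [{"title": "doc %d" % i, "body": "some text"} for i in range(1000)]

    # The bulk body is newline-delimited JSON: an action line, then the document.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": "myindex", "_type": "docs"}}))
        lines.append(json.dumps(doc))
    body = "\n".join(lines) + "\n"

    resp = requests.post("http://localhost:9200/_bulk", data=body,
                         headers={"Content-Type": "application/x-ndjson"})
    items = resp.json().get("items", [])
    failed = [item for item in items if "error" in list(item.values())[0]]
    print("%d of %d bulk actions failed" % (len(failed), len(items)))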

So the obvious solution is to add disk space.

But this raises new questions:

  1. Why does ES take up 10x more disk space than the size of the data being added?

  2. Is there a way to know when to stop adding new documents to an existing shard? (See the sketch after this list for one way to monitor this.)

  3. Will it help if we create several shards instead of one?

  4. Any other suggestions on how to get the maximum out of the current server? The server has 20 GB of space, and we only need to index about 9 GB of data for a small research project.
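
On question 2, one rough approach (a sketch only, with a hypothetical index name and a threshold chosen for a 20 GB disk; the stats URL form varies by Elasticsearch version) is to poll the index's on-disk size and pause bulk indexing while there is still headroom for segment merges:

    import requests

    LIMIT_BYTES = 10 * 1024 ** 3  # stop well short of the 20 GB partition to leave room for merges

    resp = requests.get("http://localhost:9200/myindex/_stats/store")
    resp.raise_for_status()
    size = resp.json()["_all"]["total"]["store"]["size_in_bytes"]

    if size > LIMIT_BYTES:
        print("Index store is", size, "bytes; pause bulk indexing")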