
Shard unassigned after failing to flush in Elasticsearch


So first of all, you appear to be running Elasticsearch with the indexes being created on the root partition:

Filesystem      Size  Used Avail Use% Mounted on
/dev/vda         20G   13G  6.6G  65% /
udev            237M   12K  237M   1% /dev
tmpfs            50M  216K   49M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            246M     0  246M   0% /run/shm

Generally not the best idea; if you can afford to, mount a separate drive for the data.
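
Before anything else, it can help to see how much room Elasticsearch itself reports on its data path. Below is a minimal sketch using the nodes stats API, assuming Python with the requests library and a node on localhost:9200; the exact URL form and field names vary between Elasticsearch versions, so treat them as examples.

    import requests

    resp = requests.get("http://localhost:9200/_nodes/stats/fs")
    resp.raise_for_status()

    # Print the free space Elasticsearch sees on each data path of each node.
    for node_id, node in resp.json()["nodes"].items():
        for data_path in node.get("fs", {}).get("data", []):
            print(node_id, data_path.get("path"),
                  data_path.get("available_in_bytes"), "bytes available")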

The failure happened during a Lucene index segment merge, which generally requires significant free disk space. With disk usage already at 65% on a very small partition of only 20 GB, you can easily run out of space, particularly since you are competing with the disk needs of all other processes at the same time. There is more detail here on managing and configuring the Elasticsearch merge policy:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-merge.html
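
As a hedged illustration of what tuning the merge policy can look like: the index name "myindex" is hypothetical, the values are illustrative, and the exact setting names depend on your Elasticsearch version (some may only be settable at index creation), so verify them against the page above.

    import requests

    # Tiered merge policy knobs that trade a higher segment count for smaller,
    # cheaper merges and less temporary disk usage while merging.
    settings = {
        "index.merge.policy.max_merged_segment": "1gb",  # cap the size a single merge can produce
        "index.merge.policy.segments_per_tier": 20,      # tolerate more segments before merging
    }

    resp = requests.put("http://localhost:9200/myindex/_settings", json=settings)
    print(resp.status_code, resp.json())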

You are probably not going to be able to reliably index and manage 9 GB of data on a 20 GB partition that is also the root partition, particularly if the data changes a lot. You can try to configure Elasticsearch to avoid or reduce segment merges, which can help with disk space, but even that may not be enough.

Regarding why it takes up as much space as it does: this is a function of how you are mapping your data, but by default Elasticsearch stores a copy of every document in its original form (the _source field) in addition to the indexes for each individual field.

If you really, really need to fit into a 20 GB system, I'd take a close look at your mappings and see which fields you can either not index or not store:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-source-field.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-all-field.html
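
For example, a space-conscious setup might disable _source and _all and leave a bulky field unindexed. This is only a sketch with made-up index, type, and field names; the exact mapping syntax depends on your Elasticsearch version, and turning off _source has real costs (no update API, no reindexing out of Elasticsearch itself), so read the pages above before copying it.

    import requests

    mapping = {
        "docs": {
            "_source": {"enabled": False},  # no stored copy of the original JSON
            "_all": {"enabled": False},     # no catch-all _all field
            "properties": {
                "title": {"type": "string"},
                # Not indexed and, with _source disabled, not retrievable either;
                # only do this for data you can afford to lose from the index.
                "raw_body": {"type": "string", "index": "no"},
            },
        }
    }

    resp = requests.put("http://localhost:9200/myindex", json={"mappings": mapping})
    print(resp.status_code, resp.json())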


The problem really was disk space. For some unknown reason, ES took all of the free disk space. Here's what happened:

  1. I added about 75,000 documents to the index via the bulk API (all successful); a sketch of that kind of bulk request follows this list.

  2. Then I did not touch ES at all and just monitored disk space.

  3. Within 5 minutes, all the space was taken by a few files in /var/lib/elasticsearch/elasticsearch/nodes/0/indeces/cvk/0/index/. The most space was taken by the file _3ya.fdt (about 3 GB). Right before the shard was lost there were files named _3ya_es090_0 with extensions like .tim, .pos, and .doc, about 400 MB each. After the shard was lost, all of those files were gone.
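
For context, a bulk request of the kind mentioned in step 1 looks roughly like this; the index name, type name, and documents are hypothetical, and this is a sketch in Python with the requests library rather than the exact script that was used.

    import json
    import requests

    # Hypothetical documents standing in for the ~75,000 that were indexed.
    docs = [{"title": "doc %d" % i, "body": "some text"} for i in range(1000)]

    # The bulk body is newline-delimited JSON: an action line, then the document.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": "myindex", "_type": "docs"}}))
        lines.append(json.dumps(doc))
    body = "\n".join(lines) + "\n"

    resp = requests.post("http://localhost:9200/_bulk", data=body,
                         headers={"Content-Type": "application/x-ndjson"})
    items = resp.json().get("items", [])
    failed = [item for item in items if "error" in list(item.values())[0]]
    print("%d of %d bulk actions failed" % (len(failed), len(items)))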

So the obvious solution is to add disk space.

But this raises new questions:

  1. Why does ES take up 10x more disk space than the size of the data being added?

  2. Is there a way to know when to stop adding new documents to an existing shard? (See the sketch after this list for one way to monitor this.)

  3. Will it help if we create several shards instead of one?

  4. Any other suggestions on how to get the maximum out of the current server? The server has 20 GB of space, and we only need to index about 9 GB of data for a small research project.
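
On question 2, one rough approach (a sketch only, with a hypothetical index name and a threshold chosen for a 20 GB disk; the stats URL form varies by Elasticsearch version) is to poll the index's on-disk size and pause bulk indexing while there is still headroom for segment merges:

    import requests

    LIMIT_BYTES = 10 * 1024 ** 3  # stop well short of the 20 GB partition to leave room for merges

    resp = requests.get("http://localhost:9200/myindex/_stats/store")
    resp.raise_for_status()
    size = resp.json()["_all"]["total"]["store"]["size_in_bytes"]

    if size > LIMIT_BYTES:
        print("Index store is", size, "bytes; pause bulk indexing")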