Find ES bottleneck in bulk (bigdesk screenshots attached)


So Nate's script was (among other things) reducing the refresh interval. Let me add some other findings as well:
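
For reference, the usual trick (and presumably roughly what the script does, though I haven't read it) is to disable the refresh interval before the bulk load and restore it afterwards. Here is a minimal sketch with Python and the `requests` library; the host `localhost:9200` and the index name `myindex` are placeholders:

```python
import requests

ES = "http://localhost:9200"   # assumption: a local node
INDEX = "myindex"              # placeholder index name


def set_refresh(interval):
    # refresh_interval is a dynamic index setting; "-1" disables refreshing.
    resp = requests.put(
        f"{ES}/{INDEX}/_settings",
        json={"index": {"refresh_interval": interval}},
    )
    resp.raise_for_status()


set_refresh("-1")        # turn refresh off for the duration of the bulk load
try:
    pass                 # ... run the bulk indexing here ...
finally:
    set_refresh("1s")    # restore the default once the load is done
```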

The refresh rate was indeed stressing the cluster, but I kept digging and found more "errors". One gotcha was that I was still using the deprecated S3 gateway. S3 is persistent, but much slower than an EC2 volume.

Not only was I using S3 as the data storage, it was also in a different region (EC2 in Virginia -> S3 in Oregon), so every document was shipped across regions over the network. I had ended up with that setup because some old tutorials list S3 as the cloud data storage option.
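
If you want to double-check what your own nodes are doing, the nodes info API exposes the effective settings, so you can see whether a gateway (and which bucket/region) is configured. A small Python sketch; the endpoint name is the 1.x-era API and the host is a placeholder, so adjust for your version:

```python
import json
import requests

ES = "http://localhost:9200"   # assumption: any node in the cluster

# Dump each node's effective settings and look for gateway.* / cloud.aws.*
# entries. (Endpoint per the 1.x nodes-info API; older releases differ.)
resp = requests.get(f"{ES}/_nodes/settings")
resp.raise_for_status()

for node_id, info in resp.json()["nodes"].items():
    print(info.get("name", node_id))
    print(json.dumps(info.get("settings", {}), indent=2))
```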

After fixing that, the "Documents deleted" figure below looked much better; when I was using S3 it was around 30%. This is from the ElasticHQ plugin.
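
The same number is available without the plugin from the index stats API. A quick Python sketch (index name and host are placeholders; the metric-filtered `_stats/docs` path is the 1.x-style API) that computes the deleted-documents ratio:

```python
import requests

ES = "http://localhost:9200"   # assumption: a local node
INDEX = "myindex"              # placeholder index name

# Index stats expose docs.count and docs.deleted, the numbers ElasticHQ charts.
stats = requests.get(f"{ES}/{INDEX}/_stats/docs").json()
docs = stats["indices"][INDEX]["primaries"]["docs"]

total = docs["count"] + docs["deleted"]
ratio = docs["deleted"] / total if total else 0.0
print(f"{INDEX}: {docs['deleted']} deleted of {total} ({ratio:.1%})")
```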

[screenshot: FS Ops]

Now that we have optimized I/O, let's see what else we can do.

I also found out that CPU is an issue. Although bigdesk says the workload was minimal, t1.micros are not meant for sustained CPU usage. So even though the charts show the CPU as not fully used, that is only because Amazon throttles it in intervals; in reality it is maxed out.
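
One way to see the throttling that the charts hide is the "steal" column of /proc/stat: on a throttled burstable instance it climbs even while user/system CPU looks low. A rough Python sketch (Linux-only, and the exact behaviour depends on the hypervisor):

```python
import time


def cpu_times():
    # First line of /proc/stat: cpu user nice system idle iowait irq softirq steal ...
    with open("/proc/stat") as f:
        fields = f.readline().split()[1:]
    values = [int(v) for v in fields]
    return sum(values), values[7] if len(values) > 7 else 0  # total, steal


# Sample twice and report the share of CPU time stolen by the hypervisor.
total1, steal1 = cpu_times()
time.sleep(5)
total2, steal2 = cpu_times()

delta_total = total2 - total1
steal_pct = 100.0 * (steal2 - steal1) / delta_total if delta_total else 0.0
print(f"steal over the last 5s: {steal_pct:.1f}%")
```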

If you index bigger, more complex documents, it will stress the server even more.

Happy dev-oping.


Can you run the IndexPerfES.sh script against the index you are bulk indexing to? We can then see whether the performance improves. I think the refresh rate is degrading performance and is perhaps causing stress on the cluster, leading to problems. Let me know and we can work this out.