deleting old indexes in amazon elasticsearch
Elasticsearch 6.6 brings a new technology called Index Lifecycle Manager See here. Each index is assigned a lifecycle policy, which governs how the index transitions through specific stages until they are deleted.
For example, if you are indexing metrics data from a fleet of ATMs into Elasticsearch, you might define a policy that says:
- When the index reaches 50GB, roll over to a new index.
- Move the old index into the warm stage, mark it read only, and shrink it down to a single shard.
- After 7 days, move the index into the cold stage and move it to less expensive hardware.
- Delete the index once the required 30 day retention period is reached.
The technology is in beta stage yet, however is probably the way to go from now on.
Running curator is pretty light and easy.
Here you can find a Dockerfile, config and action-file.
https://github.com/zakkg3/curator
Also, Curator can help you if you need to (among others):
- Add or remove indices (or both!) from an alias
- Change shard routing allocation
- Delete snapshots
- Open closed indices
- forceMerge indices
- reindex indices, including from remote clusters
- Change the number of replicas per shard for indices
- rollover indices
- Take a snapshot (backup) of indices
- Restore snapshots
https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html
Here is a typical action file for delete indices older than 15 days:
actions: 1: action: delete_indices description: >- Delete indices older than 15 days (based on index name), for logstash- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly. options: ignore_empty_list: True disable_action: True filters: - filtertype: pattern kind: prefix value: logstash- - filtertype: age source: name direction: older timestring: '%Y.%m.%d' unit: days unit_count: 15
I followed the elasticsearch-curator documentation to install the package:
https://www.elastic.co/guide/en/elasticsearch/client/curator/current/pip.html
Then I used the AWS base example of how to automate the indexes cleanup using the signed based authentication provided by requests_aws4auth package:
https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/curator.html
It worked like a charm.
You can decide to run this inside a lambda, docker or include it in your own DevOps cli.