deleting old indexes in amazon elasticsearch


Elasticsearch 6.6 introduces a feature called Index Lifecycle Management (ILM). Each index is assigned a lifecycle policy, which governs how the index transitions through specific stages until it is deleted.

For example, if you are indexing metrics data from a fleet of ATMs into Elasticsearch, you might define a policy that says:

  1. When the index reaches 50GB, roll over to a new index.
  2. Move the old index into the warm stage, mark it read only, and shrink it down to a single shard.
  3. After 7 days, move the index into the cold stage and move it to less expensive hardware.
  4. Delete the index once the required 30 day retention period is reached.
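As a rough illustration, the four steps above could be written as an ILM policy body along the following lines. This is a minimal sketch, not a tested production policy: the function name `build_ilm_policy` is hypothetical, the `box_type: cold` node attribute is an assumption about how the cheaper nodes are tagged, and the phase and action names reflect my understanding of the ILM API, so check them against the documentation for your Elasticsearch version.

```python
import json


def build_ilm_policy(rollover_size="50gb", cold_after="7d", delete_after="30d"):
    """Return an ILM policy body mirroring the four example steps (hypothetical helper)."""
    return {
        "policy": {
            "phases": {
                # Step 1: roll over to a new index once the current one reaches the size limit.
                "hot": {"actions": {"rollover": {"max_size": rollover_size}}},
                # Step 2: after rollover, mark the old index read-only and shrink it to one shard.
                "warm": {
                    "actions": {"readonly": {}, "shrink": {"number_of_shards": 1}}
                },
                # Step 3: after 7 days, reallocate to nodes tagged as cold.
                # "box_type" is an assumed node attribute name.
                "cold": {
                    "min_age": cold_after,
                    "actions": {"allocate": {"require": {"box_type": "cold"}}},
                },
                # Step 4: delete once the retention period has passed.
                "delete": {"min_age": delete_after, "actions": {"delete": {}}},
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the ILM policy endpoint.
    print(json.dumps(build_ilm_policy(), indent=2))
```

The printed JSON is what you would send in the body of a request that creates the policy; the function only builds the document and does not talk to a cluster.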

The feature is still in beta, but it is probably the way to go from now on.


Running Curator is pretty light and easy.

Here you can find a Dockerfile, config and action-file.

https://github.com/zakkg3/curator

Curator can also help if you need to, among other things:

  • Add or remove indices (or both!) from an alias
  • Change shard routing allocation
  • Delete snapshots
  • Open closed indices
  • forceMerge indices
  • reindex indices, including from remote clusters
  • Change the number of replicas per shard for indices
  • rollover indices
  • Take a snapshot (backup) of indices
  • Restore snapshots

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html

Here is a typical action file that deletes indices older than 15 days:

     actions:
       1:
         action: delete_indices
         description: >-
           Delete indices older than 15 days (based on index name), for logstash-
           prefixed indices. Ignore the error if the filter does not result in an
           actionable list of indices (ignore_empty_list) and exit cleanly.
         options:
           ignore_empty_list: True
           # While disable_action is True, Curator skips this action;
           # set it to False to actually delete indices.
           disable_action: True
         filters:
         - filtertype: pattern
           kind: prefix
           value: logstash-
         - filtertype: age
           source: name
           direction: older
           timestring: '%Y.%m.%d'
           unit: days
           unit_count: 15
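To make the two filters in that action file concrete, here is a simplified sketch of the selection logic in plain Python. It is not Curator's implementation: the function name `indices_older_than` is made up for illustration, and it assumes the date is exactly the part of the index name that follows the prefix.

```python
from datetime import datetime, timedelta, timezone


def indices_older_than(names, prefix="logstash-", timestring="%Y.%m.%d",
                       unit_count=15, now=None):
    """Return the index names that match the prefix and whose date in the
    name is more than `unit_count` days before `now` (hypothetical helper)."""
    if now is None:
        # Naive UTC timestamp, comparable with the naive dates parsed below.
        now = datetime.now(timezone.utc).replace(tzinfo=None)
    cutoff = now - timedelta(days=unit_count)

    selected = []
    for name in names:
        # "pattern" filter: keep only indices with the given prefix.
        if not name.startswith(prefix):
            continue
        # "age" filter with source "name": read the date from the index name.
        try:
            stamp = datetime.strptime(name[len(prefix):], timestring)
        except ValueError:
            continue  # the suffix is not a date in the expected format
        if stamp < cutoff:
            selected.append(name)
    return selected
```

For example, with `now` set to 20 March 2019, `logstash-2019.03.01` would be selected and `logstash-2019.03.10` would not.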


I followed the elasticsearch-curator documentation to install the package:

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/pip.html

Then I used the AWS base example showing how to automate index cleanup using the signature-based (SigV4) authentication provided by the requests_aws4auth package:

https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/curator.html
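For orientation, the approach looks roughly like the sketch below. It is written from memory of that style of example rather than copied from the AWS page, so treat it as an assumption-laden outline: the function name `delete_old_indices` and its parameters are hypothetical, and it assumes the Curator 5.x Python API together with an elasticsearch-py release that still provides `RequestsHttpConnection`.

```python
def delete_old_indices(host, region, prefix="logstash-", days=15):
    """Delete indices older than `days` on an Amazon ES domain (hypothetical helper).

    `host` is the domain endpoint without the scheme; `region` is the AWS region.
    Returns the list of index names that were selected for deletion.
    """
    # Third-party packages (boto3, requests_aws4auth, elasticsearch,
    # elasticsearch-curator) are imported here so that the module can be
    # loaded even where they are not installed.
    import boto3
    import curator
    from elasticsearch import Elasticsearch, RequestsHttpConnection
    from requests_aws4auth import AWS4Auth

    # Sign every request with the credentials of the current AWS identity.
    credentials = boto3.Session().get_credentials()
    awsauth = AWS4Auth(credentials.access_key, credentials.secret_key,
                       region, "es", session_token=credentials.token)

    es = Elasticsearch(
        hosts=[{"host": host, "port": 443}],
        http_auth=awsauth,
        use_ssl=True,
        verify_certs=True,
        connection_class=RequestsHttpConnection,
    )

    # Same selection as the action file: prefix filter, then age by name.
    index_list = curator.IndexList(es)
    index_list.filter_by_regex(kind="prefix", value=prefix)
    index_list.filter_by_age(source="name", direction="older",
                             timestring="%Y.%m.%d", unit="days",
                             unit_count=days)

    selected = list(index_list.indices)
    if selected:  # skip the delete call when nothing matched
        curator.DeleteIndices(index_list).do_action()
    return selected
```

The identity running this code needs permission to call the domain, and the function deletes data, so it is worth trying it first against a test domain.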

It worked like a charm.

You can run this inside a Lambda function, in a Docker container, or include it in your own DevOps CLI.