Elasticsearch and a Kubernetes Curator

What about automatic old-indices deletion on Elasticsearch?

We don’t have it, sir…. but we can give you a beautiful Python script to do that….

What’s curator?

A Python script that performs tasks on Elasticsearch.

We assume we are working against Elasticsearch Cloud, but you can adapt this to another type of Elasticsearch deployment. The tasks we want to perform are:

  • Close indices older than 30 days
  • Delete indices older than 60 days

There are other tasks you can perform, but these two should give you an idea of what curator is capable of.
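To make the two thresholds concrete, here is a minimal Python sketch of the decision taken for each index. It is not curator's code: the `planned_action` helper, the `logstash-` prefix and the `%Y.%m.%d` date suffix are assumptions made for illustration.

```python
from datetime import date, datetime

CLOSE_AFTER_DAYS = 30   # from the task list above
DELETE_AFTER_DAYS = 60  # from the task list above


def planned_action(index_name: str, today: date) -> str:
    """Return 'delete', 'close' or 'keep' for an index named <prefix>-YYYY.MM.DD.

    The naming scheme is an assumption; adapt the parsing to your own indices.
    """
    # The date is assumed to be the last dash-separated token of the name.
    suffix = index_name.rsplit("-", 1)[-1]
    created = datetime.strptime(suffix, "%Y.%m.%d").date()
    age_days = (today - created).days
    if age_days > DELETE_AFTER_DAYS:
        return "delete"
    if age_days > CLOSE_AFTER_DAYS:
        return "close"
    return "keep"


if __name__ == "__main__":
    today = date(2020, 3, 31)
    for name in ("logstash-2020.03.20", "logstash-2020.02.15", "logstash-2020.01.01"):
        print(name, "->", planned_action(name, today))
```

The delete check comes first because any index older than 60 days is also older than 30 days.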

Whether to use the close task needs careful evaluation, because of this disclaimer on the Elastic site:

Enables closing indices in Elasticsearch version 2.2 and later. You might enable this setting temporarily in order to change the analyzer configuration for an existing index. We strongly recommend leaving this set to false (the default) otherwise. Closed indices are a data loss risk: If you close an index, it is not included in snapshots and you will not be able to restore the data. Similarly, closed indices are not included when you make cluster configuration changes, such as scaling to a different capacity, failover, and many other operations. Lastly, closed indices can lead to inaccurate disk space counts.

So keep in mind that, in what follows, the “close” task is disabled, but it is there in case you want to use it.

Shhh… it’s secret

Ok, first we need to have a secret between us… to hold the Elasticsearch credentials. (It is good practice to keep secrets secret.)

This is the yaml to create a secret:

Replace the values with their base64 representations and apply the file.

Remember to generate your base64 strings this way:

This will be the example’s result:

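Copy your string without the final %. (Some shells, such as zsh, print that % to mark output that does not end in a newline; it is not part of the encoded value.) In this case this is what you need: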
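If you would rather not depend on shell behaviour, the same encoding can be sketched in Python. The user name and password below are hypothetical placeholders, and `encode_secret_value` is just an illustrative helper.

```python
import base64


def encode_secret_value(plain: str) -> str:
    """Base64-encode a value for the data field of a Kubernetes Secret."""
    # Encode exactly the given characters: no trailing newline is added.
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    print(encode_secret_value("elastic"))
    print(encode_secret_value("changeme"))
```

Because the string is encoded as given, there is no stray newline to strip and no trailing marker to ignore.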

CronJob

Since this task must run once every n days (e.g. once per month), we use a Kubernetes CronJob. (curator can also be run standalone or just in Docker.)

The curator docker image

We use a custom image so we can pick the curator version we want and add a small script as the container’s command.

This script writes the Elasticsearch credentials from environment variables (which in turn come from the Kubernetes Secret) into curator’s config file.

This script is quite simple:

It replaces the values in the config file and then runs curator.

The file must first be copied from curator.yml.sample to curator.yml, because the file mounted from the ConfigMap is read-only and we need to modify the Elasticsearch credentials in it.
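The copy-then-replace step can be sketched as follows. This is a Python stand-in, not the script shipped in the image: the `ES_USER` and `ES_PASSWORD` variable names and the `__ES_USER__` / `__ES_PASSWORD__` placeholders are assumptions made for illustration.

```python
import os
from pathlib import Path

# Hypothetical mapping: placeholder in the sample file -> environment variable.
PLACEHOLDERS = {
    "__ES_USER__": "ES_USER",
    "__ES_PASSWORD__": "ES_PASSWORD",
}


def substitute(text: str, env) -> str:
    """Return `text` with every placeholder replaced by its value from `env`."""
    for placeholder, var_name in PLACEHOLDERS.items():
        text = text.replace(placeholder, env[var_name])  # KeyError if unset
    return text


def render_config(sample: Path, target: Path, env=os.environ) -> None:
    """Write a rendered copy of `sample` to `target`.

    The sample is only read, so it can live on a read-only ConfigMap mount.
    """
    target.write_text(substitute(sample.read_text(), env))


if __name__ == "__main__":
    render_config(Path("curator.yml.sample"), Path("curator.yml"))
```

After rendering, the container command would hand the new file to curator, typically something like `curator --config curator.yml curator-actions.yml`; check the exact paths against your own image.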

Finally, this is the Dockerfile for the image:

The YAML to apply into the cluster

This is the YAML we are using:

We have two sections here: the ConfigMap and the CronJob.

The ConfigMap

Here we define two data entries: curator.yml.sample and curator-actions.yml.

The first one has the config parameters (including the placeholder values to be replaced).

The second one has the actions curator must take. Note that the close action has the option disable_action: True, because of the caveats quoted earlier about when it is safe to close an index.

Note that the actions are performed on indices that match the following RegExp:

This means that if you create a new index, you must add it explicitly to the RegExp.
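To see why a new index has to be added by hand, here is a small Python sketch of regex-based selection. The pattern and the index names are hypothetical, not the ones from the actions file.

```python
import re

# Hypothetical pattern: only names starting with one of these prefixes match.
INDEX_PATTERN = re.compile(r"^(filebeat|metricbeat)-.*$")


def selected(index_names):
    """Return the index names the pattern would let curator act on."""
    return [name for name in index_names if INDEX_PATTERN.match(name)]


if __name__ == "__main__":
    names = ["filebeat-2020.01.01", "metricbeat-2020.01.01", "heartbeat-2020.01.01"]
    print(selected(names))
```

In this sketch the `heartbeat-` index is never selected, so it would be neither closed nor deleted until `heartbeat` is added to the alternation.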

The CronJob

The CronJob has the following schedule (it can be expressed in other ways; this is the standard cron syntax):

This means it runs at 04:05 (UTC) on the 15th day of every month.
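That description corresponds to a five-field cron expression of the form `5 4 15 * *` (inferred from the description, so check it against your own manifest). As a sanity check, this Python sketch tests whether a given UTC time matches such an expression. It only understands plain numbers and `*`, which is a deliberate simplification.

```python
from datetime import datetime, timezone

SCHEDULE = "5 4 15 * *"  # minute hour day-of-month month day-of-week (inferred)


def matches(schedule: str, when: datetime) -> bool:
    """True if `when` matches a cron expression made only of numbers and '*'.

    Real cron ORs day-of-month and day-of-week when both are restricted;
    this sketch does not model that (or ranges, lists and steps).
    """
    fields = schedule.split()
    # Cron counts Sunday as 0; Python's isoweekday() counts it as 7.
    actual = (when.minute, when.hour, when.day, when.month, when.isoweekday() % 7)
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


if __name__ == "__main__":
    run_time = datetime(2020, 3, 15, 4, 5, tzinfo=timezone.utc)
    print(matches(SCHEDULE, run_time))
```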


Source: juanmatiasdelacamara.wordpress.com