If you haven't already installed the EFK stack, you can do so like this:

helm repo add cryptexlabs https://helm.cryptexlabs.com
helm install my-efk-stack cryptexlabs/efk

Or add it to your Chart.yaml dependencies:

  - name: efk
    version: 7.8.0
    repository:
    condition: efk.enabled

Next, create a ConfigMap, which will also contain your AWS credentials:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-extra-config
data:
  s3.conf: |-
    <match **>
      @type copy
      copy_mode deep
      <store>
        @type s3
        aws_key_id xxx
        aws_sec_key xxx
        s3_bucket "#{ENV['AWS_S3_BUCKET']}"
        s3_region "#{ENV['AWS_REGION']}"
        path "#{ENV['S3_LOGS_BUCKET_PREFIX']}"
        buffer_path /var/log/fluent/s3
        s3_object_key_format %{path}%{time_slice}/cluster-log-%{index}.%{file_extension}
        time_slice_format %Y%m%d-%H
        time_slice_wait 10m
        flush_interval 60s
        buffer_chunk_limit 256m
      </store>
    </match>
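To make the s3_object_key_format setting in this config more concrete, here is a minimal sketch that mimics the substitution in plain shell. The function name s3_object_key is hypothetical, and the time slice, index and extension values are only illustrative (I'm assuming the plugin's default gzip output, which gives a gz extension).

```shell
# Hypothetical helper mimicking the s3_object_key_format from the config:
#   %{path}%{time_slice}/cluster-log-%{index}.%{file_extension}
# $1 = path prefix, $2 = time slice, $3 = chunk index, $4 = file extension
s3_object_key() {
  printf '%s%s/cluster-log-%s.%s\n' "$1" "$2" "$3" "$4"
}

# time_slice_format %Y%m%d-%H gives one slice per hour, e.g. 20200715-13.
# Empty prefix (one bucket per environment):
s3_object_key "" "20200715-13" 0 gz
# Shared bucket with an environment prefix (illustrative value):
s3_object_key "prod/" "20200715-13" 0 gz
```

The first call prints 20200715-13/cluster-log-0.gz and the second prints prod/20200715-13/cluster-log-0.gz. Because the format puts %{path} directly in front of %{time_slice}, a non-empty prefix should probably end with a slash.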

Optionally, create a Secret with your AWS access key ID and secret access key (see below for more info). Don't forget that the values of an Opaque secret must be base64 encoded:

apiVersion: v1
kind: Secret
metadata:
  name: s3-log-archive-secret
type: Opaque
data:
  AWS_ACCESS_KEY_ID: xxx
  AWS_SECRET_ACCESS_KEY: xxx
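Since the values in the Secret have to be base64 encoded, a small helper along these lines can generate them. This is just a sketch: the function name b64 is hypothetical and the credentials are placeholders.

```shell
# Hypothetical helper: base64-encode a value for the data section of a Secret.
# printf is used instead of echo so no trailing newline ends up in the encoded value;
# tr removes any line wrapping that base64 may add to long output.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Placeholder credentials, replace with your own:
b64 'my-access-key-id'; echo
b64 'my-secret-access-key'; echo
```

Alternatively, kubectl create secret generic s3-log-archive-secret with a --from-literal flag for each key should do the encoding for you.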

If you're wondering why I didn't use environment variables for the AWS access key ID and secret key in the config, it's because it doesn't work. If you're using kube2iam or kiam, this doesn't matter. See the documentation for the fluentd S3 plugin to configure it to assume a role instead of using credentials.

These values will allow you to run the S3 plugin with the ConfigMap. Some important things to note:

  • I use antiAffinity of "soft" because I run a single-instance metal cluster.
  • S3_LOGS_BUCKET_PREFIX is empty because I use a separate bucket for each environment, but you could share one bucket across environments and set the prefix to the environment name.
  • You need a Docker image that extends the fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch image and has the S3 plugin installed.
  • If you skipped the step to create a secret for the access key ID and secret key, you can remove the envFrom that imports the secret as environment variables.

efk:
  enabled: true
  elasticsearch:
    antiAffinity: "soft"
  fluentd:
    env:
      - name: FLUENT_ELASTICSEARCH_HOST
        value: "elasticsearch-master"
      - name: FLUENT_ELASTICSEARCH_PORT
        value: "9200"
      - name: AWS_REGION
        value: us-east-1
      - name: AWS_S3_BUCKET
        value: your_bucket_name_goes_here
      - name: S3_LOGS_BUCKET_PREFIX
        value: ""
    envFrom:
      - secretRef:
          name: s3-log-archive-secret
    extraVolumeMounts:
      - name: extra-config
        mountPath: /fluentd/etc/conf.d
    extraVolumes:
      - name: extra-config
        configMap:
          name: fluentd-extra-config
          items:
            - key: s3.conf
              path: s3.conf
    image:
      repository:
      tag: k8s-daemonset-elasticsearch-s3

If you want to build your own Docker image, you can do so like this:

FROM fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
RUN fluent-gem install \
    fluent-plugin-s3
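Building and pushing that image could look roughly like the sketch below. The registry path in IMAGE_REPO is a made-up placeholder and both function names are hypothetical; only the tag is taken from the values used earlier.

```shell
# Assumption: IMAGE_REPO is a registry path you are allowed to push to (placeholder here).
IMAGE_REPO="${IMAGE_REPO:-registry.example.com/your-name/fluentd}"
# Tag matching the fluentd image tag in the Helm values.
IMAGE_TAG="k8s-daemonset-elasticsearch-s3"

# Print the full image reference, repository plus tag.
image_ref() {
  printf '%s:%s\n' "$IMAGE_REPO" "$IMAGE_TAG"
}

# Build the image from the Dockerfile in the current directory, then push it.
build_and_push() {
  docker build -t "$(image_ref)" . &&
    docker push "$(image_ref)"
}
```

Run build_and_push from the directory that contains the Dockerfile, then put the same repository into the image section of your values.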

Next, you probably want to set a retention period for the S3 data. Depending on your requirements, you can either delete it after a certain period of time or move it to Glacier.
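One way to set this up is an S3 lifecycle rule on the bucket. The sketch below only generates the rule as JSON; the function name, the rule ID and the day counts are all illustrative assumptions, so adjust them to your own retention requirements.

```shell
# Hypothetical helper: print an S3 lifecycle configuration as JSON.
# $1 = days before objects move to Glacier, $2 = days before they are deleted.
lifecycle_json() {
  cat <<EOF
{
  "Rules": [
    {
      "ID": "archive-then-expire-logs",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [ { "Days": $1, "StorageClass": "GLACIER" } ],
      "Expiration": { "Days": $2 }
    }
  ]
}
EOF
}

# Example: move to Glacier after 90 days, delete after 365 (illustrative numbers).
lifecycle_json 90 365

# To apply it, save the output to lifecycle.json and use the AWS CLI
# (the bucket name is a placeholder):
#   aws s3api put-bucket-lifecycle-configuration \
#     --bucket your_bucket_name_goes_here \
#     --lifecycle-configuration file://lifecycle.json
```

If you only want deletion, drop the Transitions entry; if you only want archiving, drop Expiration.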

Finally, since we have longer-term retention of our logs in S3, we can safely use Elasticsearch Curator to set a shorter retention period, something like 30 days, for the data that is sent to Elasticsearch.

You can install Curator like so:

helm repo add stable
helm install curator stable/elasticsearch-curator

Or add to your Chart.yaml dependencies:

  - name: elasticsearch-curator
    version: 2.1.5
    repository:


elasticsearch-curator:
  configMaps:
    action_file_yml: |-
      actions:
        1: &delete
          action: delete_indices
          description: "Delete selected indices"
          options:
            ignore_empty_list: True
            continue_if_exception: True
            timeout_override: 300
          filters:
          - filtertype: pattern
            kind: prefix
            value: 'logstash-'
          - filtertype: age
            source: name
            direction: older
            timestring: '%Y-%m-%d'
            unit: days
            unit_count: 30
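To get a feel for which indices these two filters select, here is a rough shell approximation. It is not Curator's actual code, the function name should_delete is hypothetical, it assumes GNU date, and Curator's exact behaviour at the cutoff boundary may differ.

```shell
# Rough approximation of the prefix + age filters in the action file.
# $1 = index name, e.g. logstash-2020-06-01 (matching timestring %Y-%m-%d)
# $2 = reference date (YYYY-MM-DD), $3 = maximum age in days
should_delete() {
  case "$1" in
    logstash-*) ;;        # pattern filter: prefix 'logstash-'
    *) return 1 ;;        # anything else is never selected
  esac
  index_date="${1#logstash-}"
  # age filter with source "name": the date comes from the index name itself
  index_epoch=$(date -u -d "$index_date" +%s) || return 1
  cutoff_epoch=$(( $(date -u -d "$2" +%s) - $3 * 86400 ))
  [ "$index_epoch" -lt "$cutoff_epoch" ]
}

# Example with a fixed reference date so the result is reproducible:
if should_delete logstash-2020-06-01 2020-07-15 30; then
  echo "logstash-2020-06-01 would be deleted"
fi
```

Before letting Curator delete anything for real, it is probably worth running it once with its --dry-run flag to see which indices it would pick.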