
Filebeat Kubernetes Processor and filtering


The conditions need to be a list:

- drop_event.when.or:
    - regexp:
        kubernetes.pod.name: "weave-net.*"
    - regexp:
        kubernetes.pod.name: "external-dns.*"
    - regexp:
        kubernetes.pod.name: "nginx-ingress-controller.*"
    - regexp:
        kubernetes.pod.name: "filebeat.*"

I'm not sure if your order of parameters works. One of my working examples looks like this:

- drop_event:
    when:
      or:
        # Exclude traces from Zipkin
        - contains.path: "/api/v"
        # Exclude Jolokia calls
        - contains.path: "/jolokia/?"
        # Exclude pinging metrics
        - equals.path: "/metrics"
        # Exclude pinging health
        - equals.path: "/health"


This worked for me in Filebeat 6.1.3:

- drop_event.when:
    or:
    - equals:
        kubernetes.container.name: "filebeat"
    - equals:
        kubernetes.container.name: "prometheus-kube-state-metrics"
    - equals:
        kubernetes.container.name: "weave-npc"
    - equals:
        kubernetes.container.name: "nginx-ingress-controller"
    - equals:
        kubernetes.container.name: "weave"


I am using a different approach, which is less efficient in terms of the number of logs that transit through the logging pipeline.

Similarly to what you did, I deployed one instance of Filebeat on my nodes using a DaemonSet. Nothing special here; this is the configuration I am using:

apiVersion: v1
data:
  filebeat.yml: |-
    filebeat.config:
      prospectors:
        # Mounted `filebeat-prospectors` configmap:
        path: ${path.config}/prospectors.d/*.yml
        # Reload prospectors configs as they change:
        reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false
    processors:
      - add_cloud_metadata:
    output.logstash:
      hosts: ['logstash.elk.svc.cluster.local:5044']
kind: ConfigMap
metadata:
  labels:
    k8s-app: filebeat
    kubernetes.io/cluster-service: "true"
  name: filebeat-config

And this one for the prospectors:

apiVersion: v1
data:
  kubernetes.yml: |-
    - type: log
      paths:
        - /var/lib/docker/containers/*/*.log
      json.message_key: log
      json.keys_under_root: true
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
            namespace: ${POD_NAMESPACE}
kind: ConfigMap
metadata:
  labels:
    k8s-app: filebeat
    kubernetes.io/cluster-service: "true"
  name: filebeat-prospectors

The DaemonSet spec:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  labels:
    k8s-app: filebeat
    kubernetes.io/cluster-service: "true"
  name: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
      kubernetes.io/cluster-service: "true"
  template:
    metadata:
      labels:
        k8s-app: filebeat
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - args:
        - -c
        - /etc/filebeat.yml
        - -e
        command:
        - /usr/share/filebeat/filebeat
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: docker.elastic.co/beats/filebeat:6.0.1
        imagePullPolicy: IfNotPresent
        name: filebeat
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        securityContext:
          runAsUser: 0
        volumeMounts:
        - mountPath: /etc/filebeat.yml
          name: config
          readOnly: true
          subPath: filebeat.yml
        - mountPath: /usr/share/filebeat/prospectors.d
          name: prospectors
          readOnly: true
        - mountPath: /usr/share/filebeat/data
          name: data
        - mountPath: /var/lib/docker/containers
          name: varlibdockercontainers
          readOnly: true
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          name: filebeat-config
        name: config
      - hostPath:
          path: /var/lib/docker/containers
          type: ""
        name: varlibdockercontainers
      - configMap:
          defaultMode: 384
          name: filebeat-prospectors
        name: prospectors
      - emptyDir: {}
        name: data

Basically, all logs from all containers get forwarded to Logstash, reachable at the service endpoint logstash.elk.svc.cluster.local:5044 (a Service called "logstash" in the "elk" namespace).
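For completeness, here is a minimal sketch of what that Service could look like. Only the name "logstash", the "elk" namespace, and port 5044 come from the setup above; the selector label is an assumption and has to match whatever labels your Logstash pods actually carry:

apiVersion: v1
kind: Service
metadata:
  name: logstash
  namespace: elk
spec:
  selector:
    k8s-app: logstash   # assumption: must match your Logstash pod labels
  ports:
  - name: beats
    port: 5044
    targetPort: 5044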

For brevity, I'm only going to give you the Logstash configuration (if you need more specific help with Kubernetes, please ask in the comments):

The logstash.yml file is very basic:

http.host: "0.0.0.0"
path.config: /usr/share/logstash/pipeline

It just indicates the directory where I mounted the pipeline config files.
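As an illustration, the pipeline directory could be wired into the Logstash pod roughly like this; the ConfigMap name "logstash-pipeline" and the volume name are hypothetical, not taken from my actual manifests:

# Fragment of a Logstash pod spec (hypothetical names):
        volumeMounts:
        - mountPath: /usr/share/logstash/pipeline
          name: pipeline
          readOnly: true
      volumes:
      - configMap:
          name: logstash-pipeline   # holds 10-beats.conf, 49-filter-logs.conf, 99-output.conf
        name: pipeline

The pipeline config files themselves are the following: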

10-beats.conf: declares an input for Filebeat (port 5044 has to be exposed through a Service called "logstash"):

input {
  beats {
    port => 5044
    ssl => false
  }
}

49-filter-logs.conf: this filter basically drops logs coming from pods that don't have the "elk" label. For the pods that do have the "elk" label, it keeps the logs from the containers named in the pod's "elk" label. For instance, if a pod has two containers, called "nginx" and "python", putting a label "elk" with value "nginx" will keep only the logs coming from the nginx container and drop the python ones. The type of the log is set to the namespace the pod is running in. This might not be a good fit for everybody (you're going to have a single index in Elasticsearch for all logs belonging to a namespace), but it works for me because my logs are homogeneous.

filter {
    if ![kubernetes][labels][elk] {
        drop {}
    }
    if [kubernetes][labels][elk] {
        # check if kubernetes.labels.elk contains this container name
        mutate {
          split => { "kubernetes[labels][elk]" => "." }
        }
        if [kubernetes][container][name] not in [kubernetes][labels][elk] {
          drop {}
        }
        mutate {
          replace => { "@metadata[type]" => "%{kubernetes[namespace]}" }
          remove_field => [ "beat", "host", "kubernetes[labels][elk]", "kubernetes[labels][pod-template-hash]", "kubernetes[namespace]", "kubernetes[pod][name]", "offset", "prospector[type]", "source", "stream", "time" ]
          rename => { "kubernetes[container][name]" => "container" }
          rename => { "kubernetes[labels][app]" => "app" }
        }
    }
}
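To make the labeling concrete, here is a hypothetical pod for the nginx/python example above. Only the "elk" label convention comes from the filter; the pod name and images are illustrative. Note that the filter splits the label value on ".", so elk: "nginx.python" would keep both containers:

apiVersion: v1
kind: Pod
metadata:
  name: webapp            # hypothetical pod
  labels:
    elk: "nginx"          # keep only the nginx container's logs
spec:
  containers:
  - name: nginx
    image: nginx:1.13
  - name: python
    image: python:3.6     # logs from this container get dropped by the filter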

The rest of the configuration is about log parsing and is not relevant in this context. The only other important part is the output:

99-output.conf: sends data to Elasticsearch:

output {
  elasticsearch {
    hosts => ["http://elasticsearch.elk.svc.cluster.local:9200"]
    manage_template => false
    index => "%{[@metadata][type]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

Hope you got the point here.

PROs of this approach

  • Once Filebeat and Logstash are deployed, as long as you don't need to parse a new type of log, you don't need to update the Filebeat or Logstash configuration to get a new log into Kibana. You just need to add a label to the pod template.
  • All logs get dropped by default, unless you explicitly set the label.

CONs of this approach

  • ALL logs from ALL pods go through Filebeat and Logstash, and get dropped only in Logstash. This is a lot of work for Logstash and can be resource-consuming, depending on the number of pods in your cluster (a possible mitigation is sketched below).
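If that becomes a problem, the coarse filtering could be moved to the Filebeat side, in the spirit of the drop_event processors shown in the other answers. An untested sketch, extending the prospector's processors to drop events from pods without the "elk" label (the finer per-container filtering would stay in Logstash):

processors:
  - add_kubernetes_metadata:
      in_cluster: true
      namespace: ${POD_NAMESPACE}
  - drop_event:
      when:
        not:
          has_fields: ['kubernetes.labels.elk']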

I am sure there are better approaches to this problem, but I think this solution is quite handy, at least for my use case.