Kibana - How to extract fields from existing Kubernetes logs


After a few days of research and getting accustomed to the EFK stack, I arrived at an EFK-specific solution, as opposed to the one in Darth_Vader's answer, which is only possible on the ELK stack.

To summarize: I am using Fluentd instead of Logstash, so a grok solution would work only if you also install the Fluentd Grok Plugin, which I decided not to do, because:

As it turns out, Fluentd has its own field extraction functionality through parser filters. To solve the problem in my question, I added the following right before the <match **> line, that is, after the record had already been enriched with the kubernetes metadata fields and labels:

    <filter kubernetes.var.log.containers.webapp-**.log>
      type parser
      key_name log
      reserve_data yes
      format /^(?<ip>[^-]*) - \[(?<datetime>[^\]]*)\] host="(?<hostname>[^"]*)" req="(?<method>[^ ]*) (?<uri>[^ ]*) (?<http_version>[^"]*)" status=(?<status_code>[^ ]*) body_bytes=(?<body_bytes>[^ ]*) referer="(?<referer>[^"]*)" user_agent="(?<user_agent>[^"]*)" time=(?<req_time>[^ ]*)/
    </filter>

To explain:

<filter kubernetes.var.log.containers.webapp-**.log> - apply the block to all records whose tag matches this pattern; in my case the containers of the web server component are called webapp-{something}

type parser - tells fluentd to apply a parser filter

key_name log - apply the pattern only on the log property of the log line, not the whole line, which is a json string

reserve_data yes - very important: if it is not specified, the whole record is replaced by only the properties extracted by format, so any properties you already have, such as the ones added by the kubernetes_metadata filter, are removed

format - a regex that is applied on the value of the log key to extract named properties
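To check the pattern before deploying it, the same expression can be exercised outside Fluentd. The following is a minimal Python sketch, not Fluentd code. Python spells named groups `(?P<name>...)` rather than `(?<name>...)`, so the groups are rewritten accordingly. `SAMPLE` is a made-up record in the shape the format expects, and `parse_record` is a hypothetical helper that imitates `key_name log` with and without `reserve_data`. Returning `None` on a non-match is a choice of this sketch, not a claim about what Fluentd does with unparsable lines.

```python
import re

# Same expression as the Fluentd `format`, with Ruby-style (?<name>...)
# groups rewritten to Python's (?P<name>...) syntax.
LOG_PATTERN = re.compile(
    r'^(?P<ip>[^-]*) - \[(?P<datetime>[^\]]*)\] host="(?P<hostname>[^"]*)" '
    r'req="(?P<method>[^ ]*) (?P<uri>[^ ]*) (?P<http_version>[^"]*)" '
    r'status=(?P<status_code>[^ ]*) body_bytes=(?P<body_bytes>[^ ]*) '
    r'referer="(?P<referer>[^"]*)" user_agent="(?P<user_agent>[^"]*)" '
    r'time=(?P<req_time>[^ ]*)'
)

# Hypothetical record: the `log` key holds the raw line, and `kubernetes`
# stands in for the metadata added earlier in the pipeline.
SAMPLE = {
    "log": '10.0.0.1 - [15/Mar/2017:12:34:56 +0000] host="example.com" '
           'req="GET /index.html HTTP/1.1" status=200 body_bytes=512 '
           'referer="-" user_agent="curl/7.50.0" time=0.004',
    "kubernetes": {"pod_name": "webapp-1234"},
}


def parse_record(record, reserve_data=True):
    """Parse record['log']; return the resulting record, or None on no match."""
    match = LOG_PATTERN.match(record["log"])
    if match is None:
        return None
    fields = match.groupdict()
    # With reserve_data the existing keys are kept and the parsed ones added;
    # without it only the parsed fields survive.
    return {**record, **fields} if reserve_data else fields


print(parse_record(SAMPLE))
print(parse_record(SAMPLE, reserve_data=False))
```

Comparing the two printed records shows why `reserve_data` matters: the second one no longer has the `kubernetes` key.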

Please note that I am using Fluentd 0.12, so this syntax is not fully compatible with the newer 0.14 syntax (which uses @type and a nested <parse> section), but the principle will work with minor tweaks to the parser declaration.


To split a log line into fields, you will probably have to use the grok filter. The idea is to write a regex pattern that matches exactly the part of the log line you need. A grok filter could look something like this:

    grok {
        patterns_dir => ["pathto/patterns"]
        match => { "message" => "^%{LOGTIMESTAMP:logtimestamp}%{GREEDYDATA:data}" }
    }

Here logtimestamp and data are the fields you would see in ES when the log is being indexed. LOGTIMESTAMP should be defined in your patterns file, something like:

    LOGTIMESTAMP %{YEAR}%{MONTHNUM}%{MONTHDAY} %{TIME}
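To make it clearer what a grok expression boils down to, here is a rough Python sketch that expands `%{NAME}` and `%{NAME:field}` references into an ordinary regex with named groups. It is an illustration only, not how Logstash implements grok. The definitions of `YEAR`, `MONTHNUM`, `MONTHDAY`, `TIME` and `GREEDYDATA` below are simplified stand-ins chosen for this example (the stock patterns are more permissive), and the sample line is made up.

```python
import re

# Simplified stand-ins for the stock grok patterns (assumptions of this
# sketch), plus the custom LOGTIMESTAMP pattern built from them.
PATTERNS = {
    "YEAR": r"\d{4}",
    "MONTHNUM": r"(?:0[1-9]|1[0-2])",
    "MONTHDAY": r"(?:0[1-9]|[12]\d|3[01])",
    "TIME": r"\d{2}:\d{2}:\d{2}",
    "GREEDYDATA": r".*",
    "LOGTIMESTAMP": r"%{YEAR}%{MONTHNUM}%{MONTHDAY} %{TIME}",
}

# Matches %{NAME} and %{NAME:field}.
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def expand(expression):
    """Recursively replace grok references with plain regex fragments."""
    def substitute(ref):
        name, field = ref.group(1), ref.group(2)
        body = expand(PATTERNS[name])
        # A reference with a field name becomes a named capture group.
        return f"(?P<{field}>{body})" if field else f"(?:{body})"
    return GROK_REF.sub(substitute, expression)


regex = re.compile(expand("^%{LOGTIMESTAMP:logtimestamp}%{GREEDYDATA:data}"))
fields = regex.match("20170315 12:34:56 GET /health 200").groupdict()
print(fields)
```

The keys of `fields` are `logtimestamp` and `data`, which correspond to the field names that would end up on the indexed document.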

Once you have the matched fields, you can use them for filtering, or simply leave them as they are if the main goal is just to extract the fields from a log line.

    if "something" in [message] {
        mutate {
            add_field => { "new_field" => "%{logtimestamp}" }
        }
    }
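For readers less familiar with Logstash conditionals, the logic of that block can be paraphrased in a few lines of Python. This is only a sketch of the behavior as I understand it: `sprintf` and `add_field_if_contains` are hypothetical helper names, and the event is a made-up example.

```python
import re

# Matches %{field} references inside a template string.
FIELD_REF = re.compile(r"%\{(\w+)\}")


def sprintf(template, event):
    """Fill %{field} references from the event; unknown ones are left as-is."""
    return FIELD_REF.sub(
        lambda ref: str(event.get(ref.group(1), ref.group(0))), template
    )


def add_field_if_contains(event, needle, name, template):
    """Return a copy of event, with `name` added when needle is in message."""
    result = dict(event)
    if needle in event.get("message", ""):
        result[name] = sprintf(template, event)
    return result


event = {
    "message": "20170315 12:34:56 something happened",
    "logtimestamp": "20170315 12:34:56",
}
updated = add_field_if_contains(event, "something", "new_field", "%{logtimestamp}")
print(updated)
```

The point to take away is that `new_field` gets the value of the already extracted `logtimestamp` field, and only on events whose message contains the search string.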

The above is just a sample that you can adapt to suit your needs. You could use this tool to test your patterns against the string you want to match!

Blog post, could be handy! Hope this helps.