Parse logs in fluentd
For anyone having a similar issue, I found a solution that works for me.
In the fluent.conf file, new filter sections are added. For instance, if I want to create a new field called severity, the first step is to extract it with a regex. An example is a bracketed token such as [DEBU] in the log line.
<filter *.*>
  @type record_transformer
  enable_ruby
  <record>
    severity ${record["log"].scan(/\[([^\]]+)\]/).last}
  </record>
</filter>

The matched string is afterwards deleted from the original message:

<filter *.*>
  @type record_transformer
  enable_ruby
  <record>
    log ${record["log"].gsub(/\[([^\]]+)\]/, '')}
  </record>
</filter>
The main part is:
severity ${record["log"].scan(/\[([^\)]+)\]/).last}
Where severity is name of the new field, record["log"] is original log string where string via regex is found and appended to the new field.
log ${record["log"].gsub(/\[([^\)]+)\]/, '')}
This command modifies field log where regex is substitued by empty string - deleted.
NOTE: Order is important, since we first have to copy the value into the new field and then delete the string from the original log message (if needed).
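You can sanity-check the Ruby part in isolation, e.g. in irb (the log line here is made up):

# scan returns an array of capture groups; .last picks the last match.
"2021-01-01 [DEBU] starting worker".scan(/\[([^\]]+)\]/).last
# => ["DEBU"]   (an array; use .flatten.last if you want the plain string "DEBU")

# gsub removes the bracketed token from the message.
"2021-01-01 [DEBU] starting worker".gsub(/\[([^\]]+)\]/, '')
# => "2021-01-01  starting worker"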
First, tag your sources using the tag parameter. Second, in the match section, include your tag key:

include_tag_key true
tag_key fluentd_key

This works for me. The logs will then be categorized by fluentd_key.
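For context, a minimal sketch of what this can look like end to end; the tag myapp.access, the file paths, and the Elasticsearch host and port are assumptions for illustration:

# Tag the source; every line read from this file carries the tag myapp.access.
<source>
  @type tail
  path /var/log/myapp/access.log        # hypothetical path
  pos_file /var/log/td-agent/myapp.pos  # hypothetical path
  tag myapp.access
  <parse>
    @type none
  </parse>
</source>

# Match on the tag and copy it into each record under fluentd_key.
<match myapp.**>
  @type elasticsearch
  host elasticsearch.local              # hypothetical host
  port 9200
  include_tag_key true
  tag_key fluentd_key
</match>

Each document indexed into Elasticsearch then carries a fluentd_key field holding the value myapp.access, which is what lets you categorize the logs.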
We can use the record_transformer filter, as in the config below:
<filter kubernetes.**>
  @type record_transformer
  enable_ruby true
  <record>
    container_name ${record["kubernetes"]["container_name"]}
    namespace ${record["kubernetes"]["namespace_name"]}
    pod ${record["kubernetes"]["pod_name"]}
    host ${record["kubernetes"]["host"]}
  </record>
</filter>
This gives us container_name, namespace, pod, and host as labels/tags, which we can then use further. Below is one sample use case.
<match **>
  @type elasticsearch
  host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
  port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
  logstash_format true
  logstash_prefix ${namespace}_${container_name}
  <buffer tag, container_name, namespace>
    @type file
    path /var/log/${container_name}/app.log
  </buffer>
</match>
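For illustration, this is roughly what the record_transformer filter above does to a record; the field values are made up, and the exact nested shape depends on the kubernetes metadata enrichment:

# Before the filter: metadata is nested under "kubernetes".
{"log": "...", "kubernetes": {"container_name": "api", "namespace_name": "prod", "pod_name": "api-6f9c-xyz", "host": "node-1"}}

# After the filter: flat copies are added at the top level, the nested hash stays.
{"log": "...", "container_name": "api", "namespace": "prod", "pod": "api-6f9c-xyz", "host": "node-1", "kubernetes": {...}}

The flat fields are what make ${namespace} and ${container_name} usable as placeholders in the match section; they can be extracted there because container_name and namespace are listed as chunk keys in the <buffer> block.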