
Parse nginx ingress logs in fluentd


Pipelines work quite differently in Logstash and Fluentd, and it took me some time to build a working Kubernetes -> Fluentd -> Elasticsearch -> Kibana solution.

The short answer to my question is to install the fluent-plugin-parser plugin (I wonder why it doesn't ship in the standard package) and put this rule after the kubernetes_metadata filter:

<filter kubernetes.var.log.containers.nginx-ingress-controller-**.log>
  type parser
  format /^(?<host>[^ ]*) (?<domain>[^ ]*) \[(?<x_forwarded_for>[^\]]*)\] (?<server_port>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+[^\"])(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")? (?<request_length>[^ ]*) (?<request_time>[^ ]*) (?:\[(?<proxy_upstream_name>[^\]]*)\] )?(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*)$/
  time_format %d/%b/%Y:%H:%M:%S %z
  key_name log
  types server_port:integer,code:integer,size:integer,request_length:integer,request_time:float,upstream_response_length:integer,upstream_response_time:float,upstream_status:integer
  reserve_data yes
</filter>
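Since Fluentd regular expressions are Ruby regular expressions, the pattern from the filter above can be checked in plain Ruby before it is deployed. The following is only a minimal sketch: the sample log line is made up for illustration (it is not taken from a real cluster), and the conversions at the end merely imitate what `types` and `time_format` ask the parser to do.

```ruby
require 'time'

# The regexp copied verbatim from the filter above.
NGINX_INGRESS_RE = /^(?<host>[^ ]*) (?<domain>[^ ]*) \[(?<x_forwarded_for>[^\]]*)\] (?<server_port>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+[^\"])(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")? (?<request_length>[^ ]*) (?<request_time>[^ ]*) (?:\[(?<proxy_upstream_name>[^\]]*)\] )?(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*)$/

# Hypothetical access-log line, invented for this example.
line = '10.0.0.1 - [10.0.0.1] - - [12/Mar/2017:10:15:32 +0000] ' \
       '"GET /healthz HTTP/1.1" 200 612 "-" "curl/7.50.0" 81 0.003 ' \
       '[default-web-80] 10.2.3.4:8080 612 0.003 200'

match = NGINX_INGRESS_RE.match(line)
# Hash of capture name => captured string; empty if the line did not match.
fields = match ? match.named_captures : {}

# Imitate "types code:integer,request_time:float" and the time_format option.
code = Integer(fields["code"])
request_time = Float(fields["request_time"])
parsed_time = Time.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")

fields.each { |name, value| puts "#{name}=#{value}" }
```

If `fields` comes back empty for a line from your own controller, the log format probably differs from the one this regexp was written for, and the pattern needs adjusting.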

A longer answer with lots of examples is here: https://github.com/kayrus/elk-kubernetes/


<match fluent.**>
  @type null
</match>

<source>
  @type tail
  path /var/log/containers/nginx*.log
  pos_file /data/fluentd/pos/fluentd-nginxlog1.log.pos
  tag nginxlogs
  format none
  read_from_head true
</source>

<filter nginxlogs>
  @type parser
  format json
  key_name message
</filter>

<filter nginxlogs>
  @type parser
  format /^(?<host>[^ ]*) (?<domain>[^ ]*) \[(?<x_forwarded_for>[^\]]*)\] (?<server_port>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+[^\"])(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) (?:\[(?<proxy_upstream_name>[^\]]*)\] )?(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) \w*$/
  time_format %d/%b/%Y:%H:%M:%S %z
  key_name log
#  types server_port:integer,code:integer,size:integer,request_length:integer,request_time:float,upstream_response_length:integer,upstream_response_time:float,upstream_status:integer
</filter>

<match nginxlogs>
  @type stdout
</match>
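The configuration above parses each line in two stages: the tail source reads the raw line with `format none` (so it lands under the `message` key), the first filter decodes that as JSON, and the second filter applies the regexp to the resulting `log` field. A rough Ruby sketch of the same two stages follows. The raw line, including its trailing `a1b2c3` token and the `stream`/`time` keys, is an invented example of what Docker's json-file log driver might write, not real output.

```ruby
require 'json'

# The regexp copied verbatim from the second filter above.
# Unlike the first answer, referer/agent are mandatory and one
# extra trailing token (\w*) is expected after upstream_status.
NGINX_INGRESS_V2_RE = /^(?<host>[^ ]*) (?<domain>[^ ]*) \[(?<x_forwarded_for>[^\]]*)\] (?<server_port>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+[^\"])(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) (?:\[(?<proxy_upstream_name>[^\]]*)\] )?(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) \w*$/

# Hypothetical container log line (single-quoted so \" and \n stay
# as JSON escapes and are decoded by JSON.parse, not by Ruby).
raw = '{"log":"10.0.0.1 - [10.0.0.1] - - [12/Mar/2017:10:15:32 +0000] ' \
      '\"GET /healthz HTTP/1.1\" 200 612 \"-\" \"curl/7.50.0\" 81 0.003 ' \
      '[default-web-80] 10.2.3.4:8080 612 0.003 200 a1b2c3\n",' \
      '"stream":"stdout","time":"2017-03-12T10:15:32.000000000Z"}'

# Stage 1: what "format json / key_name message" does to the raw text.
record = JSON.parse(raw)

# Stage 2: what the regexp filter does to the "log" field.
match = NGINX_INGRESS_V2_RE.match(record["log"])
fields = match ? match.named_captures : {}

puts fields.inspect
```

Note that this second filter has no `reserve_data`, so in Fluentd the keys from stage 1 (such as `stream`) would be replaced by the regexp captures rather than merged with them.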


You can use the multi-format-parser plugin: https://github.com/repeatedly/fluent-plugin-multi-format-parser

<match>
  format multi_format
  <pattern>
    format json
  </pattern>
  <pattern>
    format regexp...
    time_key timestamp
  </pattern>
  <pattern>
    format none
  </pattern>
</match>
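The idea behind the pattern list above is "try each format in order and keep the first one that parses". Here is a small Ruby sketch of that idea only; it is not the plugin's actual code. `ACCESS_RE` is a hypothetical stand-in for the regexp elided in the config, and `parse_multi` is a name made up for this example.

```ruby
require 'json'

# Hypothetical stand-in for the elided "format regexp..." pattern.
ACCESS_RE = /^(?<host>[^ ]*) (?<code>\d{3})$/

# One callable per <pattern>, in the same order as the config:
# json, then regexp, then none (which always succeeds).
PARSERS = [
  lambda { |text|
    value = JSON.parse(text)
    value.is_a?(Hash) ? value : nil     # only accept JSON objects
  },
  lambda { |text|
    m = ACCESS_RE.match(text)
    m && m.named_captures
  },
  lambda { |text| { "message" => text } } # like "format none"
]

# Return the record from the first parser that accepts the text.
def parse_multi(text)
  PARSERS.each do |parser|
    begin
      record = parser.call(text)
      return record if record
    rescue JSON::ParserError
      next # not JSON, fall through to the next pattern
    end
  end
  nil
end

puts parse_multi('{"level":"info"}').inspect
puts parse_multi('10.0.0.1 200').inspect
puts parse_multi('plain text line').inspect
```

Putting `format none` last matters: it accepts anything, so any pattern listed after it would never be tried.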

Note: I'm curious what the final conf looked like, including the filter parser.