
logstash: how to include input file line number


Are the log files generated by you, or can you change the log structure? If so, you can add a counter as a prefix to each line and parse it out with Logstash.

For example, for a line such as

12345 2018-02-25 22:37:55 [mylibrary] INFO: this is an example log line

your filter must look like this:

filter {
  grok {
    match => { "message" => "%{INT:count} %{GREEDYDATA:message}" }
    overwrite => ["message"]
  }
}

A new field, "count", will be created, which you can then use for your purposes.
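
If the logs come from your own application, the counter can be added at the point where the line is written. Here is a minimal sketch of one way to do that with Python's standard logging module; the LineCounterFilter name and the logger setup are hypothetical illustrations, not anything provided by Logstash or Filebeat:

import itertools
import logging

class LineCounterFilter(logging.Filter):
    """Attach an incrementing counter to every log record (hypothetical helper)."""
    def __init__(self):
        super().__init__()
        self._counter = itertools.count(1)

    def filter(self, record):
        record.count = next(self._counter)  # make %(count)d available to the formatter
        return True

logger = logging.getLogger("mylibrary")
handler = logging.StreamHandler()
# Emit lines shaped like: 12345 2018-02-25 22:37:55 [mylibrary] INFO: message
handler.setFormatter(logging.Formatter(
    "%(count)d %(asctime)s [%(name)s] %(levelname)s: %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S"))
logger.addHandler(handler)
logger.addFilter(LineCounterFilter())
logger.setLevel(logging.INFO)

logger.info("this is an example log line")

Each record then carries its own sequence number, so the grok pattern above can pull it back out as the count field. Note that the counter numbers records, not physical lines, so it only matches line numbers if every record is a single line.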


At the moment, I don't think there is a ready-made solution here. Logstash, Beats, and Kibana are all built around events over time, and that is basically how things are ordered. Line numbers are more of a text-editor kind of functionality.

To a certain degree, Kibana can show you the events in a file. It won't give you a page-by-page list where you can click on a page number, but using time frames you could theoretically step through an entire file.

There are similar enhancement requests open for Beats and Logstash.


First, let me give what is probably the main reason why Filebeat doesn't already have a line number field. When Filebeat resumes reading a file (such as after a restart), it does an fseek to continue from the last recorded offset. If it had to report line numbers, it would either need to store that state in its registry or re-read the file and count newlines up to the offset.

If you want to offer a service that lets users paginate through the logs stored in Elasticsearch, you can use the scroll API with a query for the file. Sort the results by @timestamp and then by offset. Your service would use a scroll query to get the first page of results.

POST /filebeat-*/_search?scroll=1m
{
  "size": 10,
  "query": {
    "match": {
      "source": "/var/log/messages"
    }
  },
  "sort": [
    {
      "@timestamp": {
        "order": "asc"
      }
    },
    {
      "offset": "asc"
    }
  ]
}

Then, to get all subsequent pages, you use the scroll_id returned from the previous response.

POST /_search/scroll
{
    "scroll" : "1m",
    "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBwAAAAAAPXDOFk12OEYw="
}
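
To sketch what such a pagination service could look like, here is a hedged Python example that drives the same two requests with the requests library. The endpoint URL, page size, and file path are assumptions for illustration only:

import requests

ES_URL = "http://localhost:9200"   # assumed Elasticsearch endpoint
PAGE_SIZE = 10

# First page: open a scroll context for one file's events,
# sorted by @timestamp and then by offset.
resp = requests.post(
    f"{ES_URL}/filebeat-*/_search",
    params={"scroll": "1m"},
    json={
        "size": PAGE_SIZE,
        "query": {"match": {"source": "/var/log/messages"}},
        "sort": [{"@timestamp": {"order": "asc"}}, {"offset": "asc"}],
    },
)
page = resp.json()
scroll_id = page["_scroll_id"]
hits = page["hits"]["hits"]

# Subsequent pages: keep passing the scroll_id back until no hits remain.
while hits:
    for hit in hits:
        print(hit["_source"]["message"])
    resp = requests.post(
        f"{ES_URL}/_search/scroll",
        json={"scroll": "1m", "scroll_id": scroll_id},
    )
    page = resp.json()
    scroll_id = page["_scroll_id"]
    hits = page["hits"]["hits"]

Each iteration returns the next PAGE_SIZE events for the file, in the same @timestamp/offset order as the query above.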

This will give you all log data for a given file name, even tracking it across rotations. If line numbers are critical, you could produce them synthetically by counting events starting with the first event that has offset == 0, but I would avoid this because it's very error prone, especially if you ever add any filtering or multiline grouping.
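
For completeness, a rough sketch of that synthetic counting, with the caveats above: it assumes single-line events, sorted as in the scroll query, with nothing filtered out or grouped.

# Hypothetical sketch: derive synthetic line numbers from Filebeat events.
# "hits" is assumed to be the events gathered from the scroll loop above,
# already ordered by @timestamp and offset.
def number_lines(hits):
    line_no = 0
    for hit in hits:
        if hit["_source"]["offset"] == 0:   # start of a new (rotated) file
            line_no = 0
        line_no += 1
        yield line_no, hit["_source"]["message"]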