Log parser/analyzer in Unix


I consider it a major failing that many log formats do not separate columns with proper, unique field separators. Not because that is the best design in itself, but because the basic premise of the Unix textutils is that they operate on tabular data. Instead, log formats tend to use spaces as separators and quote fields that might contain spaces.

One of the most practical simple changes I made to web log analysis was to drop the default NCSA log format produced by the nginx web server and instead use the tab character as the field separator.
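A minimal sketch of what that nginx change might look like. The format name "tsv" and the exact field order are my own choices, picked so that the request ends up in field 4 and the user-agent in field 7, matching the awk examples below; nginx's config parser expands \t inside quoted strings, but a literal tab character works too:

```nginx
# Hypothetical tab-separated access log format named "tsv".
log_format tsv '$remote_addr\t$remote_user\t[$time_local]\t'
               '$request\t$status\t$body_bytes_sent\t'
               '$http_user_agent\t$http_referer';

access_log /var/log/nginx/access.log tsv;
```

Note that fields containing spaces, like the timestamp and the request line, no longer need quoting at all, since only tabs delimit fields.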

Suddenly I could use all of the primitive Unix textutils for quick lookups, but especially awk! For example, print only the lines whose user-agent field contains Googlebot:

awk 'BEGIN {FS="\t"}  $7 ~ /Googlebot/ { print; }' < logfile

Count the number of requests for each unique request line:

awk 'BEGIN {FS="\t"}  { print $4; }' < logfile | sort | uniq -c | sort -n

And of course lots of combinations to find specific visitors.
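As one such combination, here is a self-contained sketch using a made-up four-line sample log (tab-separated, client address in field 1 and request in field 4, as above) that ranks which clients requested a given path most often:

```shell
# Build a tiny, made-up tab-separated sample log: addr, user, time, request, status.
printf '1.2.3.4\t-\t[today]\tGET /index.html\t200\n' >  sample.log
printf '1.2.3.4\t-\t[today]\tGET /index.html\t200\n' >> sample.log
printf '2.3.4.5\t-\t[today]\tGET /index.html\t200\n' >> sample.log
printf '1.2.3.4\t-\t[today]\tGET /about\t200\n'      >> sample.log

# Client addresses that requested /index.html, busiest first:
awk -F'\t' '$4 == "GET /index.html" { print $1 }' sample.log | sort | uniq -c | sort -rn
```

The same shape (filter with awk, then sort | uniq -c | sort) answers most "who did what, how often" questions against a tab-separated log.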


For regular, nightly checking there is logwatch, which has several different scripts in /usr/share/logwatch/scripts/services that check syslog for specific things (web server activity, FTP server activity, sshd-related events, and so on). The default install enables most of them, but you can enable or disable them as you like, or even write your own scripts.
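For example, a one-off report printed to the terminal instead of the nightly mail might look like this (exact option spellings can vary between versions, so check logwatch(8) on your system):

```shell
# Ad-hoc sshd report for today, written to stdout:
logwatch --service sshd --range today --detail high --output stdout
```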

For real-time watching there is multitail.
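For instance, to follow the web server's access and error logs side by side in one terminal (the file paths here are assumptions, adjust to your system):

```shell
multitail /var/log/nginx/access.log /var/log/nginx/error.log
```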


You might want to try out lnav, a curses-based log analyzer. It has most of the features you would expect from a log parser: chronological merging of log messages from multiple log files, support for multiple log formats, highlighting of error and warning messages, hotkeys for jumping between those messages, support for SQL queries, and lots more. Take a look at the project's website for screenshots and a detailed list of features.
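Getting started is as simple as pointing it at your log files (the path below is an assumption):

```shell
# Open all rotated nginx access logs at once; lnav merges them
# chronologically and detects common access-log formats automatically.
lnav /var/log/nginx/access.log*
```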