MongoDB logging


I've seen a lot of companies using MongoDB to store logs. Its schema-free design is really flexible for application logs, whose schema tends to change from time to time. Also, its Capped Collection feature is really useful because it automatically purges old data to keep the data set small enough to fit in memory.
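A capped collection has to be created explicitly with a fixed size. Here is a rough sketch with the Perl driver (the 'saveme' database, 'log' collection, and the 1 GB size are just example values, not anything prescribed by MongoDB); the mongo shell equivalent is db.createCollection with the capped option.

use MongoDB;
use Tie::IxHash;
use boolean;

# Create a 1 GB capped collection; once the size limit is reached,
# MongoDB silently discards the oldest documents.
my $db = MongoDB::Connection->new(host => 'localhost')->get_database('saveme');
$db->run_command(Tie::IxHash->new(
    create => 'log',
    capped => true,
    size   => 1024 * 1024 * 1024,    # maximum size in bytes
));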

People aggregate the logs with normal group queries or MapReduce, but neither is that fast. In particular, MongoDB's MapReduce only runs in a single thread, and its JavaScript execution overhead is huge. The new aggregation framework could solve this problem.
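As a rough sketch (my own example, not from the original post), counting stored log records per level with the 2.2 aggregation framework could look like this; field names match the sample record shown later, and depending on the driver version aggregate() returns either an array reference of documents or a cursor.

# Requires MongoDB 2.2+ and a driver release that exposes aggregate().
my $log = MongoDB::Connection->new(host => 'localhost')->saveme->log;

# Group the log documents by "level" and count them; this runs inside
# mongod without the per-document JavaScript overhead of MapReduce.
my $result = $log->aggregate([
    { '$group' => { _id => '$level', count => { '$sum' => 1 } } },
    { '$sort'  => { count => -1 } },
]);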

Another concern is high write throughput. Although MongoDB's insert is fire-and-forget style by default, issuing a lot of insert commands causes heavy write lock contention. This can hurt application performance and prevent readers from aggregating / filtering the stored logs.

One solution might be to use a log collector framework such as Fluentd, Logstash, or Flume. These daemons are meant to be launched on every application node, and they take the logs from the app processes.

(figure: Fluentd + MongoDB)

They buffer the logs and asynchronously write the data out to other systems like MongoDB / PostgreSQL / etc. The writes are done in batches, so it's a lot more efficient than writing directly from the apps. This link describes how to put logs into Fluentd from a Perl program.
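The usual building block for this is the Fluent::Logger module from CPAN: it posts structured events to the local Fluentd daemon over the forward protocol, and the daemon buffers them and batch-writes them to MongoDB through its mongo output plugin. A minimal sketch (the tag and the fields are made up for illustration):

use Fluent::Logger;

# Post one structured event to the local Fluentd daemon (default forward
# port 24224); Fluentd takes care of buffering and batch-writing it out.
my $logger = Fluent::Logger->new(host => '127.0.0.1', port => 24224);
$logger->post('app.crawler', {
    level   => 'info',
    message => 'Crawler finished',
    counter => 2,
});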


I use it in several applications through Log::Dispatch::MongoDB; it works like a charm!

# Declaration
use Log::Dispatch;
use Log::Dispatch::MongoDB;
use Log::Dispatch::Screen;
use Moose;

has log => (is => 'ro', isa => 'Log::Dispatch', default => sub { Log::Dispatch->new }, lazy => 1);
...

# Configuration
$self->log->add(
    Log::Dispatch::Screen->new(
        min_level   => 'debug',
        name        => 'screen',
        newline     => 1,
    ));
$self->log->add(
    Log::Dispatch::MongoDB->new(
        collection  => MongoDB::Connection->new(
            host    => $self->config->mongodb
        )->saveme->log,
        min_level   => 'debug',
        name        => 'crawler',
    ));
...

# The logging facility
$self->log->log(
    level   => 'info',
    message => 'Crawler finished',
    info    => {
        origin  => $self->origin,
        country => $self->country,
        counter => $self->counter,
        start   => $self->start,
        finish  => time,
    });

And here is a sample record from the capped collection:

{    "_id" : ObjectId("50c453421329307e4f000007"),    "info" : {            "country" : "sa",            "finish" : NumberLong(1355043650),            "origin" : "onedayonly_sa",            "counter" : NumberLong(2),            "start" : NumberLong(1355043646)    },    "level" : "info",    "name" : "crawler",    "message" : "Crawler finished"}


I've done this on a webapp that runs on two app servers. Writes in MongoDB are non-blocking by default (the Java driver just takes the request from you and returns immediately; I assume it's the same for Perl, but you'd better check), which is perfect for this use case since you don't want your users to wait for a log to be recorded.

The downside of this is that in certain failure scenarios you might lose some logs (for example, if your app fails before Mongo gets the data).
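For reference, with the older Perl driver used above (the MongoDB::Connection era) the trade-off can be picked per call; a rough sketch, assuming the same 'saveme.log' collection as in the example:

my $log = MongoDB::Connection->new(host => 'localhost')->saveme->log;

# Default: fire-and-forget. insert() returns immediately and errors are
# not reported, so a crash at the wrong moment can silently drop the entry.
$log->insert({ level => 'info', message => 'Crawler finished' });

# Acknowledged write: { safe => 1 } makes the driver wait for the server's
# confirmation; slower, but the log line is not lost silently.
$log->insert({ level => 'info', message => 'Crawler finished' }, { safe => 1 });

As far as I know, newer driver releases default to acknowledged writes, so it's worth checking the behaviour of the exact driver version you deploy.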