Log viewing utility database choice


My logs are very structured :)

I would say you don't need a database; you need a search engine:

  • Solr: based on Lucene, and it packages everything you need together
  • ElasticSearch: another Lucene-based search engine
  • Sphinx: a nice feature is that one search index can draw on multiple sources, so you can enrich your raw logs with other events
  • Scribe: Facebook's way of collecting and aggregating logs

Update for @JustBob: Most of the solutions mentioned can work with flat files without hurting performance. All of them need an inverted index, which is the hardest part to build and maintain. You can update the index in batch mode or online. The index can be stored in an RDBMS, in a NoSQL store, or in a custom "flat file" storage format (custom meaning maintained by the search engine application itself).
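To make the inverted index idea concrete, here is a minimal in-memory sketch in Python. It is only an illustration, not how Solr, ElasticSearch, or Sphinx implement it; the class name `InvertedIndex`, the tokenizer, and the sample log lines are all made up for the example. It maps each term to the set of log lines containing it and supports both batch and online updates.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: split on anything that is not a letter, digit, or underscore.
TOKEN_RE = re.compile(r"[A-Za-z0-9_]+")


def tokenize(text):
    """Return the lowercased terms found in a log line or query."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


class InvertedIndex:
    """Toy inverted index: term -> set of ids of the lines that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.lines = []  # line id -> raw log line (stands in for the flat file)

    def add(self, line):
        """Online update: index a single new log line as it arrives."""
        line_id = len(self.lines)
        self.lines.append(line)
        for term in tokenize(line):
            self.postings[term].add(line_id)
        return line_id

    def add_batch(self, lines):
        """Batch update: index many log lines in one pass."""
        for line in lines:
            self.add(line)

    def search(self, query):
        """Return the lines that contain every term of the query (AND search)."""
        terms = tokenize(query)
        if not terms:
            return []
        # .get avoids creating empty postings for unknown terms.
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return [self.lines[i] for i in sorted(ids)]


index = InvertedIndex()
index.add_batch([
    "2012-01-01 ERROR disk full on node7",
    "2012-01-01 INFO job started",
])
index.add("2012-01-02 ERROR timeout on node7")
matches = index.search("error node7")
```

A real engine adds the hard parts this sketch skips: persisting the postings to disk, compressing them, merging index segments, and ranking the results.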


You can find a lot of information here:

http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

See which fits your needs.

Anyway, for a task like this NoSQL is the right choice.


You should also consider the learning curve: MongoDB and CouchDB may not perform as well as Cassandra or Hadoop, but they are easier to learn.

MongoDB is used by Craigslist to store old archives: http://www.10gen.com/presentations/mongodb-craigslist-one-year-later