
Is MongoDB a valid alternative to relational db + lucene? [closed]


Technically you can do full-text search with MongoDB, but you're missing out on a lot that a full-text search provider has to offer. I love MongoDB, but I'd couple it with a full-text search provider (such as Lucene or Sphinx) if time to implementation is at all a concern. I think MongoDB's convenient ability to index word arrays is better suited to tagging, and to searching based on tags, than to full-text search.
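To make the tagging case concrete, here is a minimal sketch in plain Python of what an index over a word array gives you. It is a stand-in for the idea, not MongoDB's API; the names `build_tag_index` and `find_by_tags` and the sample documents are made up for illustration.

```python
from collections import defaultdict

def build_tag_index(docs):
    """Map each tag to the set of document ids that carry it."""
    index = defaultdict(set)
    for doc_id, tags in docs.items():
        for tag in tags:
            index[tag.lower()].add(doc_id)
    return index

def find_by_tags(index, tags):
    """Return ids of documents that carry every requested tag (unranked)."""
    sets = [index.get(tag.lower(), set()) for tag in tags]
    return set.intersection(*sets) if sets else set()

# Hypothetical sample data: document id -> list of tags.
docs = {
    1: ["mongodb", "nosql", "search"],
    2: ["lucene", "search"],
    3: ["mongodb", "sharding"],
}
index = build_tag_index(docs)
print(sorted(find_by_tags(index, ["mongodb"])))
print(sorted(find_by_tags(index, ["mongodb", "search"])))
```

Note that the result is just a set of matches. Nothing here says which match is the better one, which is exactly the gap a full-text search provider fills.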

Search (information retrieval) isn't just about grabbing any documents that match. If you want your search results to have any relevance at all, you're going to need something along the lines of TF-IDF, phrase matching (words in sequence score higher), or any number of other IR techniques to improve search precision. If you use MongoDB, you'll need to implement it all from scratch.
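To give a feel for what "from scratch" means, here is a rough sketch of a basic TF-IDF score in plain Python. The whitespace tokenizer, the `+ 1.0` smoothing on the IDF term, and the function names are my own assumptions for illustration; real engines such as Lucene use more refined formulas and proper analyzers.

```python
import math
from collections import Counter

def tokenize(text):
    """Naive tokenizer: lowercase and split on whitespace (assumption)."""
    return text.lower().split()

def tf_idf_scores(query, docs):
    """Score each document against the query with a simple TF-IDF sum."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    query_terms = tokenize(query)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if not tokens or df[term] == 0:
                continue  # empty document or term unseen in the corpus
            idf = math.log(n / df[term]) + 1.0  # smoothed IDF (assumption)
            score += (tf[term] / len(tokens)) * idf
        scores.append(score)
    return scores

# Hypothetical sample corpus.
docs = [
    "mongodb stores documents",
    "lucene ranks documents by relevance",
    "mongodb mongodb sharding",
]
scores = tf_idf_scores("mongodb", docs)
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranked)
```

Even this toy version ignores phrase matching, stemming, and index maintenance on updates, all of which you would also have to build yourself.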

If you really want to implement it all from scratch but not bother with the raw storage side of things, MongoDB is pretty close to the best data store you could build it on top of (I can't think of many others), but that still doesn't make it a great option.


CouchDB seems to be another possible alternative; it can use Lucene via the couchdb-lucene project.


MongoDB is a NoSQL database, while Lucene and Solr are search engines. A third thing to add to the comparison is caches such as Terracotta along with Ehcache. Each has its own purpose.

If you need search with full-text features such as stemming, relevancy settings (for example, ranking a text match in the product title higher than a match in the description), ranking, sound-alike matching, partial-word matching, and so on, all of these are best handled by search-oriented storage systems like Solr and Lucene.
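As a toy illustration of the title-versus-description relevancy setting, the sketch below weights a match in one field more heavily than a match in another. This is not how Solr or Lucene compute scores internally; the weights, the field names, and the `field_weighted_score` helper are hypothetical.

```python
def field_weighted_score(query, product, weights):
    """Sum term matches per field, multiplied by that field's weight."""
    terms = query.lower().split()
    score = 0.0
    for field, weight in weights.items():
        tokens = product.get(field, "").lower().split()
        score += weight * sum(tokens.count(term) for term in terms)
    return score

# Hypothetical weights: a title match counts three times as much.
WEIGHTS = {"title": 3.0, "description": 1.0}

# Hypothetical sample products.
products = [
    {"title": "Sports socks", "description": "Pairs well with running shoes"},
    {"title": "Red running shoes", "description": "Lightweight and durable"},
]

ranked = sorted(
    products,
    key=lambda p: field_weighted_score("running shoes", p, WEIGHTS),
    reverse=True,
)
for product in ranked:
    print(product["title"])
```

A search engine gives you this kind of tuning as configuration, together with stemming and fuzzy matching, rather than as code you have to maintain.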

If your only criterion is faster retrieval and you don't need your presentation data objects to be durable, then simply use a cache like Terracotta.

If you need faster retrieval, also need to collate and aggregate data in one data source, and need that aggregated data to be durable, then use a NoSQL database like MongoDB.