Recommended Setup for BigData Application

solr elasticsearch cassandra hive apache-pig

First let's talk about Cassandra

Cassandra is a NoSQL database with eventual consistency. In practice this means that different nodes in a Cassandra cluster may hold different 'snapshots' of the data when there is a communication or availability problem between nodes. The data will eventually become consistent, however.

Since you are considering it as a 'frontend' database, what you need to understand is how you will model your data. Cassandra can take advantage of indexes, but you still need to define your access patterns up front.
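To make "define your access patterns up front" a bit more concrete, here is a minimal pure-Python sketch of the idea. It is not Cassandra code and does not use any Cassandra driver. It assumes a hypothetical query, "latest events for one user on one day", and therefore groups rows by a partition key of (user_id, day) and keeps them sorted by timestamp inside each partition, which is roughly how a table with a partition key and a clustering column is laid out. The class name, method names, and sample data are all invented for illustration.

```python
from bisect import insort
from collections import defaultdict
from datetime import datetime


class EventsByUserDay:
    """Toy model of a table partitioned by (user_id, day), clustered by time."""

    def __init__(self):
        # partition key -> list of (timestamp, payload), kept sorted by timestamp
        self._partitions = defaultdict(list)

    def insert(self, user_id, ts, payload):
        # The partition key is derived from the query we want to serve later.
        key = (user_id, ts.date())
        insort(self._partitions[key], (ts, payload))

    def latest(self, user_id, day, limit=10):
        """Newest-first events for one user on one day (one partition read)."""
        rows = self._partitions.get((user_id, day), [])
        return [payload for _, payload in reversed(rows[-limit:])]


# Hypothetical sample data.
table = EventsByUserDay()
table.insert("alice", datetime(2024, 1, 1, 9, 0), "login")
table.insert("alice", datetime(2024, 1, 1, 9, 5), "search")
table.insert("bob", datetime(2024, 1, 1, 9, 1), "login")

print(table.latest("alice", datetime(2024, 1, 1).date()))
```

The point of the sketch is that the query shape decides the key: a different question, such as "all logins across all users", would be served poorly by this layout and would normally get its own table.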

Normally there is no relation between Cassandra and Hadoop (except that both are written in Java). However, the DataStax distribution (Enterprise version) offers Hadoop support directly from Cassandra.

As a general workflow, you would read and write the most recent data (say, the last 24 hours) in your 'small' database, which has to be fast (Cassandra handles this very well), and you would move anything older than X (here, older than 24 hours) to 'long-term storage' such as Hadoop, where you can run all sorts of MapReduce jobs and other batch processing.
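The hot/cold split described above can be sketched in a few lines of plain Python. This is only an illustration of the age-based cutoff, not an actual Cassandra-to-Hadoop export job. The function name `split_hot_cold`, the 24-hour default window, and the sample records are assumptions made for the example.

```python
from datetime import datetime, timedelta


def split_hot_cold(records, now, hot_window=timedelta(hours=24)):
    """Partition (timestamp, payload) records into (hot, cold) lists by age.

    Records newer than or equal to `now - hot_window` stay in the fast store;
    everything older is a candidate for long-term storage.
    """
    cutoff = now - hot_window
    hot, cold = [], []
    for ts, payload in records:
        (hot if ts >= cutoff else cold).append((ts, payload))
    return hot, cold


# Hypothetical sample data with a fixed "now" so the result is reproducible.
now = datetime(2024, 1, 2, 12, 0)
records = [
    (datetime(2024, 1, 2, 11, 0), "recent click"),
    (datetime(2024, 1, 1, 13, 0), "click from 23 hours ago"),
    (datetime(2023, 12, 30, 8, 0), "old click"),
]

hot, cold = split_hot_cold(records, now)
print([p for _, p in hot])
print([p for _, p in cold])
```

In a real setup, a periodic job would apply the same cutoff when copying old rows out of the frontend database; a fixed `now` per run keeps the boundary stable for the whole run.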

As for text search, it really depends on what you need. Elasticsearch and Solr are direct competitors. You can see for yourself how they compare here: http://solr-vs-elasticsearch.com/


As for your third question,

I think of Cassandra mainly as a database for storing data.

Hadoop provides the computation model that lets you analyze the large volume of data held in Cassandra, so combining Cassandra with Hadoop is very helpful.
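As a rough illustration of the computation model that Hadoop contributes, here is a tiny in-memory MapReduce-style aggregation in plain Python. It does not use Hadoop or read from Cassandra. The row layout (a `country` field) and the function names are hypothetical, and a real job would run the map and reduce phases in parallel across many machines.

```python
from collections import defaultdict


def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed by the field being counted."""
    for row in rows:
        yield row["country"], 1


def reduce_phase(pairs):
    """Sum the values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Hypothetical rows, standing in for data exported from the database.
rows = [
    {"user": "alice", "country": "DE"},
    {"user": "bob", "country": "US"},
    {"user": "carol", "country": "DE"},
]

counts = reduce_phase(map_phase(rows))
print(counts)
```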

There are other combinations you can consider as well, such as MongoDB with Hadoop, since MongoDB offers a connector between Hadoop and its data.

If you also have search requirements, you can use Solr, with the index generated directly from MongoDB.
