When NOT to use Cassandra? [closed] When NOT to use Cassandra? [closed] database database

When NOT to use Cassandra? [closed]


There is nothing like a silver bullet, everything is built to solve specific problems and has its own pros and cons. It is up to you, what problem statement you have and what is the best fitting solution for that problem.

I will try to answer your questions one by one in the same order you asked them. Since Cassandra is based on the NoSQL family of databases, it's important you understand why use a NoSQL database before I answer your questions.

Why use NoSQL

In the case of RDBMS, making a choice is quite easy because all the databases like MySQL, Oracle, MS SQL, PostgreSQL in this category offer almost the same kind of solutions oriented toward ACID properties. When it comes to NoSQL, the decision becomes difficult because every NoSQL database offers different solutions and you have to understand which one is best suited for your app/system requirements. For example, MongoDB is fit for use cases where your system demands a schema-less document store. HBase might be fit for search engines, analyzing log data, or any place where scanning huge, two-dimensional join-less tables is a requirement. Redis is built to provide In-Memory search for varieties of data structures like trees, queues, linked lists, etc and can be a good fit for making real-time leaderboards, pub-sub kind of system. Similarly there are other databases in this category (Including Cassandra) which are fit for different problem statements. Now lets move to the original questions, and answer them one by one.

When to use Cassandra

Being a part of the NoSQL family, Cassandra offers a solution for problems where one of your requirements is to have a very heavy write system and you want to have a quite responsive reporting system on top of that stored data. Consider the use case of Web analytics where log data is stored for each request and you want to built an analytical platform around it to count hits per hour, by browser, by IP, etc in a real time manner. You can refer to this blog post to understand more about the use cases where Cassandra fits in.

When to Use a RDMS instead of Cassandra

Cassandra is based on a NoSQL database and does not provide ACID and relational data properties. If you have a strong requirement for ACID properties (for example Financial data), Cassandra would not be a fit in that case. Obviously, you can make a workaround for that, however you will end up writing lots of application code to simulate ACID properties and will lose on time to market badly. Also managing that kind of system with Cassandra would be complex and tedious for you.

When not to use Cassandra

I don't think it needs to be answered if the above explanation makes sense.


When evaluating distributed data systems, you have to consider the CAP theorem - you can pick two of the following: consistency, availability, and partition tolerance.

Cassandra is an available, partition-tolerant system that supports eventual consistency. For more information see this blog post I wrote: Visual Guide to NoSQL Systems.


Cassandra is the answer to a particular problem: What do you do when you have so much data that it does not fit on one server ? How do you store all your data on many servers and do not break your bank account and not make your developers insane ? Facebook gets 4 Terabyte of new compressed data EVERY DAY. And this number most likely will grow more than twice within a year.

If you do not have this much data or if you have millions to pay for Enterprise Oracle/DB2 cluster installation and specialists required to set it up and maintain it, then you are fine with SQL database.

However Facebook no longer uses cassandra and now uses MySQL almost exclusively moving the partitioning up in the application stack for faster performance and better control.