Row count of a column family in Cassandra


If you are working on a large data set and are okay with a pretty good approximation, I highly recommend using the command:

nodetool --host <hostname> cfstats

This prints a block of statistics for each column family, looking like this:

Column Family: widgets
SSTable count: 11
Space used (live): 4295810363
Space used (total): 4295810363
Number of Keys (estimate): 9709824
Memtable Columns Count: 99008
Memtable Data Size: 150297312
Memtable Switch Count: 434
Read Count: 9716802
Read Latency: 0.036 ms.
Write Count: 9716806
Write Latency: 0.024 ms.
Pending Tasks: 0
Bloom Filter False Postives: 10428
Bloom Filter False Ratio: 1.00000
Bloom Filter Space Used: 18216448
Compacted row minimum size: 771
Compacted row maximum size: 263210
Compacted row mean size: 1634

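The "Number of Keys (estimate)" row is a good estimate, and reading it is much faster than any explicit counting approach. Note that nodetool reports statistics for the node it connects to, so the figure reflects that node's data rather than an exact cluster-wide total.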
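If you want to pull that number out in a script, here is a minimal sketch in Python. It assumes the cfstats output has already been captured as text and that nodetool prints one "Name: value" field per line. The function name parse_key_estimates and the sample text are made up for illustration.

```python
import re


def parse_key_estimates(cfstats_text):
    """Map each column family name to its 'Number of Keys (estimate)' value."""
    estimates = {}
    current_cf = None
    for raw_line in cfstats_text.splitlines():
        line = raw_line.strip()
        # A "Column Family:" line starts a new block of statistics.
        cf_match = re.match(r"Column Family:\s*(\S+)", line)
        if cf_match:
            current_cf = cf_match.group(1)
            continue
        # Record the key estimate for the block we are currently inside.
        key_match = re.match(r"Number of Keys \(estimate\):\s*(\d+)", line)
        if key_match and current_cf is not None:
            estimates[current_cf] = int(key_match.group(1))
    return estimates


# Illustrative sample only; real output contains many more fields.
sample = """\
Column Family: widgets
SSTable count: 11
Number of Keys (estimate): 9709824
"""

print(parse_key_estimates(sample))
```

Lines that match neither pattern are ignored, so the extra fields in real output should not get in the way.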


If you are using an order-preserving partitioner, you can do this with get_range_slice or get_key_range.
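To show what counting by range looks like, here is a rough sketch in Python. It does not call Cassandra: fetch_page is a hypothetical stand-in for a range query such as get_range_slice, assumed to return up to `count` keys in sorted order starting at (and including) start_key. The in-memory fetcher exists only so the sketch runs on its own.

```python
import bisect


def count_keys_by_paging(fetch_page, page_size=1000):
    """Count keys by walking the key range one page at a time.

    fetch_page(start_key, count) is a hypothetical callable returning up to
    `count` keys >= start_key in sorted order ("" means start at the beginning).
    """
    if page_size < 2:
        # With a page of 1, the repeated start key would never let us advance.
        raise ValueError("page_size must be at least 2")
    total = 0
    start_key = ""
    first_page = True
    while True:
        raw_page = fetch_page(start_key, page_size)
        # The range start is inclusive, so every page after the first begins
        # with the last key of the previous page; skip that repeated key.
        new_keys = raw_page if first_page else raw_page[1:]
        total += len(new_keys)
        if len(raw_page) < page_size:
            break  # a short page means the end of the range was reached
        start_key = raw_page[-1]
        first_page = False
    return total


def make_in_memory_fetcher(keys):
    """Stand-in for the database: serves sorted keys from a Python list."""
    ordered = sorted(keys)

    def fetch_page(start_key, count):
        start = bisect.bisect_left(ordered, start_key)
        return ordered[start:start + count]

    return fetch_page


keys = ["user%04d" % i for i in range(2500)]
print(count_keys_by_paging(make_in_memory_fetcher(keys), page_size=1000))
```

This still touches every key, so on a large data set it will be much slower than reading the estimate.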

If you are not, you will need to store your keys (for example, user ids) in a special row and count the entries in that row.


I found an excellent article on this: http://www.planetcassandra.org/blog/post/counting-keys-in-cassandra

select count(*) from cf limit 1000000

The statement above can be used if an approximate upper bound on the row count is known beforehand. I found this useful in my case.