How consistency works in HBase

Access to row data is atomic and includes any number of columns being read or written to. There is no further guarantee or transactional feature that spans multiple rows or crosses tables. This atomic access is one reason the architecture is strictly consistent: each concurrent reader and writer can make safe assumptions about the state of a row.

When data is updated, it is first written to a commit log, called the write-ahead log (WAL) in HBase, and then stored in the in-memory memstore, which is kept sorted by row key. Once the data in memory exceeds a configured maximum size, it is flushed to disk as an HFile. After the flush, the commit logs can be discarded up to the last unflushed modification.

Thus a lock is needed only to protect the row in RAM.
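The write path described above can be sketched as a toy model. This is a simplified illustration, not the real HBase implementation; all class and method names (`MiniStore`, `put`, `flush`) are hypothetical:

```python
# Toy model of the HBase write path: append to a WAL, apply the edit to an
# in-memory memstore, and flush to an immutable sorted "HFile" once a size
# threshold is exceeded. Illustrative only; not the real HBase API.

class MiniStore:
    def __init__(self, flush_threshold=3):
        self.wal = []                    # commit log: replayed after a crash
        self.memstore = {}               # in-memory edits, sorted on flush
        self.hfiles = []                 # immutable sorted files on "disk"
        self.flush_threshold = flush_threshold

    def put(self, row, value):
        self.wal.append((row, value))    # 1. durably log the edit first
        self.memstore[row] = value       # 2. then apply it in memory
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the memstore out sorted by row key, like an HFile.
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore.clear()
        self.wal.clear()                 # log up to the flush can be discarded

    def get(self, row):
        if row in self.memstore:         # in-memory data is newest
            return self.memstore[row]
        for hfile in reversed(self.hfiles):
            for key, value in hfile:
                if key == row:
                    return value
        return None
```

Note how a read never needs to coordinate with the flush beyond the per-row update in RAM, which is why a row-level lock is sufficient.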


The answer provided by Evgeny is correct but very incomplete.
Contrary to what you wrote, there are many resources, blog articles, and other good material covering this specific aspect. The tricky part is to aggregate the scattered information and make your own synthesis. Consistency is handled at many levels in HBase, and you need to understand those different levels to get a good global picture of how it is managed.
HBase is a complex beast; give it time.

You can start by reading about Read/Write Path, Timeline-consistent High Available Reads, and Region Replication.

https://hbase.apache.org/book.html#arch.timelineconsistent.reads

https://mapr.com/blog/in-depth-look-hbase-architecture/
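To give a feel for the timeline-consistent read mode those links describe, here is a toy model of a primary region replica with an asynchronously updated secondary. This is a rough sketch, not the real HBase client API; the names (`Replicas`, `replicate`, `primary_up`) and the simplified fallback logic are all assumptions:

```python
# Toy model of timeline-consistent reads: the primary replica accepts all
# writes and asynchronously ships them to a secondary. A TIMELINE read may
# be served by the lagging secondary and is then flagged as possibly stale.
# Illustrative only; not the real HBase client API.

STRONG, TIMELINE = "STRONG", "TIMELINE"

class Replicas:
    def __init__(self):
        self.primary = {}
        self.secondary = {}          # lags behind until replicate() runs

    def put(self, row, value):
        self.primary[row] = value    # writes always go to the primary

    def replicate(self):
        # Asynchronous catch-up of the secondary replica.
        self.secondary = dict(self.primary)

    def get(self, row, consistency=STRONG, primary_up=True):
        """Return (value, stale) for a row.

        STRONG reads (the default) are always served by the primary.
        TIMELINE reads fall back to the secondary when the primary is
        unavailable, and such results are marked stale because the
        secondary may not yet have the latest edits.
        """
        if consistency == STRONG or primary_up:
            return self.primary.get(row), False
        return self.secondary.get(row), True
```

The key takeaway matches the HBase design: strong consistency is preserved by default, and applications that opt into TIMELINE reads trade freshness for availability and are told when a result might be stale.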