HBase for real-time application



In general, HBase is chosen based on the following:

Volume: HBase shines at millions to billions of rows; at thousands to a few million rows, a traditional RDBMS is usually the better fit.

Features: choose HBase when you do not need transactions, secondary indexes, or other RDBMS features such as joins.

Hardware: make sure you have sufficient hardware for the region servers; running an HBase cluster involves a good amount of operational maintenance.

More specifically:

HBase is best suited for web applications because of its fast random reads, but that speed only comes with a very good row key design. This means planning your end queries well in advance and designing the row key around them. Special care is needed in row key design if you have time-based data and your queries depend heavily on it; in short, you should avoid hotspotting, where monotonically increasing keys pile all writes onto a single region server.
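To make the hotspotting point concrete, here is a minimal sketch of one common row key scheme, salting plus timestamp reversal. The function name, key layout, and bucket count are my own illustrative choices, not an HBase API; the same idea would normally be implemented in the client that writes the keys.

```python
import hashlib

def salted_row_key(device_id: str, ts_millis: int, buckets: int = 16) -> str:
    """Hypothetical key scheme: salt|device|reversed-timestamp."""
    # A salt bucket derived from the entity id spreads otherwise
    # monotonically increasing keys across region servers.
    salt = int(hashlib.md5(device_id.encode()).hexdigest(), 16) % buckets
    # Reversing the timestamp makes the newest events sort first
    # within one entity, which suits "latest N" scans.
    reversed_ts = 2**63 - 1 - ts_millis
    return f"{salt:02d}|{device_id}|{reversed_ts:019d}"

older = salted_row_key("sensor-1", 1000)
newer = salted_row_key("sensor-1", 2000)
```

Because both keys share the same salt and device prefix, `newer` sorts lexicographically before `older`, so a prefix scan returns the most recent events first, while different devices land in different salt buckets.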

Apart from this, selecting by other column values is possible using HBase filters, but because such queries scan rather than seek, they suit only a few selective use cases and may not guarantee web-app response times.

Also, if your rows have a variable number of columns and your queries do not need every column, HBase is again a good choice: storage is sparse, so absent columns cost nothing.
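The sparse-row model can be pictured as a map of column qualifiers per row; only cells that exist are stored or returned. This is a conceptual sketch with made-up row keys and qualifiers, not the HBase client API:

```python
# Hypothetical data: each HBase row is effectively a sparse map of
# "family:qualifier" -> value; rows need not share a schema.
rows = {
    "user#1001": {"profile:name": "Alice", "profile:email": "a@example.com"},
    "user#1002": {"profile:name": "Bob", "activity:last_login": "2020-01-01"},
}

def get_columns(row_key, qualifiers):
    """Return only the requested qualifiers that actually exist,
    mirroring how a per-column Get skips absent cells."""
    row = rows.get(row_key, {})
    return {q: row[q] for q in qualifiers if q in row}
```

Asking `user#1002` for `profile:email` simply returns nothing for that qualifier, rather than a NULL cell occupying space as it would in a fixed-schema table.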

Region server failover is built into HBase, so after a node failure your data stays safe and its regions are reassigned to other servers.

HBase can be used for both batch and streaming workloads. For streaming, it is among the best sinks available in the big data stack, though this also depends on your streaming pipeline (Kafka, Spark Streaming, Storm, etc.).

Since you mentioned Phoenix, I assume you may want to stick to a SQL view of HBase; that can give you better options, such as secondary indexes and familiar SQL queries. At the core, however, row key design is still at the heart of HBase performance.