Apache Storm compared to Hadoop

What is your opinion on how they compare?

Twitter Storm has been touted as real-time Hadoop. That is more of a marketing line for easy consumption.

They are superficially similar since both are distributed application solutions. Apart from the typical distributed architectural elements, such as master/slave and ZooKeeper-based coordination, to me the comparison falls off a cliff.

Twitter Storm is more like a pipeline for processing data as it arrives. The pipes connect the various computing nodes that receive data, compute, and deliver output. (Their lingo is spouts and bolts.) Extend this analogy to complex pipeline wiring that can be re-engineered when required, and you get Twitter Storm.
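To make the pipeline analogy concrete, here is a minimal sketch in plain Python. It does not use Storm's actual API; the names `sentence_spout`, `split_bolt`, `count_bolt` and `run_topology` are made up for illustration, and everything runs in one process rather than across nodes. The point is only that each item flows through the whole chain as soon as it is emitted.

```python
from collections import Counter

def sentence_spout(sentences):
    """Hypothetical 'spout': a source that emits items one at a time."""
    for sentence in sentences:
        yield sentence

def split_bolt(stream):
    """Hypothetical 'bolt': splits each incoming sentence into words."""
    for sentence in stream:
        for word in sentence.split():
            yield word.lower()

def count_bolt(stream):
    """Hypothetical 'bolt': keeps a running count, emits an update per word."""
    counts = Counter()
    for word in stream:
        counts[word] += 1
        yield word, counts[word]

def run_topology(sentences):
    """Wire spout -> split -> count and collect every emitted update."""
    return list(count_bolt(split_bolt(sentence_spout(sentences))))

if __name__ == "__main__":
    for update in run_topology(["storm is fast", "hadoop is batch"]):
        print(update)
```

Because the stages are generators, an updated count is available after every single word, without waiting for the input to end.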

In a nutshell, it processes data as it arrives, so there is essentially no waiting for a batch to complete.

Hadoop, however, is different in this respect, primarily due to HDFS. It is a solution geared to distributed storage and tolerance of outages at many scales (disks, machines, racks, etc.).

M/R is built to leverage data locality on HDFS to distribute computational jobs. Together, they do not provide a facility for real-time data processing. But that is not always a requirement when you are searching through large data sets (the needle-in-a-haystack analogy).
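For contrast, the batch model can be sketched like this in plain Python. This is not Hadoop's API; `map_phase`, `shuffle`, `reduce_phase` and `run_batch_job` are hypothetical names, and the in-memory list stands in for data already stored on HDFS. The thing to notice is that no result exists until the whole input has been mapped, grouped and reduced.

```python
from collections import defaultdict

def map_phase(records):
    """Emit a (word, 1) pair for every word in every stored record."""
    for line in records:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Group all values by key; needs the complete map output first."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the grouped values to get one final count per word."""
    return {key: sum(values) for key, values in groups.items()}

def run_batch_job(records):
    """Run map -> shuffle -> reduce over the full data set."""
    return reduce_phase(shuffle(map_phase(records)))

if __name__ == "__main__":
    print(run_batch_job(["storm is fast", "hadoop is batch"]))
```

The word counts are the same kind of answer a streaming pipeline would give, but they arrive once, at the end of the job.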

In short, Twitter Storm is a distributed real-time data processing solution. I don't think we should compare them. Twitter built it because it needed a facility to process tweets that are individually small but humongous in number, and in real time.

See HStreaming if you are compelled to compare it with something.

hadoop streaming apache-storm

Basically, both of them are used for analyzing big data, but Storm is used for real-time processing while Hadoop is used for batch processing.

This is a very good introduction to Storm that I found: Click here


Rather than being compared, they are meant to complement each other, giving you both batch and real-time (pseudo-real-time) processing. There is a corresponding video presentation: Ted Dunning on Twitter's Storm.
