Hadoop Basics: What do I do with the output?


At Foursquare I use Hive's Thrift driver to put the data into databases or spreadsheets as needed.

I maintain a job server that executes jobs via the Hive driver and then moves the output wherever it is needed. Using Thrift directly is very easy, and it lets you work from any programming language that has Thrift bindings.
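To give a rough idea of the "run the job, then move the output" step, here is a minimal sketch in Python. It is not the actual job server code. It assumes your Hive client hands you a DB-API-style cursor (with `execute`, `description`, and `fetchall`); the function name `export_query_to_csv` and the table in the usage note are made up for illustration.

```python
import csv


def export_query_to_csv(cursor, query, out):
    """Run `query` on a DB-API-style cursor and write the result to `out` as CSV.

    `cursor` is an assumption: any object with execute(), description and
    fetchall() will do. `out` is any writable text file object.
    Returns the number of data rows written.
    """
    cursor.execute(query)
    writer = csv.writer(out)
    # Each entry in cursor.description describes one column; element 0 is its name.
    writer.writerow([col[0] for col in cursor.description])
    rows_written = 0
    for row in cursor.fetchall():
        writer.writerow(row)
        rows_written += 1
    return rows_written
```

With a cursor from your Hive client you would call something like `export_query_to_csv(cursor, "SELECT venue, checkins FROM daily_checkins", open("out.csv", "w", newline=""))`, and then load the CSV into a spreadsheet or database. For very large results you would want to fetch in batches rather than call `fetchall()`.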

If you're dealing with Hadoop directly (and can't use this), you should check out Sqoop, built by Cloudera.

Sqoop is designed for moving data in batch, whereas Flume is designed for moving it in real time and seems more aligned with putting data into HDFS than taking it out.
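As a sketch of what a batch export with Sqoop might look like when driven from a script, the Python below builds a `sqoop export` command line and runs it. The JDBC URL, table name, paths, and user in the usage note are placeholders, and the helper names are hypothetical. The default field separator assumes the files were written by Hive with its default Ctrl-A delimiter, so adjust it if your data uses tabs or commas.

```python
import subprocess


def build_sqoop_export(jdbc_url, table, export_dir, username, password_file,
                       field_sep="\\001"):
    """Return the argument list for a `sqoop export` of an HDFS directory.

    All values are supplied by the caller. `field_sep` is passed to Sqoop as
    the literal text \\001 (Ctrl-A), which is an assumption about how the
    input files are delimited.
    """
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,            # destination table in the database
        "--export-dir", export_dir,  # HDFS directory holding the job output
        "--input-fields-terminated-by", field_sep,
    ]


def run_sqoop_export(args):
    """Run the command and raise CalledProcessError if Sqoop exits non-zero."""
    subprocess.run(args, check=True)
```

For example, `run_sqoop_export(build_sqoop_export("jdbc:mysql://dbhost/analytics", "daily_checkins", "/user/hive/warehouse/daily_checkins", "etl", "/user/etl/.dbpass"))` would push that directory into the `daily_checkins` table, assuming the table already exists with matching columns.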

Hope that helps.