Why Hadoop is tightly bound to linux? Why Hadoop is tightly bound to linux? hadoop hadoop

Why Hadoop is tightly bound to linux?


Are you aware of the Hadoop work on which Microsoft and Hortonworks are collaborating, essentially committing changes to the Apache project for native Windows support?

The project is still in a preview phase, with Hadoop on Azure being the first part of the rollout. This is Hadoop running on Windows Server 2008R2 in the Windows Azure cloud. It will also be available for installation on premises for building your own clusters.

I'd recommend learning more and signing up for the program, since you'd be recreating what they've already spent man years on.


According to their Quick Start page, it hasn't been well-tested yet.

  • GNU/Linux is supported as a development and production platform. Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
  • Win32 is supported as a development platform. Distributed operation has not been well tested on Win32, so it is not supported as a production platform.

Windows has much better remote management support than most people realize, but it's still tough to beat Linux when it comes to the ease (and price tag) of setting up a large compute farm. This is just a guess, but perhaps it's less likely that researchers who need to build such massive clusters want to put much of their budget toward OS licensing.


Tne validated answer is from 2012.Here are the latest news from the Hadoop Wiki as of 2017

  • Hadoop version 2.2 onwards includes native support for Windows. The official Apache Hadoop releases do not include Windows binaries (yet, as of January 2014). However building a Windows package from the sources is fairly straightforward.