Running Apache Hadoop 2.1.0 on Windows


I followed these steps to install Hadoop 2.2.0.

Steps to build Hadoop bin distribution for Windows

  1. Download and install Microsoft Windows SDK v7.1.

  2. Download and install Unix command-line tool Cygwin.

  3. Download and install Maven 3.1.1.

  4. Download Protocol Buffers 2.5.0 and extract to a folder (say c:\protobuf).

  5. Add the environment variables JAVA_HOME, M2_HOME and Platform if they are not already set. Note: the variable name Platform is case sensitive, and its value must be either x64 or Win32 for building on a 64-bit or 32-bit system respectively. Edit the Path variable to add the bin directory of Cygwin (say C:\cygwin64\bin), the bin directory of Maven (say C:\maven\bin) and the installation path of Protocol Buffers (say c:\protobuf).

  6. Download hadoop-2.2.0-src.tar.gz and extract it to a folder with a short path (say c:\hdfs) to avoid runtime problems due to the maximum path length limitation in Windows.

  7. Select Start --> All Programs --> Microsoft Windows SDK v7.1 and open the Windows SDK 7.1 Command Prompt. Change directory to the Hadoop source code folder (c:\hdfs). Execute mvn package with the options -Pdist,native-win -DskipTests -Dtar to create the Windows binary tar distribution.

  8. If everything goes well in the previous step, then the native distribution hadoop-2.2.0.tar.gz will be created inside the C:\hdfs\hadoop-dist\target\hadoop-2.2.0 directory.
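Steps 5-7 can be sketched as a single session in the Windows SDK 7.1 Command Prompt. This is only an illustration: the JDK path below is a placeholder (the steps above don't give one), and `set` only affects the current session, unlike the persistent environment-variable edits described in step 5.

```cmd
:: Sketch of the build session (placeholder JDK path - adjust to your setup).
set JAVA_HOME=C:\Java\jdk1.7.0_45
set M2_HOME=C:\maven
set Platform=x64
set PATH=%PATH%;C:\cygwin64\bin;C:\maven\bin;c:\protobuf

:: Build the Windows binary tar distribution from the source folder.
cd c:\hdfs
mvn package -Pdist,native-win -DskipTests -Dtar
```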

Install Hadoop

  1. Extract hadoop-2.2.0.tar.gz to a folder (say c:\hadoop).

  2. Add Environment Variable HADOOP_HOME and edit Path Variable to add bin directory of HADOOP_HOME (say C:\hadoop\bin).
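The two install steps above amount to something like the following (a sketch using the example path from the text; `setx` persists a variable to the registry, so open a new Command Prompt afterwards, and note that `setx` can truncate very long Path values):

```cmd
:: Persist HADOOP_HOME and extend Path (example path from above).
setx HADOOP_HOME C:\hadoop
setx PATH "%PATH%;C:\hadoop\bin"
```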

Configure Hadoop

C:\hadoop\etc\hadoop\core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

C:\hadoop\etc\hadoop\hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/hadoop/data/dfs/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/hadoop/data/dfs/datanode</value>
    </property>
</configuration>

C:\hadoop\etc\hadoop\mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

C:\hadoop\etc\hadoop\yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>
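After editing the four files, one quick way to confirm that Hadoop picks them up is `hdfs getconf` (a standard Hadoop CLI subcommand; run it from a new Command Prompt once HADOOP_HOME is set):

```cmd
c:\hadoop\bin>hdfs getconf -confKey fs.defaultFS
hdfs://localhost:9000
```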

Format namenode

The namenode needs to be formatted, but only the first time.

C:\Users\abhijitg>cd c:\hadoop\bin
c:\hadoop\bin>hdfs namenode -format

Start HDFS (Namenode and Datanode)

C:\Users\abhijitg>cd c:\hadoop\sbin
c:\hadoop\sbin>start-dfs

Start MapReduce aka YARN (Resource Manager and Node Manager)

C:\Users\abhijitg>cd c:\hadoop\sbin
c:\hadoop\sbin>start-yarn
starting yarn daemons

In total, four separate Command Prompt windows will open automatically to run the Namenode, Datanode, Resource Manager and Node Manager.
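Once all four daemons are up, a quick smoke test of HDFS from a Command Prompt might look like this (`jps` ships with the JDK and lists running Java processes; the `/test` directory name is just an example):

```cmd
:: List running Java processes - NameNode, DataNode, ResourceManager
:: and NodeManager should all appear (plus Jps itself).
c:\hadoop\bin>jps

:: Create and list a directory in HDFS to confirm the filesystem works.
c:\hadoop\bin>hdfs dfs -mkdir /test
c:\hadoop\bin>hdfs dfs -ls /
```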

Reference: Build, Install, Configure and Run Apache Hadoop 2.2.0 in Microsoft Windows OS


Han has prepared the Hadoop 2.2 Windows x64 binaries (see his blog) and uploaded them to GitHub.

After putting the two binaries winutils.exe and hadoop.dll into the %hadoop_prefix%\bin folder, I got the same UnsatisfiedLinkError.

The problem was that some dependency of hadoop.dll was missing. I used Dependency Walker to check the dependencies of the binaries and the Microsoft Visual C++ 2010 Redistributables were missing.

So besides building all the components yourself, the answer to the problem is:

  • make sure to use the same architecture for Java and the native code. java -version tells you whether your JVM is 32-bit or x64.
  • then use Dependency Walker to make sure all native binaries are pure and of the same architecture. Sometimes an x64 dependency is missing and Windows falls back to x86, which does not work. See the answer to another question.
  • also check if all dependencies of the native binaries are satisfied.
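To check the architectures without Dependency Walker, you can ask the JVM itself and (if Visual Studio is installed) `dumpbin`. Both commands below are standard tools, though the exact output format varies by version:

```cmd
:: Which architecture is the JVM? 64-bit JVMs mention "64-Bit" in the output.
java -version

:: os.arch from the JVM's own properties (amd64 = x64, x86 = 32-bit).
java -XshowSettings:properties -version 2>&1 | findstr os.arch

:: Architecture of a native binary (dumpbin ships with Visual Studio).
:: "8664 machine (x64)" means 64-bit, "14C machine (x86)" means 32-bit.
dumpbin /headers C:\hadoop\bin\hadoop.dll | findstr machine
```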


I had the same problem, but with the recent Hadoop v2.2.0. Here are my steps for solving that problem:

  1. I've built winutils.exe from sources. Project directory:

    hadoop-2.2.0-src\hadoop-common-project\hadoop-common\src\main\winutils

    My OS: Windows 7. Tool for building: MS Visual Studio Express 2013 for Windows Desktop (it's free and can be downloaded from http://www.microsoft.com/visualstudio/). Open Studio, File -> Open -> winutils.sln. Right-click on the solution on the right side -> Build. There were a couple of errors in my case (you might need to fix the project properties and specify the output folder). Voilà! You get winutils.exe - put it into Hadoop's bin.

  2. Next we need to build hadoop.dll. Some voodoo magic goes here: open

    hadoop-2.2.0-src\hadoop-common-project\hadoop-common\src\main\native\native.sln

    in MS VS; right-click on the solution -> Build. I got a bunch of errors. I manually created several missing header files (don't ask me why they are missing from the source tarball!):

    https://github.com/jerishsd/hadoop-experiments/tree/master/sources

    (and don't ask me what this project on git is for! I don't know - Google pointed it out when searching for the header file names.) I've copied

    hadoop-2.2.0-src\hadoop-common-project\hadoop-common\target\winutils\Debug\libwinutils.lib

    (result of step # 1) into

    hadoop-2.2.0-src\hadoop-common-project\hadoop-common\target\bin

    And finally the build operation produces hadoop.dll! Put it into Hadoop's bin as well and happily run the namenode!
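If you prefer not to click through the IDE, the same solutions can also be built from a Visual Studio command prompt with msbuild. This is only a sketch under the same assumptions as the steps above (you may still hit the same missing-header errors, and the Release/x64 settings are an example choice):

```cmd
:: Build winutils.sln from the command line.
cd hadoop-2.2.0-src\hadoop-common-project\hadoop-common\src\main\winutils
msbuild winutils.sln /p:Configuration=Release /p:Platform=x64

:: Then build the native solution the same way.
cd ..\native
msbuild native.sln /p:Configuration=Release /p:Platform=x64
```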

I hope my steps help somebody.