
New MapReduce Architecture and Eclipse


Nourl, wait for https://issues.apache.org/jira/browse/MAPREDUCE-3131 to complete. Anyway, you can check out the revision and try running that.

You will need to run mvn site:site to generate the site, which contains all the documentation. To figure out how, you can open the debug.sh script and see for yourself.
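For example (standard Maven behaviour; the output location is the usual default, so verify it on your checkout):

mvn site:site

The generated documentation then ends up under target/site/ of each module.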

Basically, we are passing the JAVA_OPTIONS and specifying the Eclipse remote-debug parameters. It gets tricky for child processes; for those you need to set the property mapred.child.java.opts.
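As a rough sketch (the exact variable name and ports are illustrative; check debug.sh in your checkout for the real form):

JAVA_OPTIONS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000"

Child (task) JVMs do not inherit this, so for them you would put something like

-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8001

into the mapred.child.java.opts property, and then attach Eclipse via Run -> Debug Configurations -> Remote Java Application on the chosen port.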

HTH

-P


I have been trying to run YARN (the next generation of MapReduce) on my host for several days.

First, get the source code from apache.org using svn or git. Take svn for example:

svn co https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.23.0

Then generate the Eclipse-related files using Maven (you should have Maven 3 configured on your host before this step):

mvn test -DskipTests
mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true

Now you can import the existing Maven projects into Eclipse (you should configure the Maven plugin in Eclipse first).

In Eclipse: File -> Import

Choose "Existing Projects into Workspace"Select the hadoop-mapreduce-project directory as the root directorySelect the hadoop-mapreduce-project projectClick "Finish"

I had to try many times because the classpath/build path was not configured correctly and did not include all the dependency packages/classes. If you run into the same problem, open the project Properties, use "Add External Class Folder", and select the build directory of the current project.


Update: 2012-03-15

I can run YARN (the same as Hadoop 0.23) in Eclipse now.

First, you should build YARN successfully by running:

mvn clean package -Pdist -Dtar -DskipTests

Since I only care about how to debug YARN, I run HDFS on my single host from the Linux terminal, not in Eclipse:

bin/hdfs namenode -format -clusterid your_hdfs_id
sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start datanode

Then import Hadoop 0.23 into Eclipse and find ResourceManager.java; the next step is to run this class in Eclipse. Detailed steps:

  • Right-click the class and select Run As -> Java Application
  • Add a new run configuration for this class; in the arguments section, fill in (see the sketch after this list):

    --config your_yarn_conf_dir (the same as HDFS conf dir)

  • Click the Run button; you will see the ResourceManager output in the Eclipse console.
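As a rough sketch of what the run configuration might look like (the main class name is taken from the 0.23 source tree and is an assumption to verify in your checkout):

Main class:        org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
Program arguments: --config your_yarn_conf_dir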

Running the NodeManager in Eclipse is the same as running the ResourceManager: add a new run configuration, fill in the arguments with "--config your_yarn_conf_dir", then press the Run button.
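Again as a sketch (class name assumed from the source tree; verify it in your checkout):

Main class:        org.apache.hadoop.yarn.server.nodemanager.NodeManager
Program arguments: --config your_yarn_conf_dir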

Happy Coding~!