
New MapReduce Architecture and Eclipse


Nourl, wait for https://issues.apache.org/jira/browse/MAPREDUCE-3131 to complete. Anyway, you can check out the revision and try running that.

You will need to run mvn site:site to generate the site, which contains all the documentation. To figure out how, you can open the debug.sh script and see for yourself.
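For example (standard Maven behaviour; the output location is the usual default, so verify it on your checkout):

mvn site:site

The generated documentation then ends up under target/site/ of each module.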

Basically, we are passing the JAVA_OPTIONS and specifying the Eclipse remote-debug parameters. It gets tricky for child processes; for those you need to set the property mapred.child.java.opts.
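As a rough sketch (the exact variable name and ports are illustrative; check debug.sh in your checkout for the real form):

JAVA_OPTIONS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000"

Child (task) JVMs do not inherit this, so for them you would put something like

-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8001

into the mapred.child.java.opts property, and then attach Eclipse via Run -> Debug Configurations -> Remote Java Application on the chosen port.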

HTH

-P


I have been trying to run YARN (the next generation of MapReduce) on my host for several days.

First, get the source code from apache.org using svn or git. Take svn for example:

svn co https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.23.0

Then generate the Eclipse-related files using Maven (you should have Maven 3 configured on your host before this step):

mvn test -DskipTests
mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true

Now you can import the existing Maven projects into Eclipse (you should configure the Maven plugin in Eclipse first).

In Eclipse: File -> Import

Choose "Existing Projects into Workspace"Select the hadoop-mapreduce-project directory as the root directorySelect the hadoop-mapreduce-project projectClick "Finish"

I had to try many times because the classpath/build path was not configured correctly and did not include all the dependency packages/classes. If you run into the same problem, open the project Properties, use "Add External Class Folder", and select the build directory of the current project.


Update: 2012-03-15

I can run YARN (the same as Hadoop 0.23) in Eclipse now.

First, you should build YARN successfully by running:

mvn clean package -Pdist -Dtar -DskipTests

Since I only care about how to debug YARN, I run HDFS on my single host from the Linux terminal, not in Eclipse:

bin/hdfs namenode -format -clusterid your_hdfs_id
sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start datanode

Then import Hadoop 0.23 into Eclipse and find ResourceManager.java; the next step is to run this class in Eclipse. Detailed steps:

  • Right-click the class and select Run As -> Java Application
  • Add a new run configuration for this class; in the arguments section, fill in (see the sketch after this list):

    --config your_yarn_conf_dir (the same as HDFS conf dir)

  • Click the Run button; you will see the ResourceManager output in the Eclipse console.
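As a rough sketch of what the run configuration might look like (the main class name is taken from the 0.23 source tree and is an assumption to verify in your checkout):

Main class:        org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
Program arguments: --config your_yarn_conf_dir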

Running the NodeManager in Eclipse is the same as running the ResourceManager: add a new run configuration, fill in the arguments with "--config your_yarn_conf_dir", then press the Run button.
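Again as a sketch (class name assumed from the source tree; verify it in your checkout):

Main class:        org.apache.hadoop.yarn.server.nodemanager.NodeManager
Program arguments: --config your_yarn_conf_dir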

Happy Coding~!