How to debug Hadoop MapReduce jobs from Eclipse?

java eclipse debugging hadoop remote-debugging

Edit the launcher script, bin/hadoop (hadoop-env.sh). In it, check which command was invoked, and add the remote-debug options only when the command is jar:

if [ "$COMMAND" = "jar" ] ; then
  exec "$JAVA" -Xdebug -Xrunjdwp:transport=dt_socket,server=y,address=8999 $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
else
  exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
fi
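As a rough sketch of the same idea, the debug flags can be built by a small helper instead of being hard-coded. The function name jdwp_opts is hypothetical (it is not part of Hadoop), and the sketch uses -agentlib:jdwp, the newer spelling of -Xdebug -Xrunjdwp. With suspend=y (the JDWP default when it is omitted) the JVM waits until a debugger attaches.

```shell
#!/bin/sh
# jdwp_opts: hypothetical helper (not part of Hadoop) that prints the JVM
# options for remote debugging over a socket.
#   $1 = port the debugger connects to (defaults to 8999, as in the answer)
#   $2 = y to make the JVM wait for the debugger, n to start immediately
jdwp_opts() {
    port="${1:-8999}"
    suspend="${2:-y}"
    printf '%s' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=${suspend},address=${port}"
}

# Print the options that would replace the legacy flags on the exec line.
echo "$(jdwp_opts 8999 y)"
```

On the Eclipse side, a Remote Java Application debug configuration pointed at the same host and port (8999 here) should then be able to attach. Note that this attaches to the JVM started by hadoop jar, so breakpoints inside map and reduce code are only expected to hit when the tasks run in that same JVM.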

The only way to debug Hadoop from Eclipse is to run Hadoop in local mode. The reason is that each map and reduce task runs in its own JVM, so when Hadoop is not running in local mode, Eclipse cannot debug the tasks.

When you set Hadoop to local mode, the Hadoop file system changes from HDFS (which a cluster installation is configured to use) to file:///. Running hadoop fs -ls is then no longer an HDFS command; it behaves like hadoop fs -ls file:///, listing a path in your local directory. Neither the JobTracker nor the NameNode runs.
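A minimal sketch of what a local-mode configuration could look like, written out by a script. The directory name conf-local is an assumption for illustration. The property names are the Hadoop 1.x ones, matching the JobTracker-era setup described above; newer releases use fs.defaultFS and mapreduce.framework.name=local instead.

```shell
#!/bin/sh
# Write a minimal local-mode configuration into a scratch directory.
# CONF_DIR is an assumed location chosen for this example.
CONF_DIR="${CONF_DIR:-./conf-local}"
mkdir -p "$CONF_DIR"

# Use the local file system instead of HDFS.
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>
</configuration>
EOF

# Run jobs with the in-process local runner instead of a JobTracker.
cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>local</value>
  </property>
</configuration>
EOF

echo "wrote local-mode config to $CONF_DIR"
```

Pointing the launcher at that directory, for example with hadoop --config ./conf-local fs -ls /tmp, should then list the local /tmp rather than an HDFS path.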

These blogposts might help:

Besides the recommended MRUnit, I like to debug with Eclipse as well. I have a main program that instantiates a Configuration and executes the MapReduce job directly, and I debug it with a standard Eclipse Debug configuration. Since I include the Hadoop jars in my Maven (mvn) spec, all of Hadoop is on my class path and I have no need to run against my installed Hadoop. I always test with small data sets in local directories to make things easy. The configuration defaults behave as a standalone Hadoop (the local file system is available).
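One way to get such a small data set is to cut a sample out of a larger input file. The helper below is only a sketch: make_sample, the file names, and the default of 100 lines are all made up for illustration.

```shell
#!/bin/sh
# make_sample: hypothetical helper that copies the first N lines of a file
# into a local input directory for a debug run.
#   $1 = source file, $2 = destination directory, $3 = line count (default 100)
make_sample() {
    src="$1"
    dest_dir="$2"
    lines="${3:-100}"
    mkdir -p "$dest_dir"
    head -n "$lines" "$src" > "$dest_dir/sample.txt"
}

# Demo with a throwaway five-line file; keep only the first three lines.
tmp="$(mktemp -d)"
printf 'line %s\n' 1 2 3 4 5 > "$tmp/big.txt"
make_sample "$tmp/big.txt" "$tmp/input" 3
cat "$tmp/input/sample.txt"
```

The resulting directory can then be passed as the input path argument of the main program in the Eclipse Debug configuration.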
