How to attach a debugger to a remote Hadoop instance
This is explained nicely at LINK
To debug the TaskTracker, follow these steps.
Edit conf/hadoop-env.sh to include the following:
export HADOOP_TASKTRACKER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5000,server=y,suspend=n"
Start Hadoop (bin/start-dfs.sh and bin/start-mapred.sh)
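- With suspend=n the TaskTracker starts normally and listens on port 5000 for a debugger; it blocks waiting for a debug connection only if you set suspend=y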
- Connect to the server using Eclipse's "Remote Java Application" entry under Debug Configurations (host of the TaskTracker, port 5000) and add your breakpoints
- Run a MapReduce job
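
If you switch ports or suspend modes often, the configuration step can be parameterized. The following is only a sketch, assuming conf/hadoop-env.sh is sourced by bash; `jdwp_opts` is a hypothetical helper name and not part of Hadoop.

```shell
# Hypothetical helper (not part of Hadoop): builds the JDWP agent options.
#   $1 = debug port (default 5000)
#   $2 = suspend mode: "n" starts the JVM immediately,
#                      "y" makes the JVM wait until a debugger attaches
jdwp_opts() {
  local port="${1:-5000}"
  local suspend="${2:-n}"
  printf '%s' "-Xdebug -Xrunjdwp:transport=dt_socket,address=${port},server=y,suspend=${suspend}"
}

# Same options as in the steps above, except suspend=y, so the TaskTracker
# JVM does not proceed until a debugger has connected on port 5000.
export HADOOP_TASKTRACKER_OPTS="$(jdwp_opts 5000 y)"
```

Calling `jdwp_opts` with no arguments reproduces the original setting (port 5000, suspend=n). suspend=y is mainly useful when the code you want to step through runs during TaskTracker startup. Otherwise suspend=n lets you attach whenever you like.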
I've never done it that way, as I'd rather my "real" jobs run unhindered by debug overhead (which can, in some circumstances, change the environment conditions anyway). Instead I debug "locally" against a pseudo-instance, where normal debugging in Eclipse is no problem at all, and copy specific files over from the live environment once I've isolated where the problem lies (e.g. by using counters).