How can I debug Hadoop MapReduce? [duplicate]

java debugging logging hadoop mapreduce

Since you are processing big data, the volume of your trace messages can be huge, which can become a problem in itself. It's worth considering alternatives to "System.out.println"-style logging:

Use Counters.
Write logs to HDFS using MultipleOutputs.

The best thing about Counters and MultipleOutputs is that you can access them programmatically. In the case of MultipleOutputs, you can even run a map/reduce job over the logs to extract statistics from them.
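As a rough sketch of the Counters approach: the mapper below counts well-formed and malformed records instead of printing them. The class name, the enum, and the tab-separated "key<TAB>value" input format are all made up for illustration; only the Hadoop calls themselves (Mapper, context.getCounter, Counter.increment) come from the real API.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper: tallies record quality in Counters instead of logging each record.
class RecordAuditMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Hypothetical counter names; Hadoop groups enum counters under the enum's class name.
    public enum RecordQuality { WELL_FORMED, MALFORMED }

    private static final IntWritable ONE = new IntWritable(1);
    private final Text outKey = new Text();

    // Pure helper so the rule can be checked without a cluster.
    // Assumption: a valid record is exactly "key<TAB>value" with a non-empty key.
    static RecordQuality classify(String line) {
        String[] fields = line.split("\t", -1);
        boolean ok = fields.length == 2 && !fields[0].isEmpty();
        return ok ? RecordQuality.WELL_FORMED : RecordQuality.MALFORMED;
    }

    @Override
    protected void map(LongWritable offset, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        RecordQuality quality = classify(line);

        // One increment per record; the framework aggregates the totals across all tasks.
        context.getCounter(quality).increment(1);

        if (quality == RecordQuality.WELL_FORMED) {
            outKey.set(line.substring(0, line.indexOf('\t')));
            context.write(outKey, ONE);
        }
    }
}
```

Once the job has finished, the driver should be able to read the totals with something like job.getCounters().findCounter(RecordAuditMapper.RecordQuality.MALFORMED).getValue(), so a run can fail or warn based on the count rather than on someone reading task logs.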

Another alternative to debugging in a production environment is unit testing: MiniMRCluster will help you test your map-reduce jobs from within unit tests.


I develop my map/reduce code in Eclipse, using Maven to build the runtime jar and to manage dependencies. Once I have Hadoop installed and running on my machine to support HDFS, I can run and debug my code in Eclipse. That means using breakpoints and everything else in the Eclipse debug perspective.
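For what it's worth, here is a minimal sketch of a driver that can be launched from the IDE. It assumes the job goes through Hadoop's local job runner, so that map and reduce tasks execute inside the same JVM the debugger is attached to; the answer above doesn't say which runner it uses. The class name, job name, and helper method are hypothetical. "mapreduce.framework.name" is the standard Hadoop 2+ property for choosing the runner.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver intended to be started with "Debug As > Java Application".
class LocalDebugDriver {

    // Builds a configuration that runs tasks in-process instead of submitting to YARN.
    // The file system setting is left untouched, so input can still come from HDFS.
    static Configuration localDebugConf() {
        Configuration conf = new Configuration();
        conf.set("mapreduce.framework.name", "local");
        return conf;
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(localDebugConf(), "local-debug-run");
        job.setJarByClass(LocalDebugDriver.class);

        // Plug in your own classes here, e.g. job.setMapperClass(...);
        // without them Hadoop falls back to its identity mapper and reducer.

        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not exist yet

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With this setup, a breakpoint inside a map() or reduce() method should be hit like any other breakpoint, because no separate task JVM is spawned.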
