
How to save only non-empty reducers' output in HDFS


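Yes, this is possible. Wrap your output format in LazyOutputFormat, which creates a part file only when the first record is written to it, so reducers that emit nothing leave no file behind. See the "Lazy Output Creation" section of the MapReduce tutorial: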

http://hadoop.apache.org/mapreduce/docs/current/mapred_tutorial.html#Lazy+Output+Creation

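import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Use this instead of job.setOutputFormatClass(TextOutputFormat.class):
// the part file is then created only when the first record is written.
LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);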
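To see why lazy creation leaves no empty part files, here is a minimal plain-Java sketch of the idea. It is not Hadoop's implementation: LazyFileWriter and LazyOutputDemo are hypothetical names made up for this illustration, and the part-r-0000x file names only mimic reducer output. The writer opens its file on the first write, so a "reducer" that emits nothing never creates one.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a lazily created reducer output file (not a Hadoop class). */
class LazyFileWriter implements AutoCloseable {
    private final Path path;
    private BufferedWriter writer; // stays null until the first record arrives

    LazyFileWriter(Path path) {
        this.path = path; // nothing is created on disk here
    }

    void write(String record) throws IOException {
        if (writer == null) {
            // First record: only now is the file actually created.
            writer = Files.newBufferedWriter(path, StandardCharsets.UTF_8);
        }
        writer.write(record);
        writer.newLine();
    }

    @Override
    public void close() throws IOException {
        if (writer != null) {
            writer.close(); // nothing to close if no record was ever written
        }
    }
}

class LazyOutputDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lazy-output-demo");
        Path emptyPart = dir.resolve("part-r-00000");
        Path fullPart = dir.resolve("part-r-00001");

        // A "reducer" that emits no records: the file is never created.
        try (LazyFileWriter w = new LazyFileWriter(emptyPart)) {
        }

        // A "reducer" that emits one record: the file appears on first write.
        try (LazyFileWriter w = new LazyFileWriter(fullPart)) {
            w.write("key\tvalue");
        }

        System.out.println(Files.exists(emptyPart)); // false
        System.out.println(Files.exists(fullPart));  // true
    }
}
```

An eager writer that opened the file in its constructor would leave a zero-byte part-r-00000 behind, which is the behaviour the question is trying to avoid.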


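If you're using the old API (org.apache.hadoop.mapred), the equivalent is that package's own LazyOutputFormat, org.apache.hadoop.mapred.lib.LazyOutputFormat, called with the JobConf. NullOutputFormat is a different tool: it discards everything written through the job's regular output, so use it only when the reducers write their records some other way, for example through MultipleOutputs:

import org.apache.hadoop.mapred.lib.NullOutputFormat;

// Suppresses the default part files entirely, empty or not.
conf.setOutputFormat(NullOutputFormat.class);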