Using Hadoop Counters - Multiple jobs
The classic solution is to put the first job's counter value into the configuration of the subsequent job that needs to access it.
First, make sure the counter is incremented correctly in the counting job's mapper/reducer:
context.getCounter(CountersClass.N_COUNTERS.SOMECOUNT).increment(1);
Then, after the counting job completes:
job.waitForCompletion(true);
Counter someCount = job.getCounters().findCounter(CountersClass.N_COUNTERS.SOMECOUNT);
// Put the counter value into the conf object of the job where you need to access it.
// You can choose any name for the conf key (the counter enum name is used here).
job2.getConfiguration().setLong(CountersClass.N_COUNTERS.SOMECOUNT.name(), someCount.getValue());
The next piece is to access the value in the other job's mapper/reducer. Just override setup(). For example:
private long someCount;

@Override
protected void setup(Context context) throws IOException, InterruptedException {
    super.setup(context);
    this.someCount = context.getConfiguration().getLong(CountersClass.N_COUNTERS.SOMECOUNT.name(), 0);
}
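The round trip above can be sketched in plain Java, using a string map as a stand-in for the Hadoop Configuration (the enum and the hard-coded value here are illustrative; in a real job the value comes from job.getCounters() and you would call setLong/getLong on the actual Configuration):

```java
import java.util.HashMap;
import java.util.Map;

public class CounterHandoff {
    // Stand-in for CountersClass.N_COUNTERS from the snippets above.
    enum N_COUNTERS { SOMECOUNT }

    public static void main(String[] args) {
        // Stand-in for job2.getConfiguration(): Hadoop's Configuration also
        // stores values as strings under string keys.
        Map<String, String> conf = new HashMap<>();

        long someCountValue = 42;  // would come from findCounter(...).getValue()

        // "setLong": the enum's name() is a stable, unique key.
        conf.put(N_COUNTERS.SOMECOUNT.name(), Long.toString(someCountValue));

        // "getLong" with a default of 0, mirroring the setup() override above.
        String raw = conf.get(N_COUNTERS.SOMECOUNT.name());
        long restored = (raw == null) ? 0 : Long.parseLong(raw);

        System.out.println(restored);
    }
}
```

Using the enum's name() as the key just guarantees the producing and consuming jobs agree on the same string without duplicating a literal.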
Alternatively: get the counters at the end of your first job, write their values into a file, and read that file in your subsequent job. Write it to HDFS if you want to read it from a reducer, or to a local file if you will read it and initialize things in the driver code.
Counters counters = job.getCounters();
Counter c1 = counters.findCounter(COUNTER_NAME);
System.out.println(c1.getDisplayName() + ":" + c1.getValue());
Reading and writing files is part of basic tutorials.
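A minimal sketch of the file-based handoff, using java.nio and a local temp file as a stand-in (the file name and value are made up; for the HDFS variant you would use FileSystem.create/FileSystem.open with the same write-then-read pattern):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CounterFileHandoff {
    public static void main(String[] args) throws IOException {
        long counterValue = 1234;  // would come from c1.getValue() after job 1

        // Job 1 side: persist the counter value as text.
        Path handoff = Files.createTempFile("counter-handoff", ".txt");
        Files.writeString(handoff, Long.toString(counterValue));

        // Job 2 side (or driver code): read it back and parse.
        long restored = Long.parseLong(Files.readString(handoff).trim());
        System.out.println(restored);

        Files.delete(handoff);
    }
}
```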