How does RecordReader send data to mapper in Hadoop
When does record reader send data to mapper?
Let me answer by giving you an idea of how the mapper and the RecordReader are related. This is the Hadoop code that sends data to the mapper:
    RecordReader<K1, V1> input;

    // key and value instances are allocated once and reused for every record
    K1 key = input.createKey();
    V1 value = input.createValue();

    while (input.next(key, value)) {
        // map pair to output
        mapper.map(key, value, output, reporter);
        if (incrProcCount) {
            reporter.incrCounter(SkipBadRecords.COUNTER_GROUP,
                                 SkipBadRecords.COUNTER_MAP_PROCESSED_RECORDS, 1);
        }
    }
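Basically, Hadoop keeps calling next until it returns false, and on every call the same key and value objects are filled with the next record. So the RecordReader hands data to the mapper one record at a time, as each record is read. Normally (for example with the default TextInputFormat) key is the byte offset at which the line starts in the file and value is the text of that line.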
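To make the hand-off concrete, here is a minimal sketch of the same loop that runs without Hadoop. Everything in it is a hypothetical stand-in of my own, not the real API: `Holder` plays the role of a Writable, `SimpleRecordReader` mimics only the `next(key, value)` contract, and `LineReader` is a toy reader over an in-memory string. It uses character offsets, which match byte offsets only for ASCII input.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-ins for Hadoop's types; hypothetical, not the real org.apache.hadoop.mapred API.
class RecordReaderDemo {

    /** Mutable holder, playing the role of a Writable such as LongWritable or Text. */
    static final class Holder<T> {
        T value;
    }

    /** Simplified reader: fills key and value in place, returns false at end of input. */
    interface SimpleRecordReader {
        boolean next(Holder<Long> key, Holder<String> value);
    }

    /** Reads '\n'-separated lines from a string; key = offset where the line starts. */
    static final class LineReader implements SimpleRecordReader {
        private final String data;
        private int pos = 0;

        LineReader(String data) {
            this.data = data;
        }

        @Override
        public boolean next(Holder<Long> key, Holder<String> value) {
            if (pos >= data.length()) {
                return false;                // no more records
            }
            int end = data.indexOf('\n', pos);
            if (end < 0) {
                end = data.length();         // last line has no trailing newline
            }
            key.value = (long) pos;          // offset where this line starts
            value.value = data.substring(pos, end);
            pos = end + 1;                   // skip past the newline
            return true;
        }
    }

    /** The driver loop: same shape as the Hadoop snippet, minus the counters. */
    static List<String> run(SimpleRecordReader input) {
        List<String> seen = new ArrayList<>();
        Holder<Long> key = new Holder<>();   // allocated once, reused for every record
        Holder<String> value = new Holder<>();
        while (input.next(key, value)) {
            // stands in for mapper.map(key, value, output, reporter)
            seen.add(key.value + ":" + value.value);
        }
        return seen;
    }

    public static void main(String[] args) {
        // prints [0:foo, 4:bar baz, 12:qux]
        System.out.println(run(new LineReader("foo\nbar baz\nqux")));
    }
}
```

Note that the "mapper" sees each record as soon as `next` has filled the holders. Nothing is buffered between the reader and the map call, and the holders are overwritten on the next iteration, which is why a mapper that wants to keep a key or value around has to copy it.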
Where is the code that sends the data to mapper?
That code is in the Hadoop source code (probably in the MapContextImpl class), and it resembles what I have written in the code snippet.
EDIT: The source code is in MapRunner (the run method of org.apache.hadoop.mapred.MapRunner).