Hadoop seems to modify my key object during an iteration over values of a given reduce call


This is expected behavior (with the new API at least).

When the next method of the iterator underlying the values Iterable is called, the next key/value pair is read from the sorted mapper / combiner output, and Hadoop checks that the new key still belongs to the same group as the previous key.

Because Hadoop reuses the objects passed to the reduce method (it simply calls the readFields method on the same instance), the underlying contents of the key parameter 'k' change with each iteration over the values Iterable.
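
To make the effect concrete, here is a minimal stand-alone sketch that imitates the reuse pattern without any Hadoop classes. Everything in it (KeyReuseDemo, MutableKey, ReusingValues, the sample keys "a#1" to "a#3") is hypothetical and only stands in for a Writable key and the framework's value iterator. It is also simplified: the real framework loads the first key before reduce is called, whereas this iterator loads a key on every call to next.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Stand-alone simulation of object reuse; this is NOT Hadoop's actual code.
class KeyReuseDemo {

    // Hypothetical mutable key, standing in for a Writable key type.
    static final class MutableKey {
        private String value;

        // Plays the role of readFields: overwrites this instance in place.
        void readFields(String serialized) {
            this.value = serialized;
        }

        // Returns an independent snapshot of the current contents.
        MutableKey copy() {
            MutableKey c = new MutableKey();
            c.value = this.value;
            return c;
        }

        @Override
        public String toString() {
            return value;
        }
    }

    // Value iterator that refills one shared key instance on every next().
    static final class ReusingValues implements Iterator<Integer> {
        private final List<String> keys;
        private final List<Integer> values;
        private final MutableKey sharedKey;
        private int pos = 0;

        ReusingValues(List<String> keys, List<Integer> values, MutableKey sharedKey) {
            this.keys = keys;
            this.values = values;
            this.sharedKey = sharedKey;
        }

        @Override
        public boolean hasNext() {
            return pos < keys.size();
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            // The same key object is overwritten, as readFields would do.
            sharedKey.readFields(keys.get(pos));
            return values.get(pos++);
        }
    }

    // Iterates one "group" and remembers the key seen at each step,
    // either by reference (copyKeys == false) or as a copy (copyKeys == true).
    static List<String> observe(boolean copyKeys) {
        List<String> keys = Arrays.asList("a#1", "a#2", "a#3");
        List<Integer> values = Arrays.asList(10, 20, 30);
        MutableKey k = new MutableKey();
        Iterator<Integer> it = new ReusingValues(keys, values, k);

        List<MutableKey> saved = new ArrayList<>();
        while (it.hasNext()) {
            it.next();
            saved.add(copyKeys ? k.copy() : k);
        }

        List<String> out = new ArrayList<>();
        for (MutableKey key : saved) {
            out.add(key.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("references: " + observe(false)); // [a#3, a#3, a#3]
        System.out.println("copies:     " + observe(true));  // [a#1, a#2, a#3]
    }
}
```

Saving the reference stores the same object three times, so every entry ends up showing the last key read. In a real reducer the equivalent fix is to copy the key before keeping it; for a Text key, for example, the copy constructor new Text(k) should do, though which copy mechanism fits depends on your key type.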