Reducers stopped working at 66.68% while running HIVE Join query


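At 66%, reducers start doing the actual reduce (0-33% is the shuffle, 33-66% is the sort), so a job stuck at 66.68% has only just begun reducing. In a Hive join, the reducer computes, for each join key, a Cartesian product of the matching rows from the data sets being joined.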

I'm going to guess that there is at least one foreign key that appears frequently in all of the data sets. Watch for NULL and default values in the join columns.

For example, in a join, imagine the key "abc" appears ten times in each of the six tables. That's 10^6, or a million, output records for that one key. If "abc" appears 1000 times in each of three tables and twice in each of the other three, you get 8 billion records (1000^3 * 2^3). You can see how this gets out of hand. I'm guessing there is at least one key that is resulting in a massive number of output records.
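The arithmetic above can be checked with a small sketch. This is not Hive code; `join_fanout` is a hypothetical helper that takes each table's row count per key (assumed to be known already) and multiplies the counts for every key present in all tables, which is how many rows an inner join on that key would emit.

```python
from math import prod

def join_fanout(per_table_counts):
    """Rows an inner join emits per key, given each table's key -> row count."""
    # Only keys present in every table survive an inner join.
    shared = set(per_table_counts[0])
    for counts in per_table_counts[1:]:
        shared &= set(counts)
    # The output for a key is the product of its row counts across tables.
    return {key: prod(counts[key] for counts in per_table_counts) for key in shared}

# "abc" ten times in each of six tables.
even = [{"abc": 10}] * 6
# "abc" 1000 times in three tables, twice in the other three.
skewed = [{"abc": 1000}] * 3 + [{"abc": 2}] * 3

print(join_fanout(even))    # {'abc': 1000000}
print(join_fanout(skewed))  # {'abc': 8000000000}
```

Running something like this against real per-key counts from each table should point at the key (or handful of keys) responsible for most of the output.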

This is something to avoid in any RDBMS, not just in Hive. Doing multiple inner joins between many-to-many relationships can get you in a lot of trouble.


To debug this now, and in the future, you can use the JobTracker to find and examine the logs for the reducer(s) in question. You can then instrument the reduce operation to get a better handle on what's going on. Be careful you don't blow it up with logging, of course! Try looking at the number of records input to the reduce operation, for example.
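As a rough illustration of counting reduce input without flooding the logs, here is a minimal sketch. It is not Hive's or Hadoop's reducer API; `heaviest_keys` is a hypothetical helper that assumes the reduce input can be seen as (key, value) pairs, and it reports only a short summary of the largest groups instead of logging every record.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter

def heaviest_keys(records, top_n=5):
    """Count input records per key and return the top_n largest groups.

    records: iterable of (key, value) pairs, normally sorted by key
    the way reducer input is; unsorted input is still counted correctly.
    """
    counts = Counter()
    for key, group in groupby(records, key=itemgetter(0)):
        # Count the group's records without keeping them in memory.
        counts[key] += sum(1 for _ in group)
    return counts.most_common(top_n)

# Made-up sample: one very hot key and one small one.
sample = [("abc", i) for i in range(1000)] + [("xyz", 0), ("xyz", 1)]
print(heaviest_keys(sample, top_n=2))  # [('abc', 1000), ('xyz', 2)]
```

A single summary line per reducer like this keeps the log volume constant no matter how many records the reducer receives.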