Apache Hive Not Returning YARN Application Results Correctly


Someone on the Apache Hive mailing list suggested that the YARN container was writing its result files to the local filesystem of the machine it ran on instead of to HDFS. I did some digging in the source code and found that the setting:

mapreduce.framework.name=local

which is the default in Hadoop 3.2.1, was causing the problem.

Solved by setting it to yarn in the Hive session:

set mapreduce.framework.name=yarn;
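To confirm the setting took effect, it can help to print the value before and after changing it. The following is only a sketch of such a session. The table name `my_table` is a placeholder and not from the original question.

```sql
-- Print the current value; with stock defaults this is expected to show "local".
set mapreduce.framework.name;

-- Switch job execution to YARN for this session only.
set mapreduce.framework.name=yarn;

-- Print it again to confirm the change.
set mapreduce.framework.name;

-- Re-run the query that returned wrong results (my_table is a hypothetical name).
SELECT COUNT(*) FROM my_table;
```

A `set` issued this way lasts only for the current session. To make it permanent, the same property would normally go into mapred-site.xml on the cluster.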


How did you insert the data into the Hive table? To optimize performance, Hive can answer count(*) from statistics stored in the metastore instead of running a count job. Try running MSCK REPAIR TABLE on this table first, so that Hive learns about the new external files and updates the metastore accordingly.
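The suggestion above might look like the following in practice. This is a sketch under assumptions: `my_table` is a placeholder name, and MSCK REPAIR TABLE is mainly relevant when the table is partitioned and files were added outside of Hive.

```sql
-- Register partitions that exist on the filesystem but not in the metastore.
MSCK REPAIR TABLE my_table;

-- Optionally stop Hive from answering COUNT(*) from metastore statistics,
-- so that the count is computed by an actual job.
SET hive.compute.query.using.stats=false;

SELECT COUNT(*) FROM my_table;

-- Alternatively, refresh the stored statistics so they match the data.
ANALYZE TABLE my_table COMPUTE STATISTICS;
```

If the count changes once `hive.compute.query.using.stats` is disabled, stale statistics were the likely cause, not the execution framework.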