Integration testing Hive jobs
Ideally one would be able to test Hive queries with LocalJobRunner
rather than resorting to mini-cluster testing. However, due to HIVE-3816, running Hive with mapred.job.tracker=local
results in a call to the hive CLI executable installed on the system (as described in your question).
Until HIVE-3816 is resolved, mini-cluster testing is the only option. Below is a minimal mini-cluster setup for Hive tests that I have tested against CDH 4.4.
    Configuration conf = new Configuration();

    /* Build MiniDFSCluster */
    MiniDFSCluster miniDFS = new MiniDFSCluster.Builder(conf).build();

    /* Build MiniMR Cluster */
    System.setProperty("hadoop.log.dir", "/path/to/hadoop/log/dir"); // MAPREDUCE-2785
    int numTaskTrackers = 1;
    int numTaskTrackerDirectories = 1;
    String[] racks = null;
    String[] hosts = null;
    miniMR = new MiniMRCluster(numTaskTrackers, miniDFS.getFileSystem().getUri().toString(),
                               numTaskTrackerDirectories, racks, hosts, new JobConf(conf));

    /* Set JobTracker URI */
    System.setProperty("mapred.job.tracker", miniMR.createJobConf(new JobConf(conf)).get("mapred.job.tracker"));
There is no need to run a separate HiveServer or HiveServer2 process for testing. You can test with an embedded HiveServer2 instance by setting your JDBC connection URL to jdbc:hive2:/// (no host or port).
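As a rough sketch of what connecting in embedded mode might look like: the class and method names below (EmbeddedHiveExample, showTables) are made up for illustration, the empty user name and password are an assumption, and the code only works with the hive-jdbc driver and its Hive dependencies on the test classpath. The explicit Class.forName call is there because older drivers may not register themselves automatically.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

class EmbeddedHiveExample {
    // A hive2 URL with no host or port selects the embedded mode described above.
    static final String EMBEDDED_URL = "jdbc:hive2:///";

    // Hypothetical helper: lists the tables visible to the embedded instance.
    static List<String> showTables() throws SQLException, ClassNotFoundException {
        // Register the HiveServer2 JDBC driver explicitly (needs hive-jdbc on the classpath).
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        List<String> tables = new ArrayList<>();
        // Assumption: empty user name and password are acceptable for a local test.
        try (Connection conn = DriverManager.getConnection(EMBEDDED_URL, "", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                tables.add(rs.getString(1));
            }
        }
        return tables;
    }
}
```

Call showTables() only after the mapred.job.tracker system property has been set, so that any query that launches a job is submitted to the mini cluster.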
I have implemented HiveRunner.
https://github.com/klarna/HiveRunner
We tested it on Mac without problems but had some trouble on Windows. With the few changes listed below, the utility served us well.
These are the changes we made to get HiveRunner working in a Windows environment. After these changes, unit testing is possible for all Hive queries.
1. Clone the project at https://github.com/steveloughran/winutils to anywhere on your computer. Add a new environment variable, HADOOP_HOME, pointing to the /bin directory of that folder. No forward slashes or spaces allowed.
2. Clone the project at https://github.com/sakserv/hadoop-mini-clusters to anywhere on your computer. Add a new environment variable, HADOOP_WINDOWS_LIBS, pointing to the /lib directory of that folder. Again, no forward slashes or spaces allowed.
3. I also installed Cygwin, assuming several Linux utilities for Windows might be available through it.
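The "no forward slashes or spaces" rule is easy to get wrong, so it can help to check the two variables before running the tests. This is only a sketch: the class HadoopEnvCheck and its problems method are made up for illustration and are not part of HiveRunner, and the code checks nothing beyond the two rules stated in the steps above.

```java
import java.util.ArrayList;
import java.util.List;

class HadoopEnvCheck {
    // Hypothetical helper: returns the problems found for one variable.
    // An empty list means the value satisfies the two rules from the steps above.
    static List<String> problems(String name, String value) {
        List<String> found = new ArrayList<>();
        if (value == null || value.isEmpty()) {
            found.add(name + " is not set");
            return found;
        }
        if (value.contains("/")) {
            found.add(name + " contains a forward slash");
        }
        if (value.contains(" ")) {
            found.add(name + " contains a space");
        }
        return found;
    }
}
```

For example, HadoopEnvCheck.problems("HADOOP_HOME", System.getenv("HADOOP_HOME")) could be called from a test setup method, failing fast if the returned list is not empty.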
This pull request on GitHub helped with making it work on Windows: https://github.com/klarna/HiveRunner/pull/63