Unable to run UDF on Hive server
After playing with this for a while, I got it to work by putting the .class file into a directory structure that matches its package, and adding it to the .jar from there. For reference I've included the whole process, including compiling and loading into Hive.
- I used the UDF example here.
- Compile it:
javac -classpath $CLASSPATH Lower.java
Note: the CLASSPATH was defined like this, as described here:
CLASSPATH=$(ls $HIVE_HOME/lib/hive-serde-*.jar):$(ls $HIVE_HOME/lib/hive-exec-*.jar):$(ls $HADOOP_HOME/hadoop-core-*.jar)
- Copy the .class file to a folder com/example/hive/udf/ that matches its package.
- Add it to a jar with this command:
jar -cf lower.jar com/example/hive/udf/Lower.class
- Verify that the package looks right:
jar -tf lower.jar
You should see a line like this: com/example/hive/udf/Lower.class
- Import the jar into Hive:
add jar lower.jar;
create temporary function my_lower as 'com.example.hive.udf.Lower';
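For completeness, the "UDF example here" is presumably the classic Lower UDF from the Hive wiki; a minimal sketch, using the package and class names assumed by the steps above (it needs the hive-exec and hadoop-core jars from the CLASSPATH above to compile):

```java
package com.example.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Simple one-argument UDF: lowercases a string column.
public final class Lower extends UDF {
  public Text evaluate(final Text s) {
    if (s == null) {
      return null; // propagate SQL NULL
    }
    return new Text(s.toString().toLowerCase());
  }
}
```

Once registered as shown above, it can be called in a query as my_lower(col).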
I fixed the problem using the following steps:
1) Place each UDF jar in /usr/lib/hive/auxlib
2) Specify the path to each jar in hive-site.xml for the hive.aux.jars.path property (ex: file:///usr/lib/hive/auxlib/jar1.jar,file:///usr/lib/hive/auxlib/jar2.jar)
3) Create a script that makes a Thrift request to the Hive server, running create temporary function func_name as 'com.test.udf.ClassName' for each UDF, after the Hive server is started
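As a sketch of step 3, such a script can generate one registration statement per UDF and hand them to whatever Hive client you use (the original setup sent them over Thrift). The UDF names and classes below are illustrative assumptions, not from the original post:

```shell
#!/bin/sh
# Hypothetical name=class pairs; replace with your own UDFs.
UDFS="my_lower=com.test.udf.ClassName
my_upper=com.test.udf.Upper"

# Emit one registration statement per UDF; pipe the output to your
# Hive client once the server is up.
echo "$UDFS" | while IFS='=' read -r name class; do
  echo "create temporary function $name as '$class';"
done
```

Running it prints the create temporary function statements, one per line, ready to be executed against the server.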
Edit: for Hive 0.9, no matter what I did, HiveServer couldn't find jars in the auxlib directory. To get this working on HiveServer 0.9, I had to put the jar directly into a directory on Hive's classpath.
You can also do this by passing the --auxpath option to the hive command:
hive --auxpath /path-to-/hive-examples.jar
or by setting the HIVE_AUX_JARS_PATH environment variable.
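Both routes can be written out as follows; the jar path is illustrative, and setting the variable in conf/hive-env.sh makes it persistent across sessions:

```shell
# Option 1: one-off, per invocation
hive --auxpath /usr/lib/hive/auxlib/hive-examples.jar

# Option 2: via the environment (e.g. exported in conf/hive-env.sh)
export HIVE_AUX_JARS_PATH=/usr/lib/hive/auxlib/hive-examples.jar
hive
```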