Elastic Map Reduce External Jars


The approach that has worked best for me with external jar dependencies is to copy them, via a bootstrap action, to /home/hadoop/lib on every node in the cluster. That path is on the classpath of every host. It is the only technique I have found that works regardless of where the code that uses the external jars runs (tool, job, or task).
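As a rough illustration of the copy step, here is a minimal Python sketch of what a bootstrap action script might do on each node. It assumes the jars have already been fetched from S3 into a local staging directory; the function name `install_jars` and the staging directory are hypothetical, and only the default target path comes from the answer above.

```python
import shutil
from pathlib import Path


def install_jars(staging_dir, lib_dir="/home/hadoop/lib"):
    """Copy every *.jar in staging_dir into lib_dir; return the copied file names.

    staging_dir is a hypothetical local directory that already holds the jars
    (for example, downloaded from S3 earlier in the same bootstrap action).
    """
    staging = Path(staging_dir)
    target = Path(lib_dir)
    # Create the target directory if it is missing on this node.
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for jar in sorted(staging.glob("*.jar")):
        # Non-jar files in the staging directory are left alone.
        shutil.copy2(jar, target / jar.name)
        copied.append(jar.name)
    return copied
```

Because a bootstrap action runs on every node as it starts, a script along these lines would put the same jars on the same path across the cluster.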


One option is to have the first step in your job flow copy the JARs to wherever they need to be. Alternatively, if they are dependencies of your application, you can package them inside your application JAR (which is probably stored in S3).
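For the packaging option, a build tool would normally do this, but the idea can be sketched in a few lines of Python. The sketch assumes the usual Hadoop convention that jars placed under lib/ inside the job JAR are added to the classpath; the function name `bundle_dependencies` and all file names are hypothetical.

```python
import zipfile
from pathlib import Path


def bundle_dependencies(app_jar, dependency_jars, output_jar):
    """Write output_jar with app_jar's entries plus each dependency under lib/.

    Assumption: Hadoop adds lib/*.jar inside the job JAR to the classpath.
    """
    with zipfile.ZipFile(app_jar) as src, \
            zipfile.ZipFile(output_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        # Copy the application's own classes and resources unchanged.
        for entry in src.infolist():
            dst.writestr(entry, src.read(entry.filename))
        # Nest each dependency jar under lib/ in the new archive.
        for dep in dependency_jars:
            dst.write(dep, arcname="lib/" + Path(dep).name)
    return output_jar
```

The resulting JAR can then be uploaded to S3 and used in place of the original application JAR.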


FYI: newer versions of EMR no longer use /home/hadoop/lib. Use /usr/lib/hadoop-mapreduce instead.
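If the same script has to run on both older and newer EMR versions, one possible approach is to probe for the directory at run time. The sketch below is only that: it assumes that checking which directory exists is a good enough signal of the EMR version, and the name `pick_lib_dir` is hypothetical.

```python
import os

# Checked in order: the newer location first, then the older one.
CANDIDATE_LIB_DIRS = ("/usr/lib/hadoop-mapreduce", "/home/hadoop/lib")


def pick_lib_dir(candidates=CANDIDATE_LIB_DIRS):
    """Return the first candidate directory that exists on this node."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    raise FileNotFoundError(
        "none of the candidate lib directories exist: " + ", ".join(candidates)
    )
```

The returned path can then be used as the destination when copying the jars.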