
pass Hadoop arguments into Java code


You can pass arguments in two ways: with the -D option on the command line, or by setting them on the Configuration object in code. The -D option is only picked up when your driver implements the Tool interface and is launched through ToolRunner. Otherwise you have to set the configuration variables yourself with conf.set.

Passing parameters using -D:

hadoop jar example.jar com.example.driver -D property=value /input/path /output/path

Passing parameters using Configuration:

Configuration conf = new Configuration();
conf.set("property", "value");
Job job = new Job(conf);

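Note: all configuration variables have to be set before the Job is created, because the Job takes its own copy of the Configuration and later conf.set calls on the original object do not reach the job.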
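
As a rough illustration of that ordering rule, here is a plain-Java analogy rather than Hadoop code. The class ToyJob is hypothetical; it copies the map it is given, which is the behaviour assumed here for Job and its Configuration, so a value set after construction is invisible to it.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigCopySketch {

    /** Hypothetical stand-in for Job: keeps a private copy of the settings it is given. */
    static class ToyJob {
        private final Map<String, String> conf;

        ToyJob(Map<String, String> conf) {
            // Copy the map, so later changes to the caller's map are not seen here.
            this.conf = new HashMap<>(conf);
        }

        String get(String key) {
            return conf.get(key);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("property", "value");       // set before the job is created
        ToyJob job = new ToyJob(conf);
        conf.put("late.property", "value");  // set after the job is created

        System.out.println(job.get("property"));       // prints "value"
        System.out.println(job.get("late.property"));  // prints "null"
    }
}
```

With the real API the same ordering applies: call conf.set first, then create the Job from that conf.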


The driver class should implement the Tool interface, which lets you use ToolRunner to run your MapReduce job:

public class MRDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        /*...*/
    }
}

You can then launch the job like this:

public static void main(String[] args) throws Exception {
    int res = ToolRunner.run(new MRDriver(), args);
    System.exit(res);
}

This means that ToolRunner parses the generic command-line options (such as -D key=value) into the current Configuration instance. Any remaining arguments are passed on to run().
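
To make that parsing step more concrete, here is a simplified stand-in written in plain Java. It is not Hadoop's GenericOptionsParser (which ToolRunner delegates to and which handles more options than -D), only a sketch of the behaviour described above: both the -Dkey=value and -D key=value forms go into a map, and everything else is handed back as the remaining arguments. The class and method names are made up for the illustration, and stopping at the first non-option argument is an assumption that, as far as I know, matches Hadoop's rule that generic options come before the job's own arguments.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DOptionSketch {

    /**
     * Copies leading -D options into conf and returns the remaining arguments.
     * Simplified: only -D is handled, and parsing stops at the first other argument.
     */
    static String[] parse(String[] args, Map<String, String> conf) {
        int i = 0;
        while (i < args.length) {
            String arg = args[i];
            String pair;
            if (arg.equals("-D") && i + 1 < args.length) {
                pair = args[i + 1];        // form: -D key=value
                i += 2;
            } else if (arg.startsWith("-D") && arg.length() > 2) {
                pair = arg.substring(2);   // form: -Dkey=value
                i += 1;
            } else {
                break;                     // first non -D argument ends option parsing
            }
            int eq = pair.indexOf('=');
            if (eq > 0) {                  // pairs without a key or '=' are skipped
                conf.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        List<String> rest = new ArrayList<>();
        for (; i < args.length; i++) {
            rest.add(args[i]);
        }
        return rest.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        String[] sample = {"-Denv1=prod1", "-D", "env2=prod2", "/input/path"};
        String[] rest = parse(sample, conf);
        System.out.println(conf);                    // prints {env1=prod1, env2=prod2}
        System.out.println(String.join(" ", rest));  // prints /input/path
    }
}
```

In a real driver none of this needs to be written by hand: ToolRunner.run does the parsing, and run() receives only the leftover arguments.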

Suppose you run the job from the console with the following command:

hadoop jar munge-data.jar -Denv1=prod1 -Denv2=prod2

Then, in the run() method, you can read those values from the Configuration:

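public int run(String[] args) throws Exception {
    Configuration conf = getConf();       // already populated by ToolRunner
    String env1 = conf.get("env1");       // "prod1" for the command above
    String env2 = conf.get("env2");       // "prod2" for the command above
    Job job = new Job(conf, "MR Job");
    job.setJarByClass(MRDriver.class);
    /*...*/
}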