Using ENV vars in distributed Hadoop cluster


You can force "cluster-wide" environment variables via mapred-site.xml and yarn-site.xml, but I'm not 100% sure which properties must be set in the configuration of the ResourceManager service, of every NodeManager service, and/or of the client nodes, nor which level overrides (or adds to) which level. You will have to do some research and experimentation.

Look into the documentation for mapred-default.xml and yarn-default.xml (e.g. the Hadoop 2.7.0 versions) for properties such as:

  • mapred.child.env
  • mapreduce.admin.user.env
  • yarn.app.mapreduce.am.env
  • yarn.app.mapreduce.am.admin.user.env
  • yarn.nodemanager.admin-env
  • yarn.nodemanager.env-whitelist

[Edit] Also look into these properties, which have no proper entry in the "default" listings (yet another documentation bug...), and forget about the "mapred.child" stuff:

  • mapreduce.map.env
  • mapreduce.reduce.env
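As a rough illustration of the two properties above, a mapred-site.xml fragment might look like the sketch below. MY_VAR and its value are hypothetical placeholders. The value format (a comma-separated list of NAME=value pairs) follows the convention documented for the older mapred.child.env property. Whether this belongs in the client configuration, the NodeManager configuration, or both is exactly the open question mentioned earlier, so treat it as a starting point for experimentation rather than a verified setup.

```xml
<!-- Sketch only: MY_VAR and OTHER_VAR are hypothetical variable names -->
<configuration>
  <property>
    <!-- environment passed to map task containers -->
    <name>mapreduce.map.env</name>
    <value>MY_VAR=some_value,OTHER_VAR=other_value</value>
  </property>
  <property>
    <!-- environment passed to reduce task containers -->
    <name>mapreduce.reduce.env</name>
    <value>MY_VAR=some_value,OTHER_VAR=other_value</value>
  </property>
</configuration>
```

The same properties can usually be passed per job on the command line (e.g. -Dmapreduce.map.env=MY_VAR=some_value) when the driver goes through ToolRunner / GenericOptionsParser, which is a convenient way to test the effect before touching cluster-wide files.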


For Oozie jobs, there are several ways to set env. variables, depending on the action type:

  • Shell actions have an explicit <env-var>VAR=VALUE</env-var> syntax, because shell scripts rely a lot on env. variables
  • all actions that use a "launcher" YARN job (i.e. Java, Pig, Sqoop, Spark, Hive, Hive2, Shell...) can benefit from a
      <property>
        <name>oozie.launcher.xxx.xxx.xxx.env</name><value>****</value>
      </property>
    to override the values in client config files that are mentioned above
  • MapReduce actions are launched directly, without a "launcher" job, so the property would be set directly as
      <property>
        <name>xxx.xxx.xxx.env</name><value>****</value>
      </property>
  • in addition, the actions defined in the core Workflow schema (i.e. Java, Pig, MapReduce) can use the <global> section to define the property just once
    => alas, the other actions are defined as plug-ins with their own distinct XML schemas, and do not inherit the Global properties...
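To make the first two options concrete, here is a sketch of a Shell action that uses both at once. Everything specific in it is an assumption for illustration: the action and script names, MY_VAR, the schema version, and above all the choice of mapreduce.map.env as the property behind the oozie.launcher. prefix (picked on the assumption that the launcher runs as a single map task). Check it against your Oozie version before relying on it.

```xml
<!-- Sketch only: names, values and the chosen launcher property are assumptions -->
<action name="shell-node">
  <shell xmlns="uri:oozie:shell-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
      <property>
        <!-- env. variable for the launcher job itself -->
        <name>oozie.launcher.mapreduce.map.env</name>
        <value>MY_VAR=some_value</value>
      </property>
    </configuration>
    <exec>myscript.sh</exec>
    <!-- env. variable visible to the shell script -->
    <env-var>MY_VAR=some_value</env-var>
    <file>myscript.sh</file>
  </shell>
  <ok to="end"/>
  <error to="fail"/>
</action>
```

For a Shell action the <env-var> element alone is normally enough; the launcher property is shown only to illustrate where it would go for the other launcher-based actions.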

Unfortunately the documentation for Oozie (e.g. for Oozie 4.1) is completely silent about the oozie.launcher.* properties; you will have to do some research on Stack Overflow, in that post for example.