Is Hadoop's ToolRunner thread-safe?
Confirmed that ToolRunner is NOT thread-safe:
Original code (which runs into problems):
    public static int run(Configuration conf, Tool tool, String[] args) throws Exception {
        if (conf == null) {
            conf = new Configuration();
        }
        GenericOptionsParser parser = new GenericOptionsParser(conf, args);
        // set the configuration back, so that Tool can configure itself
        tool.setConf(conf);
        // get the args w/o generic hadoop args
        String[] toolArgs = parser.getRemainingArgs();
        return tool.run(toolArgs);
    }
New code (which works):
    public static int run(Configuration conf, Tool tool, String[] args) throws Exception {
        if (conf == null) {
            conf = new Configuration();
        }
        GenericOptionsParser parser = getParser(conf, args);
        tool.setConf(conf);
        // get the args w/o generic hadoop args
        String[] toolArgs = parser.getRemainingArgs();
        return tool.run(toolArgs);
    }

    private static synchronized GenericOptionsParser getParser(Configuration conf, String[] args) throws Exception {
        return new GenericOptionsParser(conf, args);
    }
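The fix only serializes construction of the parser; tool.run() still executes concurrently. Below is a minimal, Hadoop-free sketch of that pattern, assuming the race sits in a constructor that touches unguarded static state. FragileParser, SynchronizedFactoryDemo and runDemo are hypothetical names made up for illustration and are not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SynchronizedFactoryDemo {

    // Hypothetical stand-in for a class whose constructor is not thread-safe
    // because it mutates shared static state without any locking.
    static final class FragileParser {
        private static int constructed = 0; // shared and deliberately unguarded

        FragileParser() {
            constructed++; // read-modify-write: racy if two constructors overlap
        }

        static int constructedCount() {
            return constructed;
        }
    }

    // Same shape as getParser() in the post: a static synchronized factory,
    // so only one thread at a time can be inside the constructor.
    private static synchronized FragileParser getParser() {
        return new FragileParser();
    }

    // Builds threads * perThread parsers concurrently and returns how many
    // constructions were recorded during this call.
    public static int runDemo(int threads, int perThread) throws Exception {
        int before = FragileParser.constructedCount();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> futures = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            futures.add(pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    getParser();
                }
            }));
        }
        for (Future<?> f : futures) {
            f.get(); // wait for the worker and surface any exception
        }
        pool.shutdown();
        return FragileParser.constructedCount() - before;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("parsers constructed: " + runDemo(8, 10000));
    }
}
```

Because every construction goes through the synchronized factory, no increments are lost and the count equals threads * perThread. Calling new FragileParser() directly from the workers would remove that guarantee. The lock covers only the constructor call, so the work done after construction still runs in parallel.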