Hadoop setInputPathFilter error

Alternatively, you can list the files in the given directory whose names begin with train and pass them to the job as explicit input paths. For example:

        Job job = new Job(conf, "myJob");
        List<Path> inputPaths = new ArrayList<Path>();
        String basePath = "/user/hadoop/path";
        FileSystem fs = FileSystem.get(conf);
        // Expand the glob so that only entries starting with "train" are returned
        FileStatus[] listStatus = fs.globStatus(new Path(basePath + "/train*"));
        for (FileStatus fstat : listStatus) {
            inputPaths.add(fstat.getPath());
        }
        FileInputFormat.setInputPaths(job,
                inputPaths.toArray(new Path[inputPaths.size()]));

As a quick fix, you can blacklist paths instead of whitelisting them: for example, return false if the path contains "test" and true otherwise.
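Here is a minimal sketch of the difference between the two rules, written on plain strings so it runs without Hadoop on the classpath. The class name BlacklistFilterSketch and the file names train1 and test1 are made up for illustration; in a real job the same check would go inside PathFilter.accept(Path). It assumes the filter is also applied to the input directory itself, not only to the files inside it.

```java
import java.util.Arrays;
import java.util.List;

public class BlacklistFilterSketch {

    // Whitelist rule: accept only paths that contain "train".
    static boolean whitelistAccept(String path) {
        return path.contains("train");
    }

    // Blacklist rule: reject only paths that contain "test".
    static boolean blacklistAccept(String path) {
        return !path.contains("test");
    }

    public static void main(String[] args) {
        // The directory comes from the answer above; the two file names are hypothetical.
        List<String> paths = Arrays.asList(
                "/user/hadoop/path",
                "/user/hadoop/path/train1",
                "/user/hadoop/path/test1");
        for (String p : paths) {
            System.out.println(p
                    + " whitelist=" + whitelistAccept(p)
                    + " blacklist=" + blacklistAccept(p));
        }
    }
}
```

Under that assumption, the whitelist rule rejects the directory /user/hadoop/path because its name does not contain "train", while the blacklist rule lets the directory through and still drops the test file.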

You can get a FileSystem instance by having your filter implement the Configurable interface (or extend the Configured class) and initializing a fileSystem instance variable in the setConf method:

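    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.PathFilter;

    class TrainFilter extends Configured implements PathFilter {
        FileSystem fileSystem;

        // Must be public because it implements an interface method
        @Override
        public boolean accept(Path path) {
            // TODO: use fileSystem here to determine if path is a directory
            return path.toString().contains("train");
        }

        @Override
        public void setConf(Configuration conf) {
            super.setConf(conf); // keep getConf() working
            if (conf != null) {
                try {
                    fileSystem = FileSystem.get(conf);
                } catch (IOException e) {
                    // setConf cannot throw a checked exception, so wrap it
                    throw new RuntimeException(e);
                }
            }
        }
    }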