Adding new files to a running hadoop cluster


Unfortunately, this is not possible with MapReduce. When you submit a MapReduce job, part of the setup process is determining the block locations of your input. If the input is only partially present at that point, the job will process only those blocks and won't dynamically pick up inputs that arrive later.
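Conceptually, the split calculation behaves like a one-time snapshot of the input directory. The following is a minimal Python sketch of that behavior (the helper names are hypothetical illustrations, not Hadoop APIs):

```python
import os
import tempfile

def compute_splits(input_dir):
    # Analogous to the job-setup step: the list of input files is
    # captured once, at submission time.
    return sorted(os.listdir(input_dir))

def run_job(splits, input_dir):
    # The job only ever processes the splits captured at setup.
    outputs = []
    for name in splits:
        with open(os.path.join(input_dir, name)) as f:
            outputs.append(f.read())
    return outputs

with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "part-0"), "w") as f:
        f.write("early data")
    splits = compute_splits(d)  # setup: snapshot of the input
    with open(os.path.join(d, "part-1"), "w") as f:
        f.write("late data")    # arrives after the job has started
    results = run_job(splits, d)
    print(results)              # only the early file is processed
```

A file written after `compute_splits` ran is simply never seen by the job, which is exactly why adding files to a running MapReduce job has no effect.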

If you are looking for a stream processor that can pick up new data as it arrives, have a look at Apache Storm (https://storm.apache.org/) or Apache Spark (https://spark.apache.org/).
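By contrast, a stream processor repeatedly discovers new input rather than snapshotting it once. The loop below is a framework-free Python sketch of that polling model (hypothetical code for illustration, not Storm or Spark API calls):

```python
import os
import tempfile

def poll_new_files(input_dir, seen):
    # Each micro-batch re-lists the directory and processes only
    # files that have not been seen in a previous batch.
    batch = []
    for name in sorted(os.listdir(input_dir)):
        if name not in seen:
            seen.add(name)
            with open(os.path.join(input_dir, name)) as f:
                batch.append(f.read())
    return batch

with tempfile.TemporaryDirectory() as d:
    seen = set()
    with open(os.path.join(d, "a.txt"), "w") as f:
        f.write("first")
    batch1 = poll_new_files(d, seen)  # picks up a.txt
    with open(os.path.join(d, "b.txt"), "w") as f:
        f.write("second")
    batch2 = poll_new_files(d, seen)  # picks up only the new b.txt
    print(batch1, batch2)
```

Because input discovery happens on every batch instead of once at setup, files added to the directory while the processor is running are still consumed.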