Copy Files from NFS or Local FS to HDFS
For distcp to work, the local file must be accessible from every worker node in the cluster, either through a mount point on each node that exposes a shared NFS location, or by physically copying the file to the local file system of every node.
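As a rough sketch of what the distcp invocation could look like once the source is visible on every node: the paths below are hypothetical placeholders, and the small run helper is not part of Hadoop. It only prints the command by default so you can check it before executing anything.

```shell
# Hypothetical paths; replace with your own.
SRC_DIR="${SRC_DIR:-/mnt/nfs/data}"             # assumed to exist at the same path on every worker node
DEST_URI="${DEST_URI:-hdfs:///user/me/data}"    # assumed HDFS destination

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# The file:// scheme tells distcp to read the source from the local (or NFS-mounted) file system.
run hadoop distcp "file://$SRC_DIR" "$DEST_URI"
```

Set DRY_RUN=0 to actually launch the copy. Because the map tasks run on the worker nodes, each of them resolves the file:// path locally, which is why the source has to be present on all of them.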
Alternatively, hdfs dfs -put (or -copyFromLocal) could still work if you increase the heap size of the Hadoop client:
$ export HADOOP_CLIENT_OPTS="-Xmx4096m $HADOOP_CLIENT_OPTS"
But as you said, the transfer will be slower than with distcp, since a single client process copies the data instead of parallel map tasks.
You can also try setting the following property in core-site.xml to export an HDFS path over NFS, mounting it on a local directory, and then copying the files from your NFS share to that mount:
dfs.nfs3.export.point=[your hdfs path]
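Once the export point is configured and the NFS gateway is running, the mount-and-copy step might look roughly like the sketch below. The gateway hostname, export path, mount point, and source directory are all hypothetical, and the run helper is just a dry-run wrapper, not a Hadoop tool. The mount options assume the gateway serves NFSv3 over TCP without lock manager support.

```shell
# All values below are hypothetical placeholders.
NFS_GATEWAY="${NFS_GATEWAY:-nfsgw.example.com}"  # host running the HDFS NFS gateway
EXPORT_PATH="${EXPORT_PATH:-/user/me/data}"      # should match the configured export point
MOUNT_POINT="${MOUNT_POINT:-/mnt/hdfs}"          # local directory to mount onto
SRC_DIR="${SRC_DIR:-/mnt/nfs/data}"              # files to copy into HDFS

# Print each step when DRY_RUN=1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

run mkdir -p "$MOUNT_POINT"
# vers=3 and proto=tcp select NFSv3 over TCP; nolock disables NFS file locking.
run mount -t nfs -o vers=3,proto=tcp,nolock "$NFS_GATEWAY:$EXPORT_PATH" "$MOUNT_POINT"
# Copy the contents of the source directory into the mounted HDFS path.
run cp -r "$SRC_DIR/." "$MOUNT_POINT/"
```

With DRY_RUN=0 the mount step normally needs root privileges. Like hdfs dfs -put, this copies through a single client, so it trades distcp's parallelism for convenience.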