How to save data in HDFS with Spark?
The path has to be a directory in HDFS.
For example, to save the files inside a folder named myNewFolder under the HDFS root (/), use the path hdfs://namenode_ip:port/myNewFolder/
When the Spark job runs, the directory myNewFolder will be created.
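As a rough illustration, here is a minimal PySpark sketch of writing to such a path. The helper names hdfs_path and save_to_hdfs, the sample rows, and the CSV output format are assumptions made for this example only; namenode_ip and the port have to be replaced with the values of your own cluster.

```python
def hdfs_path(namenode_host: str, port: int, directory: str) -> str:
    """Build a fully qualified HDFS URI for a directory under the root /.

    Hypothetical helper for this example, not part of Spark or Hadoop.
    """
    return f"hdfs://{namenode_host}:{port}/{directory.strip('/')}/"


def save_to_hdfs(output_dir: str) -> None:
    """Write a tiny sample DataFrame as CSV files into output_dir."""
    # Imported here so the path helper above works without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("save-to-hdfs-example").getOrCreate()

    # Sample data, invented purely for illustration.
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

    # Spark creates output_dir itself and writes part files into it.
    # With the default save mode the job fails if the directory already exists.
    df.write.csv(output_dir)

    spark.stop()
```

On a machine that can reach the cluster, calling save_to_hdfs(hdfs_path("namenode_ip", 8020, "myNewFolder")) should leave the part files under /myNewFolder in HDFS. The port 8020 is only a commonly used NameNode port and is assumed here; use whatever fs.defaultFS in your core-site.xml specifies.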
The datanode data directory, which is set by dfs.datanode.data.dir in hdfs-site.xml, is where the datanode stores the blocks of the files you put in HDFS on its local file system.
It should not be referenced as an HDFS directory path.