Curl download to HDFS
The hdfs dfs -put
command can accept file input from stdin, using the familiar idiom of specifying -
to mean stdin:
> curl -sS https://www.google.com/robots.txt | hdfs dfs -put - /robots.txt> hdfs dfs -ls /robots.txt-rw-r--r-- 3 cnauroth supergroup 6880 2017-07-06 09:07 /robots.txt
Another option is to use shell process substitution to allow treating the stdout of curl
(or really any command you choose) as if it were a file input to another command:
> hdfs dfs -put <(curl -sS https://www.google.com/robots.txt) /robots.txt> hdfs dfs -ls /robots.txt-rw-r--r-- 3 cnauroth supergroup 6880 2017-07-05 15:07 /robots.txt