How to specify AWS Access Key ID and Secret Access Key as part of an Amazon s3n URL
The documentation (http://wiki.apache.org/hadoop/AmazonS3) gives the format:
s3n://ID:SECRET@BUCKET/Path
I suggest you use this:
hadoop distcp \
  -Dfs.s3n.awsAccessKeyId=<your_access_id> \
  -Dfs.s3n.awsSecretAccessKey=<your_access_key> \
  s3n://origin hdfs://destinations
It also works as a workaround for slashes occurring in the key. The parameters with the ID and access key must be supplied in exactly this position: after distcp and before the origin.
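If you would rather not type the keys into the command each time, here is a minimal sketch of a wrapper that reads them from environment variables and puts the -D options in the position described above. The function name run_distcp and the HADOOP_BIN override are my own inventions for illustration, and I am assuming AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are already exported in your shell.

```shell
# Sketch only: run_distcp and HADOOP_BIN are hypothetical names, not part of Hadoop.
# Usage: run_distcp SOURCE DESTINATION
run_distcp() {
  # Refuse to run if either credential is missing from the environment.
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set" >&2
    return 1
  fi
  # The -D options go right after "distcp" and before the source URL.
  # HADOOP_BIN defaults to "hadoop"; set HADOOP_BIN=echo for a dry run.
  "${HADOOP_BIN:-hadoop}" distcp \
    "-Dfs.s3n.awsAccessKeyId=${AWS_ACCESS_KEY_ID}" \
    "-Dfs.s3n.awsSecretAccessKey=${AWS_SECRET_ACCESS_KEY}" \
    "$1" "$2"
}
```

Calling run_distcp s3n://origin hdfs://destinations then runs the same command as above. Because the values are quoted, a slash in the secret is passed through unchanged. Note that the keys still appear as command-line arguments of the hadoop process, so this only keeps them out of your scripts, not out of the process list.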
Passing the AWS credentials as part of the Amazon s3n URL is not normally recommended for security reasons, especially if that code is pushed to a repository hosting service (like GitHub). Ideally, set your credentials in conf/core-site.xml as:
<configuration>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>XXXXXX</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>XXXXXX</value>
  </property>
</configuration>
Or reinstall awscli on your machine:
pip install awscli