How to specify AWS Access Key ID and Secret Access Key as part of an Amazon s3n URL


I suggest you use this:

hadoop distcp \
  -Dfs.s3n.awsAccessKeyId=<your_access_id> \
  -Dfs.s3n.awsSecretAccessKey=<your_access_key> \
  s3n://origin hdfs://destinations

It also works as a workaround for slashes occurring in the secret key. The parameters with the access key ID and secret access key must be supplied in exactly this position: after distcp and before the origin.
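As a rough sketch of the same idea, the command can be wrapped in a small shell function that takes the keys from environment variables, so they do not have to be typed inline. This assumes AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are already exported. The function name distcp_with_keys and the HADOOP_CMD override are hypothetical, introduced only for illustration.

```shell
# Sketch only: distcp_with_keys and HADOOP_CMD are illustrative names,
# not part of Hadoop. The keys are assumed to be exported beforehand.
distcp_with_keys() {
  # $1 = source (e.g. s3n://origin), $2 = destination (e.g. hdfs://destinations)
  : "${AWS_ACCESS_KEY_ID:?AWS_ACCESS_KEY_ID is not set}"
  : "${AWS_SECRET_ACCESS_KEY:?AWS_SECRET_ACCESS_KEY is not set}"

  # The -D options go right after distcp and before the source path.
  # Quoting keeps a secret containing slashes as one intact argument.
  ${HADOOP_CMD:-hadoop} distcp \
    "-Dfs.s3n.awsAccessKeyId=${AWS_ACCESS_KEY_ID}" \
    "-Dfs.s3n.awsSecretAccessKey=${AWS_SECRET_ACCESS_KEY}" \
    "$1" "$2"
}

# Usage (uncomment to run the copy):
# distcp_with_keys s3n://origin hdfs://destinations
```

Setting HADOOP_CMD=echo before calling the function prints the assembled command instead of running it. That is handy for checking the argument order, but note that it also prints the secret key to the terminal.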


Passing AWS credentials as part of the Amazon s3n URL is generally not recommended for security reasons, especially if the code is pushed to a repository hosting service (like GitHub). Ideally, set your credentials in conf/core-site.xml as:

<configuration>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>XXXXXX</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>XXXXXX</value>
  </property>
</configuration>

or reinstall awscli on your machine:

pip install awscli