
DistCp fault tolerance between two remote clusters


distcp uses MapReduce to effect its distribution, error handling and recovery, and reporting.

Please see the Update and Overwrite section of the DistCp guide.

You can use the -overwrite option to avoid duplicates, and you can look at the -update option as well. If the network connection fails, you can re-initiate the copy with the -overwrite option once the connection is recovered.

See the examples of -update and -overwrite in the guide mentioned above.
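For illustration, a minimal sketch of the two invocations (the namenode hosts and paths here are placeholders, not from the original question):

    # -update skips files whose size and checksum already match on the
    # target, so re-running after a failure only copies what is missing
    # or changed.
    hadoop distcp -update hdfs://nn1.example.com:8020/data hdfs://nn2.example.com:8020/data

    # -overwrite unconditionally rewrites files that already exist on the
    # target, which is safer when partially written files from a failed
    # run should not be trusted.
    hadoop distcp -overwrite hdfs://nn1.example.com:8020/data hdfs://nn2.example.com:8020/data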


Here is the link for the refactored DistCp: https://hadoop.apache.org/docs/r2.7.2/hadoop-distcp/DistCp.html

As "@RamPrasad G" mentioned, I guess you have no option other than redo the distcp in case of network failure.

Some good reads:

Hadoop distcp network failures with WebHDFS

http://www.ghostar.org/2015/08/hadoop-distcp-network-failures-with-webhdfs/

Distcp between two HA Cluster

http://henning.kropponline.de/2015/03/15/distcp-two-ha-cluster/

Transferring Data to/from Altiscale via S3 using DistCp

https://documentation.altiscale.com/transferring-data-using-distcp

This page links to a shell script with retry logic, which could be helpful; a minimal sketch of the same idea follows below.
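A minimal retry-loop sketch, assuming a simple wrapper script (the source/target paths, retry count, and sleep interval are placeholders):

    #!/usr/bin/env bash
    # Retry wrapper around distcp: each attempt uses -update so that only
    # files still missing or changed on the target are copied again.
    SRC=hdfs://nn1.example.com:8020/data    # placeholder source path
    DST=hdfs://nn2.example.com:8020/data    # placeholder target path
    MAX_RETRIES=5
    SLEEP_SECONDS=60

    for attempt in $(seq 1 "$MAX_RETRIES"); do
        echo "distcp attempt $attempt of $MAX_RETRIES"
        if hadoop distcp -update "$SRC" "$DST"; then
            echo "distcp finished successfully"
            exit 0
        fi
        echo "distcp failed; retrying in $SLEEP_SECONDS seconds"
        sleep "$SLEEP_SECONDS"
    done

    echo "distcp failed after $MAX_RETRIES attempts" >&2
    exit 1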

Note: Thanks to original authors.