
Copy Solr HDFS Data to another Cluster


Follow the steps below to create a snapshot of the Solr HDFS folder and copy it to another cluster.

1. Allow snapshots on the collection directory

sudo -u hdfs hdfs dfsadmin -allowSnapshot /user/solr/SolrCollectionName

2. Create a snapshot with a specific name

sudo -u hdfs hdfs dfs -createSnapshot /user/solr/SolrCollectionName snapshotName

3. List the snapshot directory

hdfs dfs -ls /user/solr/SolrCollectionName/.snapshot

4. Copy the snapshot to the other cluster

sudo -u solr hadoop distcp hdfs://NNIP1:8020/user/solr/SolrCollectionName/.snapshot/snapshotName hdfs://NNIP2:8020/user/solr

5. Restore the snapshot (run on the destination cluster)

sudo -u solr hadoop fs -cp /user/solr/snapshotName/* /user/solr/SolrCollectionName/
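
For convenience, steps 1 to 5 can be wrapped in one small script. The following is only a sketch under assumptions: the NameNode addresses, collection name, and snapshot name are the same placeholders used in the steps, and the `run` helper, the `DRY_RUN` switch, and the `source`/`dest` arguments are made up for this example. By default it only prints the commands, so they can be reviewed before anything is executed.

```shell
#!/bin/sh
# Sketch of steps 1-5. Every value below is a placeholder; adjust before use.
set -eu

SRC_NN="${SRC_NN:-hdfs://NNIP1:8020}"          # source NameNode (placeholder)
DST_NN="${DST_NN:-hdfs://NNIP2:8020}"          # destination NameNode (placeholder)
SOLR_DIR="${SOLR_DIR:-/user/solr}"             # parent directory of the collection
COLLECTION="${COLLECTION:-SolrCollectionName}" # collection directory name (placeholder)
SNAPSHOT="${SNAPSHOT:-snapshotName}"           # snapshot name (placeholder)
DRY_RUN="${DRY_RUN:-1}"                        # 1 = print only, 0 = execute

# Hypothetical helper: print the command, execute it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Steps 1-4: run against the source cluster.
source_steps() {
  run sudo -u hdfs hdfs dfsadmin -allowSnapshot "$SOLR_DIR/$COLLECTION"
  run sudo -u hdfs hdfs dfs -createSnapshot "$SOLR_DIR/$COLLECTION" "$SNAPSHOT"
  run hdfs dfs -ls "$SOLR_DIR/$COLLECTION/.snapshot"
  run sudo -u solr hadoop distcp \
    "$SRC_NN$SOLR_DIR/$COLLECTION/.snapshot/$SNAPSHOT" "$DST_NN$SOLR_DIR"
}

# Step 5: run on the destination cluster.
# The glob is quoted so that Hadoop expands it, not the local shell.
dest_steps() {
  run sudo -u solr hadoop fs -cp "$SOLR_DIR/$SNAPSHOT/*" "$SOLR_DIR/$COLLECTION/"
}

# Pick the half to run from the first argument; do nothing without one.
case "${1:-}" in
  source) source_steps ;;
  dest)   dest_steps ;;
esac
```

Saved as, say, copy_solr_snapshot.sh (a hypothetical name), `sh copy_solr_snapshot.sh source` prints the source-side commands, and `DRY_RUN=0 sh copy_solr_snapshot.sh source` actually runs them.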


After a lot of trying, this is the solution we worked out:

- Initialise Solr in the second environment with all the collections set up the same way as on the primary.
- Take a snapshot of HDFS.
- Use hadoop fs -cp to copy the data up to the checkpoint.

After the first run the copy job will be quick, as you are only copying the increments.
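
One possible way to get that "only the increments" behaviour for the cross-cluster copy is `hadoop distcp -update`, which skips files that already match on the target. Note that this swaps in a different tool than the `-cp` mentioned above, so treat it as a sketch under assumptions rather than the procedure described here. The staging directory, the `run` helper, the `DRY_RUN` switch, and the `sync_snapshot` function are hypothetical names for this example. With `-update`, DistCp copies the contents of the source directory into the target directory instead of creating a subdirectory named after the source.

```shell
#!/bin/sh
# Sketch of a repeatable sync: new snapshot, then distcp -update to one target.
set -eu

SRC_NN="${SRC_NN:-hdfs://NNIP1:8020}"   # source NameNode (placeholder)
DST_NN="${DST_NN:-hdfs://NNIP2:8020}"   # destination NameNode (placeholder)
COLLECTION_DIR="${COLLECTION_DIR:-/user/solr/SolrCollectionName}"
# Hypothetical staging directory on the destination cluster.
STAGING_DIR="${STAGING_DIR:-/user/solr/SolrCollectionName_staging}"
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print only, 0 = execute

# Hypothetical helper: print the command, execute it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Create a snapshot named $1, then copy only what differs to the staging dir.
sync_snapshot() {
  snap="$1"
  run sudo -u hdfs hdfs dfs -createSnapshot "$COLLECTION_DIR" "$snap"
  run sudo -u solr hadoop distcp -update \
    "$SRC_NN$COLLECTION_DIR/.snapshot/$snap" "$DST_NN$STAGING_DIR"
}
```

A run could then look like `sync_snapshot "snap-$(date +%Y%m%d-%H%M%S)"`, using a fresh snapshot name each time so that earlier snapshots are not overwritten.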