Copy Solr HDFS Data to another Cluster
please follow below steps to create snapshot of solr_hdfs folder and move the same on another cluster
1.Allow snapshot
sudo -u hdfs hadoop dfsadmin -allowSnapshot /user/solr/SolrCollectionName
2.Create snapshot with a specific name
sudo -u hdfs hadoop dfs -createSnapshot /user/solr/SolrCollectionName/ snapshotName
3. To list to snapshot directory
hdfs dfs -ls /user/solr/solrcollectionName/.snapshot
4. To copy, execute below command
sudo -u solr hadoop distcp hdfs://NNIP1:8020/user/solr/collectionName/.snapshot/SanpshotName hdfs://NNIP2:8020/user/solr
5. To restore snapshot
sudo -u solr hadoop fs -cp /user/solr/SanpshotName/* /user/solr/SolrcollectionName/
After a lot of trying this is the solution we worked out.- Initialise solr in the second environment with all the collections in the same way as the primary.- Take a snapshot of HDFS- Use hadoop hdfs -cp to copy the data up to the checkpointAfter the first run the copy job will be quick as you are only copying the increments.