How to reindex data from one Elasticsearch cluster to another with elasticsearch-hadoop in Spark


You don't need to configure the node address in the SparkConf for this.

When you use a DataFrameReader or DataFrameWriter with the Elasticsearch format, you can pass the node address as an option on each one, as follows:

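    // Read from the source cluster; the node address is set on the reader only.
    val df = sqlContext.read
      .format("org.elasticsearch.spark.sql")
      .option("es.nodes", "node1.cluster1:9200")
      .load("your_index/your_type")

    // Write to the target cluster; the format must be set here too,
    // otherwise Spark falls back to its default data source.
    df.write
      .format("org.elasticsearch.spark.sql")
      .option("es.nodes", "node2.cluster2:9200")
      .save("your_new_index/your_new_type")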

This should work with Spark 1.6.x and the corresponding elasticsearch-hadoop connector.
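
If you need to copy several indices, the same pattern can be wrapped in a small helper. This is only a sketch under a few assumptions: the `Reindex` object, the `reindex` function and its parameter names are made up for illustration, the connector's fully qualified data source name `org.elasticsearch.spark.sql` is used as the format, and `SaveMode.Append` is my choice so that writing into an already existing target index does not fail. Adjust it to your setup.

```scala
import org.apache.spark.sql.{SQLContext, SaveMode}

// Hypothetical helper, not part of elasticsearch-hadoop.
object Reindex {
  // Fully qualified name of the elasticsearch-hadoop Spark SQL data source.
  val EsFormat = "org.elasticsearch.spark.sql"

  // Reads sourceResource from one cluster and writes it to targetResource
  // on another. The extra option maps are forwarded to the reader and the
  // writer respectively; "es.nodes" is always set from the explicit arguments.
  def reindex(sqlContext: SQLContext,
              sourceNodes: String,
              sourceResource: String,
              targetNodes: String,
              targetResource: String,
              extraReadOptions: Map[String, String] = Map.empty,
              extraWriteOptions: Map[String, String] = Map.empty): Unit = {
    val df = sqlContext.read
      .format(EsFormat)
      .options(extraReadOptions + ("es.nodes" -> sourceNodes))
      .load(sourceResource)

    df.write
      .format(EsFormat)
      .options(extraWriteOptions + ("es.nodes" -> targetNodes))
      .mode(SaveMode.Append) // assumption: append to the target index
      .save(targetResource)
  }
}
```

It would be called like this, with the same placeholder addresses as above:

```scala
Reindex.reindex(
  sqlContext,
  "node1.cluster1:9200", "your_index/your_type",
  "node2.cluster2:9200", "your_new_index/your_new_type")
```

Because each reader and writer carries its own `es.nodes` value, nothing cluster-specific has to live in the SparkConf.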