
Spark Dataframe upsert to Elasticsearch


The reason mode("Overwrite") was a problem is that overwriting with the entire dataframe deletes all the data that matches the dataframe's rows at once, so the entire index looked empty to me. I figured out how to actually upsert instead.

Here is my code:

df.write
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes.wan.only", "true")
  .option("es.nodes.discovery", "false")
  .option("es.nodes.client.only", "false")
  .option("es.net.ssl", "true")
  .option("es.mapping.id", index)
  .option("es.write.operation", "upsert")
  .option("es.nodes", esURL)
  .option("es.port", "443")
  .mode("append")
  .save(path)

Note that you have to set both .option("es.write.operation", "upsert") and .mode("append").
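If the same settings are needed in several jobs, one way to keep them together is a small helper that returns them as a Map, which can then be passed to the writer's options(...) method. This is only a sketch: the EsUpsert object, the upsertOptions function, and the "id" column name in the usage comment are hypothetical names made up for illustration, not part of elasticsearch-hadoop. The option keys and values are the ones from the code above.

```scala
// Hypothetical helper (not part of elasticsearch-hadoop): gathers the
// connector settings from the answer above into one immutable Map.
object EsUpsert {
  def upsertOptions(
      esUrl: String,
      idColumn: String,
      port: String = "443"
  ): Map[String, String] =
    Map(
      "es.nodes"             -> esUrl,
      "es.port"              -> port,
      "es.net.ssl"           -> "true",
      "es.nodes.wan.only"    -> "true",
      "es.nodes.discovery"   -> "false",
      "es.nodes.client.only" -> "false",
      // Column whose value is used as the Elasticsearch document id.
      "es.mapping.id"        -> idColumn,
      // Update documents whose id already exists, insert the rest.
      "es.write.operation"   -> "upsert"
    )
}

// Usage with a Spark DataFrame. This part needs Spark and the
// elasticsearch-spark connector on the classpath, so it is shown
// as a comment only; df, esURL and path are the names from the answer.
//
//   df.write
//     .format("org.elasticsearch.spark.sql")
//     .options(EsUpsert.upsertOptions(esURL, "id"))
//     .mode("append")
//     .save(path)
```

Keeping "es.mapping.id" and "es.write.operation" in the same place makes it harder to set one without the other, since an upsert needs a document id to match on.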


Try setting:

es.write.operation = upsert

This should perform the required operation. You can find more details in https://www.elastic.co/guide/en/elasticsearch/hadoop/current/configuration.html