Elasticsearch for Spark 3.0


Spark 3.0.0 relies on Scala 2.12, which elasticsearch-hadoop does not support yet. This and a few further issues prevent us from using Spark 3.0.0 together with Elasticsearch. If you want to compile it yourself, there is a pull request on elasticsearch-hadoop (https://github.com/elastic/elasticsearch-hadoop/pull/1308) which should at least allow using Scala 2.12. I am not sure whether it fixes the other issues as well.
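If you want to try that pull request locally, one possible approach is to fetch it through the `pull/<number>/head` ref that GitHub exposes for every pull request. This is only a sketch: the helper name `checkout_pr` and the local branch name `pr-<number>` are made up for illustration, and it assumes you run it inside a clone of elasticsearch-hadoop whose remote is called `origin`.

```shell
#!/usr/bin/env bash
# checkout_pr PR_NUMBER [REMOTE]
# Hypothetical helper: fetch a GitHub pull request into a local branch
# named pr-PR_NUMBER and switch to that branch.
checkout_pr() {
  local pr="${1:-}"
  local remote="${2:-origin}"
  if [ -z "$pr" ]; then
    echo "usage: checkout_pr PR_NUMBER [REMOTE]" >&2
    return 2
  fi
  # GitHub publishes the head of each pull request under refs/pull/<n>/head.
  git fetch "$remote" "pull/$pr/head:pr-$pr" && git checkout "pr-$pr"
}
```

Inside the clone you would then run `checkout_pr 1308` before building. Whether the branch still builds against the current Gradle setup is not something this sketch can guarantee.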


It is not official yet, but you can compile the dependency yourself from https://github.com/elastic/elasticsearch-hadoop. The steps are:

  1. git clone https://github.com/elastic/elasticsearch-hadoop.git
  2. cd elasticsearch-hadoop/
  3. vim ~/.bashrc
  4. Add this line to ~/.bashrc (adjust the path to your JDK 8 installation): export JAVA8_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
  5. source ~/.bashrc
  6. ./gradlew elasticsearch-spark-30:distribution --console=plain

Finally, you can find the .jar package in the folder "elasticsearch-hadoop/spark/sql-30/build/distributions"; elasticsearch-spark-30_2.12-8.0.0-SNAPSHOT.jar is the Elasticsearch package for Spark 3.0.
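Once the jar is built, you can hand it to Spark with the standard `--jars` option. The sketch below only locates the jar and prints the command to run. The helper name `find_es_spark_jar` and the `DIST_DIR` variable are made up for illustration, and the default path assumes the checkout sits in the current directory.

```shell
#!/usr/bin/env bash
# find_es_spark_jar DIR
# Hypothetical helper: print the first elasticsearch-spark-30 jar in DIR,
# skipping any -sources or -javadoc jars that may sit next to it.
find_es_spark_jar() {
  local dist_dir="$1"
  local jar
  for jar in "$dist_dir"/elasticsearch-spark-30_2.12-*.jar; do
    case "$jar" in
      *-sources.jar|*-javadoc.jar) continue ;;
    esac
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  return 1
}

# Assumed default: the distributions folder of a checkout in the current directory.
DIST_DIR="${DIST_DIR:-elasticsearch-hadoop/spark/sql-30/build/distributions}"

if jar_path="$(find_es_spark_jar "$DIST_DIR")"; then
  # Print the command instead of running it, so you can review it first.
  echo "spark-shell --jars $jar_path"
else
  echo "No elasticsearch-spark-30 jar found in $DIST_DIR; run the Gradle build first." >&2
fi
```

The same `--jars` option works with `spark-submit` and `pyspark`. Since this is a SNAPSHOT build, treat it as experimental.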