Has anyone been able to use Elasticsearch X-Pack SQL with Spark?
I don't think this feature is supported. An alternative in PySpark would be to use the JDBC driver, which I tried as follows:
es_df = spark.read.jdbc(url="jdbc:es://http://192.168.1.71:9200", table = "(select * from eg_flight) mytable")
and I got the following error:
Py4JJavaError: An error occurred while calling o2488.jdbc.
: java.sql.SQLFeatureNotSupportedException: Found 1 problem(s)
line 1:8: Unexecutable item...
Another option is to use plain Python with the requests library, as below, but I wouldn't recommend it for large datasets.
import requests as r
import json

es_template = {"query": "select * from eg_flight"}
es_link = "http://192.168.1.71:9200/_xpack/sql"
headers = {'Content-type': 'application/json'}

if __name__ == "__main__":
    load = r.post(es_link, data=json.dumps(es_template), headers=headers)
    if load.status_code == 200:
        load = load.json()
        # do something with it
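For the "do something with it" part, here is a rough sketch of how the JSON response could be turned into a list of dicts. It assumes the SQL endpoint returns "columns" and "rows" keys, plus a "cursor" key while more pages remain, and that later pages omit "columns". The names fetch_all and ES_SQL_URL and the post parameter are made up for illustration; check the response shape against your Elasticsearch version before relying on it.

```python
import json

# Assumption: same host and endpoint as in the snippet above.
ES_SQL_URL = "http://192.168.1.71:9200/_xpack/sql"
HEADERS = {"Content-type": "application/json"}


def fetch_all(query, post=None, url=ES_SQL_URL):
    """Run `query` and follow the "cursor" field until no pages remain.

    `post` is a hypothetical hook so the HTTP call can be swapped out;
    it defaults to requests.post.
    """
    if post is None:
        import requests  # only needed when no custom `post` is supplied
        post = requests.post

    body = {"query": query}
    names = None
    records = []
    while True:
        resp = post(url, data=json.dumps(body), headers=HEADERS)
        resp.raise_for_status()
        page = resp.json()
        if names is None:
            # Column metadata is assumed to arrive only with the first page.
            names = [col["name"] for col in page["columns"]]
        records.extend(dict(zip(names, row)) for row in page.get("rows", []))
        cursor = page.get("cursor")
        if not cursor:
            return records
        # Later pages are requested by sending the cursor back.
        body = {"cursor": cursor}
```

Calling fetch_all("select * from eg_flight") should give one dict per row. That list could probably be passed to spark.createDataFrame(records) if you need a Spark DataFrame, but everything still goes through the driver, so the caveat about large datasets applies here too.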