How to do predicate pushdown with array_contains and ElasticSearch data source?


The failure to push down is expected. For a predicate to be delegated, the data source has to support it, and the ElasticSearch connector doesn't list array_contains among its pushed-down operations, which as of today include:

  • =, >, <, >=, <=
  • is_null / is_not_null
  • in
  • String[Starts|Ends]With, StringContains
  • NULL safe equality.
  • Application of Boolean operators AND / OR / NOT.

Also, any additional transformation (including CAST) disables predicate pushdown.
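
To check which predicates actually reached the connector, you can look at the PushedFilters entry that Spark prints in the physical plan from explain(). Here is a rough sketch in plain Python, assuming you have copied the plan text into a string. The helper names are made up for illustration, and the match is a simple substring test rather than a real parser.

```python
def pushed_filters_section(plan_text):
    """Return the part of the plan line that starts at 'PushedFilters:'.

    Returns an empty string when the plan text has no such entry.
    """
    for line in plan_text.splitlines():
        idx = line.find("PushedFilters:")
        if idx != -1:
            return line[idx:]
    return ""


def looks_pushed(plan_text, filter_name):
    """Heuristic: does filter_name appear after 'PushedFilters:'?

    Substring match only, so a column whose name equals filter_name
    would give a false positive.
    """
    return filter_name in pushed_filters_section(plan_text)
```

For an array_contains filter you would expect looks_pushed(plan, "ArrayContains") to be False, with the predicate showing up only in a Filter node above the scan.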


array_contains does not generate a data source filter predicate, so no connector could ever have a chance to support it for predicate pushdown.


array_contains creates an ArrayContains Catalyst predicate expression, which is not converted to a data source filter predicate when the DataSourceStrategy planning strategy is requested to translateFilter.

The Contains predicate expression is among the supported expressions, but ArrayContains is not. I think you should report it in the Spark JIRA issue tracker.
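
In the meantime, one possible workaround is to hand the filter to Elasticsearch yourself through the connector's es.query setting, rather than relying on Spark to push it down. This is a minimal sketch in Python, assuming a PySpark session with the elasticsearch-hadoop connector available. The helper names, the field and the resource are hypothetical. It also assumes an exact-match (keyword-style) field, where an Elasticsearch term query matches a document if any element of the array equals the value.

```python
import json


def array_contains_query(field, value):
    """Build an Elasticsearch query DSL string that mimics array_contains.

    Assumption: `field` is mapped for exact matching, so a term query
    matches when any element of the array equals `value`.
    """
    return json.dumps({"query": {"term": {field: value}}})


def read_with_manual_filter(spark, resource, field, value):
    """Load `resource` through the ES connector, filtering on the ES side.

    `spark` is an existing SparkSession; `resource` is the index to read.
    """
    return (
        spark.read.format("org.elasticsearch.spark.sql")
        .option("es.query", array_contains_query(field, value))
        .load(resource)
    )
```

For example, read_with_manual_filter(spark, "my_index", "tags", "spark") would return only documents whose tags array contains "spark". The filtering happens inside Elasticsearch before any rows reach Spark.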