SQL LIKE in Spark SQL
You are only a little bit off. Spark SQL and Hive follow SQL standard conventions, where the LIKE
operator accepts only two special characters:
_ (underscore) - which matches an arbitrary character.
% (percent) - which matches an arbitrary sequence of characters.
Square brackets have no special meaning, and [4,8] matches only a literal [4,8]:
spark.sql("SELECT '[4,8]' LIKE '[4,8]'").show
+----------------+
|[4,8] LIKE [4,8]|
+----------------+
|            true|
+----------------+
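To make the rule concrete, here is a minimal sketch in plain Scala (no Spark needed) that mimics LIKE by translating a pattern into a Java regex. The names LikeSketch, likeToRegex and like are hypothetical helpers made up for illustration, not Spark's implementation, and the sketch ignores escape characters.

```scala
import java.util.regex.Pattern

// Hypothetical helper for illustration only; not part of Spark.
object LikeSketch {
  // Translate a LIKE pattern into a Java regex:
  //   _ becomes .   (any one character)
  //   % becomes .*  (any sequence of characters)
  // Every other character, including [ , and ], is quoted so it matches literally.
  def likeToRegex(pattern: String): String =
    pattern.map {
      case '_' => "."
      case '%' => ".*"
      case c   => Pattern.quote(c.toString)
    }.mkString

  // LIKE has to match the whole value, hence matches() rather than find().
  def like(value: String, pattern: String): Boolean =
    Pattern.compile(likeToRegex(pattern), Pattern.DOTALL).matcher(value).matches()
}
```

With this sketch, LikeSketch.like("[4,8]", "[4,8]") is true, LikeSketch.like("8NXDPVAE", "[4,8]NXD_V%") is false because the brackets are taken literally, and LikeSketch.like("8NXDPVAE", "8NXD_V%") is true.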
To match complex patterns you can use the RLIKE operator, which supports Java regular expressions:
spark.sql("SELECT '8NXDPVAE' RLIKE '^[4,8]NXD.V.*$'").show
+-----------------------------+
|8NXDPVAE RLIKE ^[4,8]NXD.V.*$|
+-----------------------------+
|                         true|
+-----------------------------+
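One thing worth checking in that pattern: inside a regex character class the comma is an ordinary member, so [4,8] accepts '4', ',' or '8'. If only 4 or 8 is intended, [48] is probably what you want. A small sketch in plain Scala using java.util.regex directly (RlikeSketch and rlike are hypothetical names, and the sketch assumes RLIKE looks for a match anywhere in the string, which the ^ and $ anchors make equivalent to a full match here):

```scala
import java.util.regex.Pattern

// Hypothetical helper for illustration only; not part of Spark.
object RlikeSketch {
  // Assumption: RLIKE succeeds if the regex is found anywhere in the value.
  def rlike(value: String, regex: String): Boolean =
    Pattern.compile(regex).matcher(value).find()

  val original = "^[4,8]NXD.V.*$" // class contains '4', ',' and '8'
  val stricter = "^[48]NXD.V.*$"  // class contains only '4' and '8'
}
```

Here RlikeSketch.rlike("8NXDPVAE", RlikeSketch.original) is true, but so is RlikeSketch.rlike(",NXDPVAE", RlikeSketch.original); with RlikeSketch.stricter the second call is false.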
Syntax for like in the Spark Scala API (the argument is a LIKE pattern, not a regular expression; use rlike for a regex):
dataframe.filter(col("column_name").like("pattern"))