Read data from remote hive on spark over JDBC returns empty result

Paul Staab answered this issue in the Spark JIRA. Here is the solution:

  1. Create a Hive dialect that uses backticks (the quote character Hive expects) when escaping column names:

    import org.apache.spark.sql.jdbc.JdbcDialect

    object HiveDialect extends JdbcDialect {
      override def canHandle(url: String): Boolean =
        url.startsWith("jdbc:hive2")
      override def quoteIdentifier(colName: String): String =
        s"`$colName`"
    }
  2. Register it before making the call with spark.read.jdbc

    JdbcDialects.registerDialect(HiveDialect)
  3. Execute spark.read.jdbc with the fetchsize option:

    spark.read.jdbc(
        "jdbc:hive2://localhost:10000/default",
        "test1",
        properties={"driver": "org.apache.hive.jdbc.HiveDriver",
                    "fetchsize": "10"}
    ).show()
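To see why step 1 matters: Spark's default JDBC dialect wraps column names in double quotes, but HiveQL parses `"id"` as a string literal rather than a column reference, so the generated SELECT does not return the real column data. Here is a plain-Python sketch of the SQL each quoting style would produce (no Spark needed; the helper function names are illustrative, not part of any Spark API):

```python
def quote_default(col: str) -> str:
    # Mimics the default JdbcDialect: double-quoted identifiers
    return '"' + col + '"'

def quote_hive(col: str) -> str:
    # Mimics the custom HiveDialect from step 1: backtick-quoted identifiers
    return '`' + col + '`'

cols = ["id", "name"]

# Hive reads "id" and "name" as string literals, not columns:
print("SELECT " + ", ".join(quote_default(c) for c in cols) + " FROM test1")
# → SELECT "id", "name" FROM test1

# With backticks, Hive resolves the actual columns:
print("SELECT " + ", ".join(quote_hive(c) for c in cols) + " FROM test1")
# → SELECT `id`, `name` FROM test1
```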


If you build the reader with chained options instead of a properties map, add the fetch size there as well:

     .option("fetchsize", "10")