Connect to SQLite in Apache Spark


There are two options you can try:

Use JDBC directly

  • Open a separate, plain JDBC connection in your Spark job
  • Get the table names from the JDBC metadata
  • Feed these into your for comprehension
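
The steps above could be sketched roughly as follows. This is only a minimal sketch using the standard java.sql API; the object name JdbcTables and the helper listTableNames are hypothetical, and it assumes the SQLite JDBC driver is on the classpath of the Spark job.

```scala
import java.sql.{Connection, DriverManager, ResultSet}
import scala.collection.mutable.ListBuffer

object JdbcTables {
  // Collect the names of all ordinary tables visible through a JDBC URL.
  // Hypothetical helper, not part of Spark or of any JDBC driver.
  def listTableNames(jdbcUrl: String): List[String] = {
    val conn: Connection = DriverManager.getConnection(jdbcUrl)
    try {
      // Arguments: catalog, schema pattern, table name pattern, table types.
      val rs: ResultSet =
        conn.getMetaData.getTables(null, null, "%", Array("TABLE"))
      val names = ListBuffer.empty[String]
      while (rs.next()) {
        names += rs.getString("TABLE_NAME")
      }
      rs.close()
      names.toList
    } finally {
      // Always release the plain JDBC connection; Spark opens its own.
      conn.close()
    }
  }
}
```

With the placeholder path used elsewhere in this answer, the result of JdbcTables.listTableNames("jdbc:sqlite:/x.db") can then be iterated with for (t <- tableNames) { ... }.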

Use a SQL query for the "dbtable" argument

You can specify a query as the value of the dbtable argument. Syntactically, this query must "look" like a table, so it has to be wrapped in an aliased subquery.

In that query, read the metadata from the database:

val df = sqlContext.read.format("jdbc").options(
  Map(
    "url" -> "jdbc:postgresql:xxx",
    "user" -> "x",
    "password" -> "x",
    "dbtable" -> "(select * from pg_tables) as t")).load()

This example works with PostgreSQL; you have to adapt it for SQLite.
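
One possible SQLite adaptation is sketched below. It assumes that the table names can be read from SQLite's built-in sqlite_master schema table, that the SQLite JDBC driver is on the classpath, and that an SQLContext already exists. The object SqliteMetadata and its methods are hypothetical names chosen for illustration; depending on your setup, you may also need to pass a "driver" option.

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}

object SqliteMetadata {
  // Build the JDBC options for reading SQLite's table catalog.
  // The subquery is given an alias ("t") so that it looks like a table.
  def options(dbPath: String): Map[String, String] = Map(
    "url" -> s"jdbc:sqlite:$dbPath",
    "dbtable" -> "(select tbl_name from sqlite_master where type = 'table') as t")

  // Load the table names as a DataFrame with a single tbl_name column.
  def load(sqlContext: SQLContext, dbPath: String): DataFrame =
    sqlContext.read.format("jdbc").options(options(dbPath)).load()
}
```

With the placeholder path /x.db, this would be called as val metaData = SqliteMetadata.load(sqlContext, "/x.db").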

Update

It seems that the JDBC driver only supports iterating over one result set at a time. However, if you first materialize the list of table names using collect(), the following snippet should work:

val myTableNames = metaData.select("tbl_name").map(_.getString(0)).collect()

for (t <- myTableNames) {
  println(t.toString)
  val tableData = sqlContext.read.format("jdbc")
    .options(
      Map(
        "url" -> "jdbc:sqlite:/x.db",
        "dbtable" -> t)).load()
  tableData.show()
}