
Reading data from SQL Server using Spark SQL


In Spark 1.4.0+ you can now use sqlContext.read.jdbc

That will give you a DataFrame instead of an RDD of Row objects.

The equivalent to the solution you posted (the JdbcRDD answer below) would be:

sqlContext.read.jdbc("jdbc:sqlserver://omnimirror;databaseName=moneycorp;integratedSecurity=true;", "TABLE_NAME", "id", 1, 100000, 1000, new java.util.Properties)

It should pick up the schema of the table automatically, but if you'd like to force one, you can call the schema method after read: sqlContext.read.schema(...insert schema here...).jdbc(...rest of the things...)
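For illustration, forcing a schema might look roughly like the sketch below. The column names and types here are hypothetical, just to show the shape of the call, and whether the JDBC source honours a user-supplied schema can depend on the Spark version.

    import org.apache.spark.sql.types._

    // Hypothetical schema; replace the column names and types with your table's.
    val schema = StructType(Seq(
      StructField("id", LongType, nullable = false),
      StructField("Col1", StringType, nullable = true),
      StructField("Col2", StringType, nullable = true)))

    val df = sqlContext.read
      .schema(schema)
      .jdbc("jdbc:sqlserver://omnimirror;databaseName=moneycorp;integratedSecurity=true;",
            "TABLE_NAME",
            new java.util.Properties)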

Note that you won't get an RDD of SomeClass here; instead you'll get a DataFrame of the relevant fields (which is nicer in my view).
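If you do still want strongly typed records, you can map the rows yourself. A minimal sketch, assuming the DataFrame from the read.jdbc call above is bound to a val named df, and that SomeClass is a simple case class over the three string columns used in the JdbcRDD answer below:

    case class SomeClass(col1: String, col2: String, col3: String)

    // Convert each Row of the DataFrame into a SomeClass instance.
    val typedRdd = df.rdd.map { row =>
      SomeClass(
        row.getAs[String]("Col1"),
        row.getAs[String]("Col2"),
        row.getAs[String]("Col3"))
    }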

More information can be found here: http://spark.apache.org/docs/latest/sql-programming-guide.html#jdbc-to-other-databases


Found a solution to this from the mailing list. JdbcRDD can be used to accomplish this. I needed to get the MS SQL Server JDBC driver jar and add it to the lib directory of my project. I wanted to use integrated security, so I needed to put sqljdbc_auth.dll (available in the same download) in a location that java.library.path can see. Then the code looks like this:

    import java.sql.{DriverManager, ResultSet}
    import org.apache.spark.rdd.JdbcRDD

    val rdd = new JdbcRDD[SomeClass](
      sc,
      () => DriverManager.getConnection(
        "jdbc:sqlserver://omnimirror;databaseName=moneycorp;integratedSecurity=true;"),
      // The two ? placeholders are filled with each partition's inclusive lower and
      // upper bound on X, so the comparisons should be <=, not <.
      "SELECT * FROM TABLE_NAME WHERE ? <= X AND X <= ?",
      1,       // lower bound
      100000,  // upper bound
      1000,    // number of partitions
      (r: ResultSet) => SomeClass(r.getString("Col1"), r.getString("Col2"), r.getString("Col3")))

This gives an RDD of SomeClass. The three numeric arguments are required: they are the lower bound, the upper bound, and the number of partitions. In other words, the source data needs to be partitionable by longs for this to work.
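To see why, here is a rough sketch of how those three values are turned into per-partition ranges; this mirrors my understanding of JdbcRDD's splitting logic, where each partition's start and end fill the two ? placeholders in the query:

    // Rough illustration of how (lowerBound, upperBound, numPartitions) becomes
    // per-partition ranges; each (start, end) pair fills the two ? placeholders.
    val lowerBound = 1L
    val upperBound = 100000L
    val numPartitions = 1000

    val length = upperBound - lowerBound + 1
    val ranges = (0 until numPartitions).map { i =>
      val start = lowerBound + (i * length) / numPartitions
      val end   = lowerBound + ((i + 1) * length) / numPartitions - 1
      (start, end)
    }
    // ranges.take(2) is Vector((1,100), (101,200)); each partition then runs
    // SELECT * FROM TABLE_NAME WHERE start <= X AND X <= end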