How to create sql alchemy connection for pandas read_sql with sqlalchemy+pyodbc and multiple databases in MS SQL Server? How to create sql alchemy connection for pandas read_sql with sqlalchemy+pyodbc and multiple databases in MS SQL Server? pandas pandas

How to create sql alchemy connection for pandas read_sql with sqlalchemy+pyodbc and multiple databases in MS SQL Server?


So I have found a workaround: use pymssql instead of pyodbc (both in the import statement and in the engine). It lets you build your joins using database names and without specifying them in the engine. And there is no need to specify a driver in this case.

There might be a problem if you are using Python 3.6 which is not supported by pymssql oficially yet, but you can find unofficial wheels for your Python 3.6 here. It works as is supposed to with my queries.

Here is the original code with joins, rebuilt to work with pymssql:

import pandas as pdimport sqlalchemy as sqlimport pymssqlserver = '100.10.10.10'myQuery = '''SELECT first.Field1, second.Field2           FROM db1.schema.Table1 AS first           JOIN db2.schema.Table2 AS second           ON first.Id = second.FirstId'''engine = sql.create_engine('mssql+pymssql://{}'.format(server))df = pd.read_sql_query(myQuery, engine)

As for the unofficial wheels, you need to download the file for Python 3.6 from the link I gave above, then cd to the download folder and run pip install wheels where 'wheels' is the name of the wheels file.

UPDATE:

Actually, it is possible to use pyodbc too. I am not sure if this should work for any SQL Server setup, but everything worked for me after I had set 'master' as my database in the engine. The resulting code would look like this:

import pandas as pdimport sqlalchemy as sqlimport pyodbcserver = '100.10.10.10'driver = 'SQL+Server'db = 'master'myQuery = '''SELECT first.Field1, second.Field2           FROM db1.schema.Table1 AS first           JOIN db2.schema.Table2 AS second           ON first.Id = second.FirstId'''engine = sql.create_engine('mssql+pyodbc://{}/{}?driver={}'.format(server, db, driver))df = pd.read_sql_query(myQuery, engine)