Parallelizing pandas pyodbc SQL database calls

python sql multithreading pandas pyodbc

Yes, this should work, with the caveat that you'll need to change parallel_connection.py from the talk that you cite. That code has a fetchall function which fetches from each of the cursors in parallel, then combines the results. This is the core of what you'll change:

Old Code:

def fetchall(self):
    results = [None] * len(self.cursors)
    def do_work(index, cursor):
        results[index] = cursor.fetchall()
    self._do_parallel(do_work)
    return list(chain(*[rs for rs in results]))

New Code:

# Requires: import pandas as pd
def fetchall(self):
    results = [None] * len(self.sql_connections)
    def do_work(index, sql_connection):
        sql, conn = sql_connection  # Store tuple of sql/conn instead of cursor
        results[index] = pd.read_sql(sql, conn)
    self._do_parallel(do_work)
    # DataFrame.append was removed in pandas 2.0; pd.concat stacks the frames instead
    return pd.concat(results)

Repo: https://github.com/godatadriven/ParallelConnection
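
If you'd rather not patch that class, the same idea can be sketched with the standard library's thread pool. This is only an illustration, not the code from the talk: sqlite3 stands in for pyodbc so the example runs anywhere, and the names read_one, fetchall_parallel and make_demo_db are hypothetical helpers made up for this sketch. Each worker opens its own connection, runs pd.read_sql, and the frames are stacked at the end.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

import pandas as pd


def read_one(job):
    """Run one query on its own connection and return a DataFrame."""
    sql, db_path = job
    # Assumption: sqlite3 is a stand-in here; with pyodbc you would
    # open the connection with pyodbc.connect(...) instead.
    conn = sqlite3.connect(db_path)
    try:
        return pd.read_sql(sql, conn)
    finally:
        conn.close()


def fetchall_parallel(jobs, max_workers=4):
    """Run every (sql, db_path) job in a thread pool and stack the frames."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the jobs
        frames = list(pool.map(read_one, jobs))
    return pd.concat(frames, ignore_index=True)


def make_demo_db(path, rows):
    """Create a tiny demo table so the sketch is self-contained."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    conn.commit()
    conn.close()


tmpdir = tempfile.mkdtemp()
db_a = os.path.join(tmpdir, "a.db")
db_b = os.path.join(tmpdir, "b.db")
make_demo_db(db_a, [(1, "x"), (2, "y")])
make_demo_db(db_b, [(3, "z")])

combined = fetchall_parallel([
    ("SELECT id, name FROM t ORDER BY id", db_a),
    ("SELECT id, name FROM t ORDER BY id", db_b),
])
print(combined)
```

Opening the connection inside the worker keeps each connection on a single thread, which avoids sharing one connection object across threads.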
