
How to write DataFrame to postgres table?


Starting with pandas 0.14 (released at the end of May 2014), PostgreSQL is supported. The sql module now uses SQLAlchemy to support different database flavors, so you can pass a SQLAlchemy engine for a PostgreSQL database (see the docs). For example:

    from sqlalchemy import create_engine

    engine = create_engine('postgresql://username:password@localhost:5432/mydatabase')
    df.to_sql('table_name', engine)
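
One practical wrinkle with the connection URL above: if the username or password contains characters such as `@`, `/` or `:`, they need to be percent-encoded before going into the URL. A minimal sketch using only the standard library follows; the helper name `make_pg_url` and the sample credentials are hypothetical, made up for illustration.

```python
from urllib.parse import quote_plus


def make_pg_url(user, password, host="localhost", port=5432, database="mydatabase"):
    """Build a postgresql:// URL, percent-encoding the credentials."""
    return "postgresql://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, database
    )


# Hypothetical credentials; the '@' and '/' in the password get encoded.
url = make_pg_url("username", "p@ss/word")
print(url)
```

The resulting string can then be passed to create_engine in place of the hand-written URL.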

You are correct that PostgreSQL was not supported in pandas up to version 0.13.1. If you need to use an older version of pandas, here is a patched version of pandas.io.sql: https://gist.github.com/jorisvandenbossche/10841234.
I wrote this some time ago, so I cannot fully guarantee that it always works, but the basis should be there. If you put that file in your working directory and import it, you should be able to do the following (where con is a PostgreSQL connection):

    import sql  # the patched version (file is named sql.py)

    sql.write_frame(df, 'table_name', con, flavor='postgresql')


Faster option:

The following code copies a pandas DataFrame (df) to a Postgres database much faster than the df.to_sql method, and it does not need an intermediate CSV file on disk to store the df.

Create an engine based on your DB specifications.

Create a table in your Postgres DB that has the same number of columns as the DataFrame (df).

The data in the df is then inserted into that Postgres table.

    from sqlalchemy import create_engine
    import psycopg2
    import io

If you want to replace the table, you can recreate it with the normal to_sql method using only the headers from the df, and then load the entire large, time-consuming df into the DB:

    engine = create_engine('postgresql+psycopg2://username:password@host:port/database')

    # Drop the old table and create a new, empty one from the column headers
    df.head(0).to_sql('table_name', engine, if_exists='replace', index=False)

    conn = engine.raw_connection()
    cur = conn.cursor()

    # Write the DataFrame to an in-memory, tab-separated buffer
    output = io.StringIO()
    df.to_csv(output, sep='\t', header=False, index=False)
    output.seek(0)

    # Bulk-load the buffer; empty fields are read as NULL
    cur.copy_from(output, 'table_name', null="")
    conn.commit()
    cur.close()
    conn.close()
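
One caveat with the snippet above: copy_from reads PostgreSQL's text COPY format rather than CSV, so the quoting that to_csv applies to fields containing tabs or newlines is not understood on the other side. A minimal sketch of a text-format serializer is below, standard library only. The helper names `copy_escape` and `rows_to_copy_buffer` are hypothetical, and the sketch assumes the default `\N` null marker rather than the empty string used above.

```python
import io

# Backslash escapes used by PostgreSQL's text COPY format.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}


def copy_escape(value):
    """Render one value as a field in COPY text format."""
    if value is None:
        return "\\N"  # default NULL marker for text-format COPY
    return "".join(_ESCAPES.get(ch, ch) for ch in str(value))


def rows_to_copy_buffer(rows):
    """Serialize an iterable of row tuples into a rewound StringIO."""
    buf = io.StringIO()
    for row in rows:
        buf.write("\t".join(copy_escape(v) for v in row))
        buf.write("\n")
    buf.seek(0)
    return buf


# Sample rows: one field with an embedded tab, one missing value.
buf = rows_to_copy_buffer([(1, "a\tb"), (2, None)])
print(repr(buf.getvalue()))
```

With a DataFrame, the rows could come from df.itertuples(index=False, name=None), and the buffer could be handed to cur.copy_from(buf, 'table_name'), leaving the null argument at its default. NaN values would need to be converted to None first, since this sketch only treats None as missing.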


Pandas 0.24.0+ solution

pandas 0.24.0 introduced the method parameter of to_sql, which accepts a custom insertion function and so allows fast COPY-based writes to Postgres. You can learn more about it here: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-sql-method

    import csv
    from io import StringIO

    from sqlalchemy import create_engine


    def psql_insert_copy(table, conn, keys, data_iter):
        # gets a DBAPI connection that can provide a cursor
        dbapi_conn = conn.connection
        with dbapi_conn.cursor() as cur:
            s_buf = StringIO()
            writer = csv.writer(s_buf)
            writer.writerows(data_iter)
            s_buf.seek(0)

            columns = ', '.join('"{}"'.format(k) for k in keys)
            if table.schema:
                table_name = '{}.{}'.format(table.schema, table.name)
            else:
                table_name = table.name

            sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format(
                table_name, columns)
            cur.copy_expert(sql=sql, file=s_buf)


    engine = create_engine('postgresql://myusername:mypassword@myhost:5432/mydatabase')
    df.to_sql('table_name', engine, method=psql_insert_copy)
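
The two pure steps of the callable above, building the CSV buffer and building the COPY statement, can be pulled out and checked without a database. This is only a sketch: the helper names `rows_to_csv_buffer`, `quote_ident` and `build_copy_sql` are hypothetical, and quoting the table and schema names is an addition here that the function above does not do.

```python
import csv
from io import StringIO


def rows_to_csv_buffer(data_iter):
    """Write rows to an in-memory CSV buffer and rewind it."""
    s_buf = StringIO()
    csv.writer(s_buf).writerows(data_iter)
    s_buf.seek(0)
    return s_buf


def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"{}"'.format(name.replace('"', '""'))


def build_copy_sql(table_name, keys, schema=None):
    """Build the COPY ... FROM STDIN statement for the given columns."""
    columns = ", ".join(quote_ident(k) for k in keys)
    target = quote_ident(table_name)
    if schema:
        target = "{}.{}".format(quote_ident(schema), target)
    return "COPY {} ({}) FROM STDIN WITH CSV".format(target, columns)


# Hypothetical table, columns and rows, for illustration only.
print(build_copy_sql("table_name", ["id", "name"], schema="public"))
print(rows_to_csv_buffer([(1, "a,b"), (2, None)]).getvalue())
```

The csv module writes None as an empty field, and in CSV mode COPY reads an unquoted empty field as NULL by default, which is why missing values survive the round trip.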