Unpacking a SQL SELECT into a pandas DataFrame
You can pass the rows fetched from a cursor to the DataFrame constructor. For Postgres:

```python
import psycopg2
import pandas as pd

conn = psycopg2.connect("dbname='db' user='user' host='host' password='pass'")
cur = conn.cursor()
cur.execute("select instrument, price, date from my_prices")
df = pd.DataFrame(cur.fetchall(), columns=['instrument', 'price', 'date'])
```
Then set the index (note that `set_index` returns a new DataFrame, so assign it back):

```python
df = df.set_index('date', drop=False)
```
or assign it directly:

```python
df.index = df['date']
```
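For instance, with a toy DataFrame standing in for the query result (the values here are illustrative, not from any real database):

```python
import pandas as pd

# toy rows standing in for cur.fetchall() — values are made up for illustration
rows = [('AAPL', 100.0, '2013-01-01'), ('MSFT', 27.0, '2013-01-01')]
df = pd.DataFrame(rows, columns=['instrument', 'price', 'date'])

# set_index returns a new DataFrame; assign it back.
# drop=False keeps 'date' as a regular column as well as the index.
df = df.set_index('date', drop=False)
```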
Update: recent versions of pandas provide the functions `read_sql_table` and `read_sql_query`.
First create a database engine (a connection can also work here):

```python
from sqlalchemy import create_engine

# see the SQLAlchemy docs for how to write this URL for your database type:
engine = create_engine('mysql://scott:tiger@localhost/foo')
```
pandas.read_sql_table

```python
table_name = 'my_prices'
df = pd.read_sql_table(table_name, engine)
```
pandas.read_sql_query

```python
df = pd.read_sql_query("SELECT instrument, price, date FROM my_prices;", engine)
```
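As a self-contained sketch of `read_sql_query`, the example below uses the standard-library `sqlite3` module in place of a real database server (the `my_prices` table and its contents mirror the answer's example but are made up here). `read_sql_query` also accepts a plain DBAPI connection, and the `params` argument keeps the query injection-safe:

```python
import sqlite3
import pandas as pd

# in-memory SQLite stands in for a real database; data is illustrative
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE my_prices (instrument TEXT, price REAL, date TEXT)")
conn.executemany("INSERT INTO my_prices VALUES (?, ?, ?)",
                 [('AAPL', 100.0, '2013-01-01'), ('MSFT', 27.0, '2013-01-02')])

# pass bind parameters via params rather than string formatting
df = pd.read_sql_query(
    "SELECT instrument, price, date FROM my_prices WHERE price > ?",
    conn, params=(50.0,))
```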
The old answer referenced `read_frame`, which has been deprecated (see the version history of this question for that answer).
It often makes sense to read first and then transform the data to your requirements (transformations are usually efficient and readable in pandas). In your example, you can pivot the result:

```python
df.reset_index().pivot(index='date', columns='instrument', values='price')
```
Note: you can omit the `reset_index` if you don't specify an `index_col` in `read_sql_query`.
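The pivot step can be sketched on illustrative data (the prices below are invented; the column names match the answer's query):

```python
import pandas as pd

# long-format data as a query would return it — values are illustrative
df = pd.DataFrame({
    'date': ['2013-01-01', '2013-01-01', '2013-01-02', '2013-01-02'],
    'instrument': ['AAPL', 'MSFT', 'AAPL', 'MSFT'],
    'price': [100.0, 27.0, 101.0, 28.0],
})

# reshape to one row per date and one column per instrument
wide = df.pivot(index='date', columns='instrument', values='price')
```

Using keyword arguments (`index=`, `columns=`, `values=`) works across pandas versions, whereas positional arguments to `pivot` were removed in pandas 2.0.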
This connects pandas to a remote PostgreSQL database:

```python
# connect to Postgres using pandas
import psycopg2 as pg
import pandas.io.sql as psql
```
Establish the connection to the Postgres database:

```python
connection = pg.connect("host=192.168.0.1 dbname=db user=postgres")
```
Then read the table from the Postgres database:

```python
dataframe = psql.read_sql("SELECT * FROM DB.Table", connection)
```