Getting Google Spreadsheet CSV into A Pandas Dataframe Getting Google Spreadsheet CSV into A Pandas Dataframe python python

Getting Google Spreadsheet CSV into A Pandas Dataframe


Seems to work for me without the StringIO:

test = pd.read_csv('https://docs.google.com/spreadsheets/d/' +                    '0Ak1ecr7i0wotdGJmTURJRnZLYlV3M2daNTRubTdwTXc' +                   '/export?gid=0&format=csv',                   # Set first column as rownames in data frame                   index_col=0,                   # Parse column values to datetime                   parse_dates=['Quradate']                  )test.head(5)  # Same result as @TomAugspurger

BTW, including the ?gid= enables importing different sheets, find the gid in the URL.


You can use read_csv() on a StringIO object:

from io import BytesIOimport requestsimport pandas as pdr = requests.get('https://docs.google.com/spreadsheet/ccc?key=0Ak1ecr7i0wotdGJmTURJRnZLYlV3M2daNTRubTdwTXc&output=csv')data = r.content    In [10]: df = pd.read_csv(BytesIO(data), index_col=0,parse_dates=['Quradate'])In [11]: df.head()Out[11]:           City                                            region     Res_Comm  \0       Dothan  South_Central-Montgomery-Auburn-Wiregrass-Dothan  Residential   10       Foley                              South_Mobile-Baldwin  Residential   12  Birmingham      North_Central-Birmingham-Tuscaloosa-Anniston   Commercial   38       Brent      North_Central-Birmingham-Tuscaloosa-Anniston  Residential   44      Athens                 North_Huntsville-Decatur-Florence  Residential             mkt_type            Quradate  National_exp  Alabama_exp  Sales_exp  \0            Rural 2010-01-15 00:00:00             2            2          3   10  Suburban_Urban 2010-01-15 00:00:00             4            4          4   12  Suburban_Urban 2010-01-15 00:00:00             2            2          3   38           Rural 2010-01-15 00:00:00             3            3          3   44  Suburban_Urban 2010-01-15 00:00:00             4            5          4       Inventory_exp  Price_exp  Credit_exp  0               2          3           3  10              4          4           3  12              2          2           3  38              3          3           2  44              4          4           4  


Open the specific sheet you want in your browser. Make sure it's at least viewable by anyone with the link. Copy and paste the URL. You'll get something like https://docs.google.com/spreadsheets/d/BLAHBLAHBLAH/edit#gid=NUMBER.

sheet_url = 'https://docs.google.com/spreadsheets/d/BLAHBLAHBLAH/edit#gid=NUMBER'

First we turn that into a CSV export URL, like https://docs.google.com/spreadsheets/d/BLAHBLAHBLAH/export?format=csv&gid=NUMBER:

csv_export_url = sheet_url.replace('/edit#gid=', '/export?format=csv&gid=')

Then we pass it to pd.read_csv, which can take a URL.

df = pd.read_csv(csv_export_url)

This will break if Google changes its API (it seems undocumented), and may give unhelpful errors if a network failure occurs.