Reading an Excel file in python using pandas Reading an Excel file in python using pandas pandas pandas

Reading an Excel file in python using pandas


Close: first you call ExcelFile, but then you call the .parse method and pass it the sheet name.

>>> xl = pd.ExcelFile("dummydata.xlsx")>>> xl.sheet_names[u'Sheet1', u'Sheet2', u'Sheet3']>>> df = xl.parse("Sheet1")>>> df.head()                  Tid  dummy1    dummy2    dummy3    dummy4    dummy5  \0 2006-09-01 00:00:00       0  5.894611  0.605211  3.842871  8.265307   1 2006-09-01 01:00:00       0  5.712107  0.605211  3.416617  8.301360   2 2006-09-01 02:00:00       0  5.105300  0.605211  3.090865  8.335395   3 2006-09-01 03:00:00       0  4.098209  0.605211  3.198452  8.170187   4 2006-09-01 04:00:00       0  3.338196  0.605211  2.970015  7.765058        dummy6  dummy7    dummy8    dummy9  0  0.623354       0  2.579108  2.681728  1  0.554211       0  7.210000  3.028614  2  0.567841       0  6.940000  3.644147  3  0.581470       0  6.630000  4.016155  4  0.595100       0  6.350000  3.974442  

What you're doing is calling the method which lives on the class itself, rather than the instance, which is okay (although not very idiomatic), but if you're doing that you would also need to pass the sheet name:

>>> parsed = pd.io.parsers.ExcelFile.parse(xl, "Sheet1")>>> parsed.columnsIndex([u'Tid', u'dummy1', u'dummy2', u'dummy3', u'dummy4', u'dummy5', u'dummy6', u'dummy7', u'dummy8', u'dummy9'], dtype=object)


This is much simple and easy way.

import pandasdf = pandas.read_excel(open('your_xls_xlsx_filename','rb'), sheetname='Sheet 1')# or using sheet index starting 0df = pandas.read_excel(open('your_xls_xlsx_filename','rb'), sheetname=2)

check out documentation full detailshttp://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.read_excel.html

FutureWarning: The sheetname keyword is deprecated for newer Pandas versions, use sheet_name instead.


Thought i should add here, that if you want to access rows or columns to loop through them, you do this:

import pandas as pd# open the filexlsx = pd.ExcelFile("PATH\FileName.xlsx")# get the first sheet as an objectsheet1 = xlsx.parse(0)    # get the first column as a list you can loop through# where the is 0 in the code below change to the row or column number you want    column = sheet1.icol(0).real# get the first row as a list you can loop throughrow = sheet1.irow(0).real

Edit:

The methods icol(i) and irow(i) are deprecated now. You can use sheet1.iloc[:,i] to get the i-th col and sheet1.iloc[i,:] to get the i-th row.