Using Pandas to pd.read_excel() for multiple worksheets of the same workbook Using Pandas to pd.read_excel() for multiple worksheets of the same workbook python python

Using Pandas to pd.read_excel() for multiple worksheets of the same workbook


Try pd.ExcelFile:

xls = pd.ExcelFile('path_to_file.xls')df1 = pd.read_excel(xls, 'Sheet1')df2 = pd.read_excel(xls, 'Sheet2')

As noted by @HaPsantran, the entire Excel file is read in during the ExcelFile() call (there doesn't appear to be a way around this). This merely saves you from having to read the same file in each time you want to access a new sheet.

Note that the sheet_name argument to pd.read_excel() can be the name of the sheet (as above), an integer specifying the sheet number (eg 0, 1, etc), a list of sheet names or indices, or None. If a list is provided, it returns a dictionary where the keys are the sheet names/indices and the values are the data frames. The default is to simply return the first sheet (ie, sheet_name=0).

If None is specified, all sheets are returned, as a {sheet_name:dataframe} dictionary.


There are a few options:

Read all sheets directly into an ordered dictionary.

import pandas as pd# for pandas version >= 0.21.0sheet_to_df_map = pd.read_excel(file_name, sheet_name=None)# for pandas version < 0.21.0sheet_to_df_map = pd.read_excel(file_name, sheetname=None)

Read the first sheet directly into dataframe

df = pd.read_excel('excel_file_path.xls')# this will read the first sheet into df

Read the excel file and get a list of sheets. Then chose and load the sheets.

xls = pd.ExcelFile('excel_file_path.xls')# Now you can list all sheets in the filexls.sheet_names# ['house', 'house_extra', ...]# to read just one sheet to dataframe:df = pd.read_excel(file_name, sheetname="house")

Read all sheets and store it in a dictionary. Same as first but more explicit.

# to read all sheets to a mapsheet_to_df_map = {}for sheet_name in xls.sheet_names:    sheet_to_df_map[sheet_name] = xls.parse(sheet_name)    # you can also use sheet_index [0,1,2..] instead of sheet name.

Thanks @ihightower for pointing it out way to read all sheets and @toto_tico for pointing out the version issue.

sheetname : string, int, mixed list of strings/ints, or None, default 0Deprecated since version 0.21.0: Use sheet_name instead Source Link


You can also use the index for the sheet:

xls = pd.ExcelFile('path_to_file.xls')sheet1 = xls.parse(0)

will give the first worksheet. for the second worksheet:

sheet2 = xls.parse(1)