How to write to an existing excel file without overwriting data (using pandas)?

python excel python-2.7 pandas

The pandas docs say it uses openpyxl for xlsx files. A quick look through the code in ExcelWriter suggests that something like this might work:

    import pandas
    from openpyxl import load_workbook

    book = load_workbook('Masterfile.xlsx')
    writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl')
    writer.book = book

    ## ExcelWriter for some reason uses writer.sheets to access the sheet.
    ## If you leave it empty it will not know that sheet Main is already there
    ## and will create a new sheet.
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

    ## `columns` is the current name of the keyword that very old pandas called `cols`.
    data_filtered.to_excel(writer, "Main", columns=['Diff1', 'Diff2'])

    writer.save()

UPDATE: Starting with Pandas 1.3.0, the function below no longer works properly. DataFrame.to_excel() and pd.ExcelWriter() were changed, and the newly introduced if_sheet_exists parameter invalidates it.

Here you can find an updated version of append_df_to_excel(), which works with Pandas 1.3.0+.
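For readers on a newer pandas, here is a minimal sketch of how appending might look with the if_sheet_exists parameter mentioned above. It is not the updated append_df_to_excel() referred to in this answer. The helper name append_rows is hypothetical, and the sketch assumes pandas 1.4 or later (where if_sheet_exists accepts 'overlay') with openpyxl installed.

```python
import os

import pandas as pd


def append_rows(path, df, sheet_name="Sheet1"):
    """Append df below the existing rows of sheet_name (hypothetical helper).

    Assumes pandas >= 1.4 and openpyxl; creates the file if it is missing.
    """
    # No file yet: write a fresh workbook with a header row and stop.
    if not os.path.isfile(path):
        df.to_excel(path, sheet_name=sheet_name, index=False, engine="openpyxl")
        return

    # mode="a" loads the existing workbook; "overlay" writes into the
    # existing sheet instead of raising an error or replacing it.
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as writer:
        if sheet_name in writer.book.sheetnames:
            # max_row is 1-based, so it is also the 0-based index of the
            # first free row; skip the header because one is already there.
            startrow = writer.book[sheet_name].max_row
            header = False
        else:
            startrow = 0
            header = True
        df.to_excel(writer, sheet_name=sheet_name, startrow=startrow,
                    header=header, index=False)
```

Calling append_rows('Masterfile.xlsx', data_filtered, sheet_name='Main') twice should leave the first batch of rows in place and put the second batch below it. Other sheets in the workbook are kept because the writer is opened in append mode.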

Here is a helper function:

    import os

    import pandas as pd
    from openpyxl import load_workbook


    def append_df_to_excel(filename, df, sheet_name='Sheet1', startrow=None,
                           truncate_sheet=False,
                           **to_excel_kwargs):
        """
        Append a DataFrame [df] to existing Excel file [filename]
        into [sheet_name] Sheet.
        If [filename] doesn't exist, then this function will create it.

        @param filename: File path or existing ExcelWriter
                         (Example: '/path/to/file.xlsx')
        @param df: DataFrame to save to workbook
        @param sheet_name: Name of sheet which will contain DataFrame.
                           (default: 'Sheet1')
        @param startrow: upper left cell row to dump data frame.
                         Per default (startrow=None) calculate the last row
                         in the existing DF and write to the next row...
        @param truncate_sheet: truncate (remove and recreate) [sheet_name]
                               before writing DataFrame to Excel file
        @param to_excel_kwargs: arguments which will be passed to `DataFrame.to_excel()`
                                [can be a dictionary]
        @return: None

        Usage examples:

        >>> append_df_to_excel('d:/temp/test.xlsx', df)

        >>> append_df_to_excel('d:/temp/test.xlsx', df, header=None, index=False)

        >>> append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2',
                               index=False)

        >>> append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2',
                               index=False, startrow=25)

        (c) [MaxU](https://stackoverflow.com/users/5741205/maxu?tab=profile)
        """
        # Excel file doesn't exist - saving and exiting
        if not os.path.isfile(filename):
            df.to_excel(
                filename,
                sheet_name=sheet_name,
                startrow=startrow if startrow is not None else 0,
                **to_excel_kwargs)
            return

        # ignore [engine] parameter if it was passed
        if 'engine' in to_excel_kwargs:
            to_excel_kwargs.pop('engine')

        writer = pd.ExcelWriter(filename, engine='openpyxl', mode='a')

        # try to open an existing workbook
        writer.book = load_workbook(filename)

        # get the last row in the existing Excel sheet
        # if it was not specified explicitly
        if startrow is None and sheet_name in writer.book.sheetnames:
            startrow = writer.book[sheet_name].max_row

        # truncate sheet
        if truncate_sheet and sheet_name in writer.book.sheetnames:
            # index of [sheet_name] sheet
            idx = writer.book.sheetnames.index(sheet_name)
            # remove [sheet_name]
            writer.book.remove(writer.book.worksheets[idx])
            # create an empty sheet [sheet_name] using old index
            writer.book.create_sheet(sheet_name, idx)

        # copy existing sheets
        writer.sheets = {ws.title: ws for ws in writer.book.worksheets}

        if startrow is None:
            startrow = 0

        # write out the new sheet
        df.to_excel(writer, sheet_name, startrow=startrow, **to_excel_kwargs)

        # save the workbook
        writer.save()

Tested with the following versions:

Pandas 1.2.3
Openpyxl 3.0.5
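Since the helper was tested on Pandas 1.2.3 and is reported to break from 1.3.0 onward, it may help to check the installed version before choosing an approach. The following is only a sketch. The function names are hypothetical, and the version string is parsed by hand so that no extra packages are assumed.

```python
def parse_major_minor(version):
    """Return the (major, minor) numbers at the start of a version string."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def supports_if_sheet_exists(pandas_version):
    """True if this pandas version is 1.3 or newer (hypothetical helper).

    The 1.3.0 threshold comes from the UPDATE note in this answer.
    """
    return parse_major_minor(pandas_version) >= (1, 3)
```

In practice one would pass pd.__version__ after import pandas as pd. A False result suggests the helper above should still apply, while True points to the if_sheet_exists route.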

With openpyxl version 2.4.0 and pandas version 0.19.2, the process @ski came up with gets a bit simpler:

    import pandas
    from openpyxl import load_workbook

    with pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') as writer:
        writer.book = load_workbook('Masterfile.xlsx')
        # `columns` replaces the older `cols` keyword
        data_filtered.to_excel(writer, "Main", columns=['Diff1', 'Diff2'])
    #That's it!
