Filling empty python dataframe using loops


    import pandas as pd

    years = [2013, 2014, 2015]
    dn = []
    for year in years:
        df1 = pd.DataFrame({'Incidents': ['C', 'B', 'A'],
                            year: [1, 1, 1],
                            }).set_index('Incidents')
        dn.append(df1)
    dn = pd.concat(dn, axis=1)
    print(dn)

yields

               2013  2014  2015
    Incidents
    C             1     1     1
    B             1     1     1
    A             1     1     1

Note that calling pd.concat once outside the loop is more time-efficient than calling pd.concat with each iteration of the loop.

Each time you call pd.concat, new space is allocated for a new DataFrame, and all the data from each component DataFrame is copied into the new DataFrame. If you call pd.concat from within the for-loop, then you end up doing on the order of n**2 copies, where n is the number of years.

If you accumulate the partial DataFrames in a list and call pd.concat once outside the loop, then Pandas only needs to perform n copies to make dn.
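To make the copy-count argument concrete, here is a minimal sketch that builds the same result both ways and tallies how many columns each pattern asks pd.concat to copy. The helper make_frame and the counters copied_in_loop and copied_once are hypothetical names introduced for illustration, and the tally is a back-of-the-envelope count, not a measurement of what Pandas does internally.

```python
import pandas as pd

years = [2013, 2014, 2015]

def make_frame(year):
    # Hypothetical helper: one single-column frame per year,
    # shaped like the frames in the example above.
    return pd.DataFrame({'Incidents': ['C', 'B', 'A'],
                         year: [1, 1, 1]}).set_index('Incidents')

# Pattern 1: concat inside the loop. Each call copies every column
# accumulated so far plus the new one, so the tally grows as 2 + 3 + ... + n.
slow = make_frame(years[0])
copied_in_loop = 0
for k, year in enumerate(years[1:], start=2):
    slow = pd.concat([slow, make_frame(year)], axis=1)
    copied_in_loop += k  # k columns land in the new DataFrame

# Pattern 2: collect the pieces in a list and concat once.
# Each column is copied a single time.
fast = pd.concat([make_frame(year) for year in years], axis=1)
copied_once = len(years)

print(slow.equals(fast))
print(copied_in_loop, copied_once)
```

With three years the difference is small, but the in-loop tally grows quadratically with the number of pieces while the single concat grows linearly.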


As far as I know, you should avoid adding rows to the DataFrame one at a time, because it is slow.

What I usually do is:

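    import pandas as pd

    n = 5  # number of rows; example value
    l1 = []
    l2 = []
    for i in range(n):
        v1 = i        # placeholder: compute value v1
        v2 = i ** 2   # placeholder: compute value v2
        l1.append(v1)
        l2.append(v2)

    # Build the DataFrame once, after the loop has finished
    d = pd.DataFrame()
    d['l1'] = l1
    d['l2'] = l2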
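A variant of the same idea, sketched under assumptions: instead of keeping one list per column, collect one dict per iteration and pass the whole list to pd.DataFrame in a single call. The value of n and the expressions used for v1 and v2 are placeholders for whatever the loop actually computes.

```python
import pandas as pd

n = 4  # hypothetical number of rows

rows = []
for i in range(n):
    v1 = i * 2    # stand-in for "compute value v1"
    v2 = i ** 2   # stand-in for "compute value v2"
    rows.append({'l1': v1, 'l2': v2})  # one dict per row

# A single constructor call turns the list of dicts into a DataFrame;
# the dict keys become the column names.
d = pd.DataFrame(rows)
print(d)
```

This keeps each row's values together, which can be easier to read when there are many columns, and it still creates the DataFrame only once.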