
Concatenate pandas DataFrames generated with a loop


pd.concat takes a list of DataFrames. If your loop can build up a list of DataFrames, you can concatenate the whole list in one call once the loop has finished:

    import pandas as pd

    data_day_list = []
    for day in list_day:
        data_day = df[df.day == day]    # rows for this day only
        data_day_list.append(data_day)
    final_data_day = pd.concat(data_day_list)    # single concat after the loop
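To make the list-then-concat pattern concrete, here is a minimal self-contained sketch. The sample `df`, the values in `list_day`, and the name `frames` are made up for illustration and only stand in for the objects in the question.

```python
import pandas as pd

# Hypothetical sample data; the real `df` and `list_day` come from the question.
df = pd.DataFrame({
    "day": ["mon", "tue", "wed", "mon"],
    "value": [10, 20, 30, 40],
})
list_day = ["mon", "wed"]

frames = []
for day in list_day:
    # Collect each per-day slice; nothing is concatenated inside the loop.
    frames.append(df[df["day"] == day])

# One concat at the end; ignore_index=True renumbers the rows from 0.
final_data_day = pd.concat(frames, ignore_index=True)
print(final_data_day)
```

Passing `ignore_index=True` is optional. Without it, each row keeps the index label it had in `df`.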


Exhausting a generator is more elegant (if not more efficient) than appending to a list. For example:

    import pandas as pd

    def yielder(df, list_day):
        # Yield one filtered DataFrame per day instead of building a list by hand.
        for day in list_day:
            yield df[df['day'] == day]

    final_data_day = pd.concat(list(yielder(df, list_day)))
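The generator idea can also be written without a helper function. As far as I know, `pd.concat` accepts any iterable of DataFrames, so a generator expression can be passed to it directly. The sketch below uses made-up sample data.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's `df` and `list_day`.
df = pd.DataFrame({
    "day": ["mon", "tue", "wed", "mon"],
    "value": [10, 20, 30, 40],
})
list_day = ["tue", "mon"]

# The generator expression is consumed by pd.concat; no intermediate list is written out.
final_data_day = pd.concat(df[df["day"] == day] for day in list_day)
print(final_data_day)
```

The pieces appear in the order of `list_day`, and the original index labels are preserved.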


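Appending or concatenating pd.DataFrames repeatedly inside a loop is slow, because each call copies the data accumulated so far. You can collect the rows in a plain list in the interim and then create the final pd.DataFrame once at the end with pd.DataFrame.from_records(), e.g.: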

    import pandas as pd

    interim_list = []
    for i, (k, g) in enumerate(df.groupby(['*name of your date column here*'])):
        if i % 1000 == 0 and i != 0:
            print('iteration: {}'.format(i))  # just tells you where you are in iteration
        # add your "new features" here...
        for v in g.values:
            interim_list.append(v)

    # here you want to specify the resulting df's column list...
    df_final = pd.DataFrame.from_records(interim_list, columns=['a', 'list', 'of', 'columns'])
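A small runnable variant of the same idea, under assumptions: the sample data, the column names, and the `day_total` feature are invented for illustration. It collects plain tuples with `itertuples` rather than rows from `g.values`, which is a substitution on my part and not the answer's exact method.

```python
import pandas as pd

# Hypothetical sample data: two readings per day.
df = pd.DataFrame({
    "day": ["mon", "mon", "tue", "tue"],
    "value": [1, 2, 3, 4],
})

rows = []
for day, group in df.groupby("day"):
    # Example "new feature": the per-day sum of `value`.
    day_total = group["value"].sum()
    for d, v in group.itertuples(index=False, name=None):
        # Append plain tuples; no DataFrame is built inside the loop.
        rows.append((d, v, day_total))

# Build the DataFrame once, naming the columns explicitly.
df_final = pd.DataFrame.from_records(rows, columns=["day", "value", "day_total"])
print(df_final)
```

The column list passed to `from_records` has to match the order of the fields in each tuple.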