Filter a data-frame and add a new column according to the given condition


Today's edition of Over Engineered with Numpy

Though admittedly very little obvious Numpy

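    import numpy as np
    import pandas as pd

    # Row key: one code per (ID, col1) pair. replace('None') with no value
    # forward-fills the placeholder rows (older pandas behavior).
    i, rows = pd.factorize([*zip(df.ID, df.col1.replace('None'))])
    # Column key: position of each row within its (ID, col1) group.
    k, cols = pd.factorize(df.groupby(i).cumcount())
    dleft = pd.DataFrame(dict(zip(['ID', 'col1'], zip(*rows))))
    drigt = pd.DataFrame(index=dleft.index, columns=np.arange(len(cols)) + 2).add_prefix('col')
    drigt.values[i, k] = df.col2.values
    dleft.join(drigt)

       ID        col1        col2        col3
    0   1  Abc street  2017-07-27  2017-08-17
    1   1  Def street  2018-07-15  2018-08-13
    2   2  fbg street  2018-01-07  2018-08-12
    3   2  trf street  2019-01-15         NaN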
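To make the two keys in this answer easier to see, here is a minimal sketch that builds them step by step. The input frame is an assumption, reconstructed from the outputs shown in the answers (the question's data is not reproduced here). The names `filled`, `row`, `pos` and `grid` are hypothetical, and the explicit `mask(...).ffill()` is swapped in for `replace('None')` so the forward fill does not depend on the pandas version.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the outputs shown in the answers.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "col1": ["Abc street", "None", "Def street", "None",
             "fbg street", "None", "trf street"],
    "col2": ["2017-07-27", "2017-08-17", "2018-07-15", "2018-08-13",
             "2018-01-07", "2018-08-12", "2019-01-15"],
})

# Blank out the 'None' strings, then carry the previous street name down.
filled = df["col1"].mask(df["col1"].eq("None")).ffill()

# Row key: one integer per (ID, street) pair, in order of appearance.
row = df.groupby([df["ID"], filled], sort=False).ngroup()

# Column key: position of each record within its (ID, street) group.
pos = df.groupby([df["ID"], filled]).cumcount()

# Scatter col2 into a (groups x positions) grid; unfilled cells stay NaN.
grid = np.full((row.max() + 1, pos.max() + 1), np.nan, dtype=object)
grid[row.to_numpy(), pos.to_numpy()] = df["col2"].to_numpy()
```

Each row of `grid` then holds the dates for one (ID, street) pair, which is what the answer joins back onto the key columns.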


I am using cumcount with merge

    # Split into rows with a real col1 value and the 'None' placeholder rows.
    df1 = df.loc[df.col1.ne('None'), :].copy()
    df2 = df.loc[df.col1.eq('None'), :].copy()
    # Number the rows within each ID so the two halves can be paired up.
    df1['Key'] = df1.groupby('ID').cumcount()
    df2['Key'] = df2.groupby('ID').cumcount()
    df1.merge(df2.drop(columns='col1'), on=['ID', 'Key'], how='left')

    Out[816]:
       ID       col1      col2_x  Key      col2_y
    0   1  Abcstreet  2017-07-27    0  2017-08-17
    1   1  Defstreet  2018-07-15    1  2018-08-13
    2   2  fbgstreet  2018-01-07    0  2018-08-12
    3   2  trfstreet  2019-01-15    1         NaN
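As a self-contained variant of the cumcount-and-merge idea, the sketch below runs on an assumed input (reconstructed from the outputs in the answers) and renames the second date column before merging, so the result has `col2`/`col3` instead of `col2_x`/`col2_y`. The names `named`, `extra` and `result` are hypothetical; `validate="one_to_one"` is added only as a guard against duplicate keys.

```python
import pandas as pd

# Assumed input, reconstructed from the outputs shown in the answers.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "col1": ["Abc street", "None", "Def street", "None",
             "fbg street", "None", "trf street"],
    "col2": ["2017-07-27", "2017-08-17", "2018-07-15", "2018-08-13",
             "2018-01-07", "2018-08-12", "2019-01-15"],
})

is_none = df["col1"].eq("None")
named = df[~is_none].copy()   # rows carrying a street name
extra = df[is_none].copy()    # placeholder rows carrying the second date

# Number rows within each ID so the k-th placeholder pairs with the k-th name.
named["Key"] = named.groupby("ID").cumcount()
extra["Key"] = extra.groupby("ID").cumcount()

result = (
    named.merge(
        extra.drop(columns="col1").rename(columns={"col2": "col3"}),
        on=["ID", "Key"],
        how="left",               # keep streets that have no second date
        validate="one_to_one",    # fail loudly if a key repeats
    )
    .drop(columns="Key")
)
```

Note that this pairs rows purely by their order within each ID. If a street could have zero or several placeholder rows in the middle of an ID, the pairing would shift, and a forward-fill approach would be the safer choice.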


Using ffill + pivot_table. This assumes that None follows the proper value, which it appears to from your data.


    u = df.assign(col1=df.col1.replace('None'))
    g = ['ID', 'col1']
    idx = u.groupby(g).cumcount()

    (u.assign(idx=idx)
        .pivot_table(index=g, columns='idx', values='col2', aggfunc='first')
        .reset_index())

    idx   ID        col1           0           1
    0      1  Abc street  2017-07-27  2017-08-17
    1      1  Def street  2018-07-15  2018-08-13
    2      2  fbg street  2018-01-07  2018-08-12
    3      2  trf street  2019-01-15         NaN
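The same ffill + pivot_table idea can be sketched with the forward fill written out explicitly and the pivoted columns renamed to `col2`, `col3`. The input frame is an assumption reconstructed from the outputs above, and `keys` and `wide` are hypothetical names; like the answer, this assumes each 'None' row directly follows the row it belongs to.

```python
import pandas as pd

# Assumed input, reconstructed from the outputs shown in the answers.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "col1": ["Abc street", "None", "Def street", "None",
             "fbg street", "None", "trf street"],
    "col2": ["2017-07-27", "2017-08-17", "2018-07-15", "2018-08-13",
             "2018-01-07", "2018-08-12", "2019-01-15"],
})

# Explicit forward fill of the 'None' placeholders.
u = df.assign(col1=df["col1"].mask(df["col1"].eq("None")).ffill())

keys = ["ID", "col1"]
# Position of each date within its (ID, street) group: 0, 1, ...
u["idx"] = u.groupby(keys).cumcount()

wide = (
    u.pivot_table(index=keys, columns="idx", values="col2", aggfunc="first")
     .rename(columns=lambda n: f"col{n + 2}")   # 0 -> col2, 1 -> col3
     .rename_axis(columns=None)                 # drop the 'idx' axis label
     .reset_index()
)
```

`aggfunc="first"` is needed because `pivot_table` defaults to a numeric mean, which does not apply to date strings.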