Reshaping Pandas Dataframe with Grouped Data (Long to Wide)
You can use pivot, with the new column names taken from the last character of the entity_id column, extracted by indexing with str:
    # DataFrame.pivot takes column names, so store the derived key as a column first
    df = (df.assign(entity=df.entity_id.str[-1])
            .pivot(index='group_id', columns='entity', values='value')
            .add_prefix('entity_')
            .rename_axis(None, axis=1)
            .reset_index())
    print(df)

      group_id  entity_1  entity_2  entity_3
    0        A       5.0       3.0       2.0
    1        B      10.0       8.0      11.0
    2        C       2.0       6.0       NaN
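To try the pivot end to end, here is a minimal sketch on made-up input. The sample rows, and in particular the entity_id values such as "A_1", are assumptions chosen to be consistent with the output shown; the original input is not given. The sketch uses the DataFrame.pivot method with column names, and the helper column name entity is only illustrative.

```python
import pandas as pd

# Hypothetical input: the entity_id values are assumed, chosen to end in 1, 2 or 3.
df = pd.DataFrame({
    "group_id": ["A", "A", "A", "B", "B", "B", "C", "C"],
    "entity_id": ["A_1", "A_2", "A_3", "B_1", "B_2", "B_3", "C_1", "C_2"],
    "value": [5, 3, 2, 10, 8, 11, 2, 6],
})

# The last character of each entity_id becomes the key for the new columns.
keys = df["entity_id"].str[-1]
print(keys.tolist())

wide = (
    df.assign(entity=keys)                      # store the key as a real column
      .pivot(index="group_id", columns="entity", values="value")
      .add_prefix("entity_")                    # '1' -> 'entity_1'
      .rename_axis(None, axis=1)                # drop the 'entity' columns name
      .reset_index()
)
print(wide)
```

Because str[-1] keeps a single character, ids that end in a multi-digit number (for example "A_10") would be reduced to their final digit, so this approach probably suits only single-digit suffixes.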
A solution with cumcount, which numbers the rows within each group:
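    # cumcount() + 1 gives each row its 1-based position inside its group
    df = (df.assign(pos=df.groupby('group_id').cumcount() + 1)
            .pivot(index='group_id', columns='pos', values='value')
            .add_prefix('entity_')
            .rename_axis(None, axis=1)
            .reset_index())
    print(df)

      group_id  entity_1  entity_2  entity_3
    0        A       5.0       3.0       2.0
    1        B      10.0       8.0      11.0
    2        C       2.0       6.0       NaN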
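To see what cumcount contributes, the short sketch below prints the per-group positions on their own. The sample rows are again an assumption, and the variable name position is only illustrative.

```python
import pandas as pd

# Hypothetical rows, assumed to mirror the groups A, B and C in the output above.
df = pd.DataFrame({
    "group_id": ["A", "A", "A", "B", "B", "B", "C", "C"],
    "value": [5, 3, 2, 10, 8, 11, 2, 6],
})

# cumcount() numbers rows 0, 1, 2, ... inside each group; adding 1 makes it 1-based.
position = df.groupby("group_id").cumcount() + 1
print(position.tolist())
```

The positions follow the current row order, so if a particular ordering of entities matters, the frame should be sorted before calling cumcount.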
Another solution uses groupby and apply, followed by a reshape with unstack:
    df = (df.groupby("group_id")["value"]
            .apply(lambda x: pd.Series(x.values))
            .unstack()
            .add_prefix('entity_')
            .reset_index())
    print(df)

      group_id  entity_0  entity_1  entity_2
    0        A       5.0       3.0       2.0
    1        B      10.0       8.0      11.0
    2        C       2.0       6.0       NaN
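The intermediate result may make this chain easier to follow. The sketch below, on assumed sample rows, stops after apply to show that each value is indexed by its group and its position within that group, and then unstacks the position level into columns. The names stacked and wide are illustrative.

```python
import pandas as pd

# Hypothetical rows, assumed to mirror the groups A, B and C in the output above.
df = pd.DataFrame({
    "group_id": ["A", "A", "A", "B", "B", "B", "C", "C"],
    "value": [5, 3, 2, 10, 8, 11, 2, 6],
})

# Rebuilding each group as a fresh Series resets its index to 0, 1, 2, ...
# so the result is indexed by (group_id, position within the group).
stacked = df.groupby("group_id")["value"].apply(lambda x: pd.Series(x.values))
print(stacked)

# unstack() moves the inner (position) level into the columns.
wide = stacked.unstack()
print(wide.columns.tolist())
```

The column labels after unstack are the integer positions starting at 0, which is why add_prefix produces entity_0 as the first column.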
If the count needs to start from 1:
    df = (df.groupby("group_id")["value"]
            .apply(lambda x: pd.Series(x.values))
            .unstack()
            .rename(columns=lambda x: x + 1)
            .add_prefix('entity_')
            .reset_index())
    print(df)

      group_id  entity_1  entity_2  entity_3
    0        A       5.0       3.0       2.0
    1        B      10.0       8.0      11.0
    2        C       2.0       6.0       NaN