How to duplicate rows based on a counter column
You can use np.repeat():
    import pandas as pd
    import numpy as np

    # your data
    # ========================
    df

       x  count
    0  d      2
    1  e      3
    2  f      2

    # processing
    # ==================================
    np.repeat(df.values, df['count'].values, axis=0)

    array([['d', 2],
           ['d', 2],
           ['e', 3],
           ['e', 3],
           ['e', 3],
           ['f', 2],
           ['f', 2]], dtype=object)

    pd.DataFrame(np.repeat(df.values, df['count'].values, axis=0), columns=['x', 'count'])

       x count
    0  d     2
    1  d     2
    2  e     3
    3  e     3
    4  e     3
    5  f     2
    6  f     2
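As a self-contained sketch of the np.repeat approach, the snippet below builds the sample frame by hand (an assumption, since the original only shows its printed form). Note that df.values on mixed-type columns yields an object array, so the rebuilt frame has object columns; the astype call that restores the original dtypes is an addition here, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample data assumed from the printed frame: a label column and a counter column.
df = pd.DataFrame({"x": ["d", "e", "f"], "count": [2, 3, 2]})

# Repeat each row of the underlying array `count` times along axis 0.
repeated = np.repeat(df.values, df["count"].values, axis=0)

# Rebuild the frame; the mixed-type array is object dtype, so cast each
# column back to the dtype it had in the original frame.
result = pd.DataFrame(repeated, columns=df.columns).astype(df.dtypes.to_dict())
print(result)
```

Without the cast, the count column would stay as object dtype, which can slow down or break later numeric operations.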
You could use .loc with index.repeat, like:
    In [295]: df.loc[df.index.repeat(df['count'])].reset_index(drop=True)
    Out[295]:
       x  count
    0  d      2
    1  d      2
    2  e      3
    3  e      3
    4  e      3
    5  f      2
    6  f      2
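The .loc approach can be wrapped in a small reusable function, sketched below. The name duplicate_rows, its counter parameter, and the extra row with a count of 0 are illustrative assumptions. The sketch also assumes the frame's index is unique, since .loc with repeated labels on a non-unique index would return extra rows.

```python
import pandas as pd


def duplicate_rows(df, counter="count"):
    """Hypothetical helper: repeat each row df[counter] times.

    Assumes df has a unique index and non-negative integer counts.
    """
    # Index.repeat emits each label `count` times; .loc then selects those rows.
    return df.loc[df.index.repeat(df[counter])].reset_index(drop=True)


# Sample data assumed from the question, plus one row with a zero count.
df = pd.DataFrame({"x": ["d", "e", "f", "g"], "count": [2, 3, 2, 0]})
out = duplicate_rows(df)
print(out)
```

Unlike the np.repeat route, this keeps each column's dtype, and a row whose count is 0 simply drops out of the result.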
Or, using pd.Series.repeat, you can do:
    In [278]: df.set_index('x')['count'].repeat(df['count']).reset_index()
    Out[278]:
       x  count
    0  d      2
    1  d      2
    2  e      3
    3  e      3
    4  e      3
    5  f      2
    6  f      2