How can I replicate rows in Pandas?
Use np.repeat:
Version 1:
Try using np.repeat:

    import numpy as np
    import pandas as pd

    # df is the DataFrame from the question
    newdf = pd.DataFrame(np.repeat(df.values, 3, axis=0))
    newdf.columns = df.columns
    print(newdf)
The above code will output:
       Person   ID  ZipCode  Gender
    0   12345  882    38182  Female
    1   12345  882    38182  Female
    2   12345  882    38182  Female
    3   32917  271    88172    Male
    4   32917  271    88172    Male
    5   32917  271    88172    Male
    6   18273  552    90291  Female
    7   18273  552    90291  Female
    8   18273  552    90291  Female
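np.repeat repeats each row of df.values 3 times along axis 0, so the copies of a row stay next to each other. Because df.values is a plain NumPy array, the column names are lost, so we restore them by assigning newdf.columns = df.columns.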
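To see what np.repeat does on its own, here is a minimal sketch on a small made-up array (arr is just an illustration, not the data from the question):

```python
import numpy as np

# A tiny 2x2 array standing in for df.values (hypothetical data).
arr = np.array([[1, 2],
                [3, 4]])

# axis=0 repeats whole rows; each row is repeated 3 times in place.
repeated = np.repeat(arr, 3, axis=0)
print(repeated)
```

Each row shows up three times in a row before the next one starts, which is why the DataFrame built from it comes out already grouped.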
Version 2:
You could also assign the column names in the first line, like below:
    newdf = pd.DataFrame(np.repeat(df.values, 3, axis=0), columns=df.columns)
    print(newdf)
The above code will also output:
       Person   ID  ZipCode  Gender
    0   12345  882    38182  Female
    1   12345  882    38182  Female
    2   12345  882    38182  Female
    3   32917  271    88172    Male
    4   32917  271    88172    Male
    5   32917  271    88172    Male
    6   18273  552    90291  Female
    7   18273  552    90291  Female
    8   18273  552    90291  Female
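One caveat with both versions: df.values on a frame that mixes numbers and strings is an object array, so the numeric columns in the result are no longer integer columns. If that matters, one alternative (not part of the answer above, just a sketch) is to repeat the index labels and select with df.loc. The df below is rebuilt from the output shown here, so treat it as an assumption about the original data.

```python
import numpy as np
import pandas as pd

# Reconstructed from the printed output; the real df may differ.
df = pd.DataFrame({
    "Person": [12345, 32917, 18273],
    "ID": [882, 271, 552],
    "ZipCode": [38182, 88172, 90291],
    "Gender": ["Female", "Male", "Female"],
})

# Going through df.values: numeric columns come back as object dtype.
via_values = pd.DataFrame(np.repeat(df.values, 3, axis=0), columns=df.columns)

# Repeating the index labels instead keeps each column's dtype.
via_index = df.loc[df.index.repeat(3)].reset_index(drop=True)

print(via_values.dtypes)
print(via_index.dtypes)
```

Both frames hold the same rows in the same order; only the column dtypes differ.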
Using concat:

    pd.concat([df]*3).sort_index()

The above code will output:

       Person   ID  ZipCode  Gender
    0   12345  882    38182  Female
    0   12345  882    38182  Female
    0   12345  882    38182  Female
    1   32917  271    88172    Male
    1   32917  271    88172    Male
    1   32917  271    88172    Male
    2   18273  552    90291  Female
    2   18273  552    90291  Female
    2   18273  552    90291  Female
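Note that the concat result keeps the original index labels, so each label appears three times. If you want a fresh 0..n-1 index like the np.repeat versions give, a possible follow-up is reset_index(drop=True). A minimal sketch, using a small made-up df:

```python
import pandas as pd

# Small hypothetical frame, only to show the index behaviour.
df = pd.DataFrame({"Person": [12345, 32917], "Gender": ["Female", "Male"]})

# concat stacks whole copies of df: index is 0, 1, 0, 1, 0, 1.
stacked = pd.concat([df] * 3)

# sort_index brings the copies of each row together: 0, 0, 0, 1, 1, 1.
grouped = stacked.sort_index()

# Drop the duplicated labels in favour of a fresh RangeIndex.
clean = grouped.reset_index(drop=True)
print(clean)
```

Without sort_index the rows come out as whole copies of df stacked one after another, so the sort is what groups the duplicates together.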