Split a Pandas column of lists into multiple columns


You can use the DataFrame constructor with lists created by to_list:

    import pandas as pd

    d1 = {'teams': [['SF', 'NYG'], ['SF', 'NYG'], ['SF', 'NYG'],
                    ['SF', 'NYG'], ['SF', 'NYG'], ['SF', 'NYG'], ['SF', 'NYG']]}
    df2 = pd.DataFrame(d1)
    print(df2)

           teams
    0  [SF, NYG]
    1  [SF, NYG]
    2  [SF, NYG]
    3  [SF, NYG]
    4  [SF, NYG]
    5  [SF, NYG]
    6  [SF, NYG]

    df2[['team1', 'team2']] = pd.DataFrame(df2.teams.tolist(), index=df2.index)
    print(df2)

           teams team1 team2
    0  [SF, NYG]    SF   NYG
    1  [SF, NYG]    SF   NYG
    2  [SF, NYG]    SF   NYG
    3  [SF, NYG]    SF   NYG
    4  [SF, NYG]    SF   NYG
    5  [SF, NYG]    SF   NYG
    6  [SF, NYG]    SF   NYG
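The index= argument in the assignment above matters once the original frame no longer has the default 0..n-1 index. The following is a minimal sketch of that effect, assuming a made-up frame `df` with index labels 10, 20, 30; the team names and variable names are illustrative and not from the original answer.

```python
import pandas as pd

# Hypothetical frame with a non-default index, to show why index= matters.
df = pd.DataFrame(
    {'teams': [['SF', 'NYG'], ['DAL', 'PHI'], ['GB', 'CHI']]},
    index=[10, 20, 30],
)

# Without index=, the new frame gets a fresh 0..n-1 index.
no_index = pd.DataFrame(df['teams'].tolist(), columns=['team1', 'team2'])

# With index=df.index, the new frame keeps the original row labels.
with_index = pd.DataFrame(df['teams'].tolist(), index=df.index,
                          columns=['team1', 'team2'])

# join aligns on index labels, so only the second version fills every row.
joined_bad = df.join(no_index)
joined_ok = df.join(with_index)

print(joined_bad['team1'].isna().sum())  # number of rows left unmatched
print(joined_ok['team1'].isna().sum())
```

Because pandas aligns on index labels when combining frames, the version built without index= leaves every row of `joined_bad` unmatched, while `joined_ok` is fully populated.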

And for a new DataFrame:

    df3 = pd.DataFrame(df2['teams'].to_list(), columns=['team1', 'team2'])
    print(df3)

      team1 team2
    0    SF   NYG
    1    SF   NYG
    2    SF   NYG
    3    SF   NYG
    4    SF   NYG
    5    SF   NYG
    6    SF   NYG

A solution with apply(pd.Series) is much slower, roughly 1,000 times slower on 7,000 rows in the timings here:

    # 7k rows
    df2 = pd.concat([df2]*1000).reset_index(drop=True)

    In [121]: %timeit df2['teams'].apply(pd.Series)
    1.79 s ± 52.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

    In [122]: %timeit pd.DataFrame(df2['teams'].to_list(), columns=['team1','team2'])
    1.63 ms ± 54.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
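The %timeit magic above only works inside IPython or Jupyter. As a rough sketch, the same comparison can be run as a plain script with the standard-library timeit module. The 2,000-row frame and the helper names `via_apply` and `via_constructor` are assumptions made for illustration, and the measured times will vary by machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical benchmark frame: 2,000 rows of two-element lists.
df = pd.DataFrame({'teams': [['SF', 'NYG']] * 2000})


def via_apply():
    # Builds one Series per row, which is where the overhead comes from.
    return df['teams'].apply(pd.Series)


def via_constructor():
    # Hands the whole list of lists to the DataFrame constructor at once.
    return pd.DataFrame(df['teams'].to_list(), index=df.index)


# Both approaches should produce the same cell values.
same_values = via_apply().values.tolist() == via_constructor().values.tolist()

# Average seconds per call; the slow path is only run once.
t_apply = timeit.timeit(via_apply, number=1)
t_constructor = timeit.timeit(via_constructor, number=10) / 10

print(f"same values: {same_values}")
print(f"apply(pd.Series):      {t_apply:.4f} s per call")
print(f"DataFrame constructor: {t_constructor:.4f} s per call")
```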


A much simpler solution:

    pd.DataFrame(df2["teams"].to_list(), columns=['team1', 'team2'])

This yields:

      team1 team2
    -------------
    0    SF   NYG
    1    SF   NYG
    2    SF   NYG
    3    SF   NYG
    4    SF   NYG
    5    SF   NYG
    6    SF   NYG
    7    SF   NYG

If you wanted to split a column of delimited strings rather than lists, you could similarly do:

    pd.DataFrame(df["teams"].str.split('<delim>', expand=True).values,
                 columns=['team1', 'team2'])
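As a concrete illustration of the string case, here is a small sketch that assumes a hyphen as the delimiter and a made-up frame `df` with a non-default index. It skips the `.values` round trip and renames the columns on the result of `str.split` directly, which is one way to keep the original index; this is a variation on the snippet above, not the answer's exact method.

```python
import pandas as pd

# Hypothetical data: team pairs stored as 'A-B' strings, with a custom index.
df = pd.DataFrame({'teams': ['SF-NYG', 'DAL-PHI', 'GB-CHI']},
                  index=[10, 20, 30])

# expand=True returns a DataFrame with one column per split piece.
split = df['teams'].str.split('-', expand=True)
split.columns = ['team1', 'team2']

print(split)
```

Wrapping `.values` in a new `pd.DataFrame`, as in the original snippet, produces the same cells but resets the index to 0..n-1.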


This solution preserves the index of the df2 DataFrame, which solutions built on tolist() do not unless index=df2.index is passed explicitly:

    df3 = df2.teams.apply(pd.Series)
    df3.columns = ['team1', 'team2']

Here's the result:

      team1 team2
    0    SF   NYG
    1    SF   NYG
    2    SF   NYG
    3    SF   NYG
    4    SF   NYG
    5    SF   NYG
    6    SF   NYG