parsing a dictionary in a pandas dataframe cell into new row cells (new columns)


Consider the DataFrame df:

    import pandas as pd

    df = pd.DataFrame([
        ['a', 'b', 'c', 'd', dict(F='y', G='v')],
        ['a', 'b', 'c', 'd', dict(F='y', G='v')],
    ], columns=list('ABCDE'))

    df

       A  B  C  D                     E
    0  a  b  c  d  {'F': 'y', 'G': 'v'}
    1  a  b  c  d  {'F': 'y', 'G': 'v'}

Option 1
Use pd.Series.apply to expand each dict into its own columns, then assign the new columns in place.

    df.E.apply(pd.Series)

       F  G
    0  y  v
    1  y  v

Assign the result to the new columns like this:

    df[['F', 'G']] = df.E.apply(pd.Series)
    df.drop('E', axis=1)

       A  B  C  D  F  G
    0  a  b  c  d  y  v
    1  a  b  c  d  y  v
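Note that df.drop returns a new DataFrame and leaves df itself untouched, so the result has to be kept if column E should actually go away. A minimal sketch of Option 1 as a standalone script; the names `expanded` and `result` are introduced here for illustration, and the column names are read from the expanded frame instead of being hardcoded:

```python
import pandas as pd

df = pd.DataFrame(
    [
        ['a', 'b', 'c', 'd', dict(F='y', G='v')],
        ['a', 'b', 'c', 'd', dict(F='y', G='v')],
    ],
    columns=list('ABCDE'),
)

# Expand each dict in column E into one row of a new frame.
expanded = df['E'].apply(pd.Series)

# Use the expanded frame's own column names instead of hardcoding ['F', 'G'].
df[list(expanded.columns)] = expanded

# drop() returns a new DataFrame; df still contains column E afterwards.
result = df.drop('E', axis=1)
print(result)
```

If the dicts do not all share the same keys, the missing entries should come out as NaN in `expanded`.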

Option 2
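Chain the whole thing using the pd.DataFrame.assign method:

    # axis must be passed by keyword; the positional form drop('E', 1) no longer works in pandas 2.0+
    df.drop('E', axis=1).assign(**pd.DataFrame(df.E.values.tolist()))

       A  B  C  D  F  G
    0  a  b  c  d  y  v
    1  a  b  c  d  y  v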
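The `**` unpacking works because a DataFrame behaves like a mapping from column name to Series, so assign receives F=..., G=... as keyword arguments. Two caveats follow from that: the dict keys must be strings, and assign aligns on the index, so the expanded frame should carry the same index as df. A sketch under those assumptions; the index values 10 and 20 are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    [
        ['a', 'b', 'c', 'd', dict(F='y', G='v')],
        ['a', 'b', 'c', 'd', dict(F='y', G='v')],
    ],
    columns=list('ABCDE'),
    index=[10, 20],  # hypothetical non-default index
)

# Build the expanded frame on df's index so assign() lines rows up correctly.
expanded = pd.DataFrame(df['E'].tolist(), index=df.index)

# ** turns each column into a keyword argument (F=Series, G=Series),
# which is why the dict keys have to be strings.
result = df.drop(columns='E').assign(**expanded)
print(result)
```

With integer keys such as 6 and 7, the `**` step raises a TypeError, and the concat approach is the safer choice.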


I think you can use concat:

    df = pd.DataFrame({1: ['a', 'h'], 2: ['b', 'h'], 5: [{6: 'y', 7: 'v'}, {6: 'u', 7: 't'}]})
    print(df)
       1  2                 5
    0  a  b  {6: 'y', 7: 'v'}
    1  h  h  {6: 'u', 7: 't'}

    print(df.loc[:, 5].values.tolist())
    [{6: 'y', 7: 'v'}, {6: 'u', 7: 't'}]

    df1 = pd.DataFrame(df.loc[:, 5].values.tolist())
    print(df1)
       6  7
    0  y  v
    1  u  t

    print(pd.concat([df, df1], axis=1))
       1  2                 5  6  7
    0  a  b  {6: 'y', 7: 'v'}  y  v
    1  h  h  {6: 'u', 7: 't'}  u  t
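The concat version keeps the original dict column and relies on both frames having the default 0..n-1 index. A hedged variant that drops the dict column and stays aligned when df has other row labels; the labels 'r1' and 'r2' are hypothetical:

```python
import pandas as pd

df = pd.DataFrame(
    {1: ['a', 'h'], 2: ['b', 'h'], 5: [{6: 'y', 7: 'v'}, {6: 'u', 7: 't'}]},
    index=['r1', 'r2'],  # hypothetical row labels
)

# Reuse df's index; otherwise df1 gets 0..n-1 and concat would not line up the rows.
df1 = pd.DataFrame(df[5].tolist(), index=df.index)

# Drop the dict column before concatenating so only the parsed columns remain.
result = pd.concat([df.drop(columns=5), df1], axis=1)
print(result)
```

Since concat with axis=1 joins on the index, leaving out `index=df.index` here would produce extra rows filled with NaN rather than two complete rows.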

Timings (len(df)=2k):

    In [2]: %timeit (pd.concat([df, pd.DataFrame(df.loc[:,5].values.tolist())], axis=1))
    100 loops, best of 3: 2.99 ms per loop

    In [3]: %timeit (pir(df))
    1 loop, best of 3: 625 ms per loop

Setup used for the timings:

    df = pd.concat([df]*1000).reset_index(drop=True)
    print(pd.concat([df, pd.DataFrame(df.loc[:,5].values.tolist())], axis=1))

    def pir(df):
        df[['F', 'G']] = df[5].apply(pd.Series)
        # drop() returns a new DataFrame, so return its result instead of discarding it
        return df.drop(5, axis=1)

    print(pir(df))
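The %timeit lines above need IPython. As a rough sketch, the same comparison can be run as a plain script with the standard timeit module; the helper names `via_tolist` and `via_apply` are made up here, and the absolute numbers will differ by machine and pandas version:

```python
import timeit

import pandas as pd

small = pd.DataFrame(
    {1: ['a', 'h'], 2: ['b', 'h'], 5: [{6: 'y', 7: 'v'}, {6: 'u', 7: 't'}]}
)
# Repeat the two rows 1000 times to get the 2k-row frame used in the timings.
df = pd.concat([small] * 1000).reset_index(drop=True)


def via_tolist(frame):
    # Build the new columns from a plain list of dicts.
    return pd.DataFrame(frame[5].tolist(), index=frame.index)


def via_apply(frame):
    # Build the new columns by constructing one Series per row.
    return frame[5].apply(pd.Series)


t_tolist = timeit.timeit(lambda: via_tolist(df), number=3)
t_apply = timeit.timeit(lambda: via_apply(df), number=3)
print(f"tolist: {t_tolist:.4f}s for 3 runs, apply: {t_apply:.4f}s for 3 runs")
```

Both helpers only expand the dict column, which isolates the step that differs between the two answers.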