Pandas DataFrame: transforming frame using unique values of a column Pandas DataFrame: transforming frame using unique values of a column pandas pandas

Pandas DataFrame: transforming frame using unique values of a column


Probably not the most elegant way possible, but using unstack:

>>> df         date Country Type  Val0  2013-01-01     USA    x   231  2013-01-01     USA    y   132  2013-01-01      MX    x   113  2013-01-01      MX    y   144  2013-01-02     USA    x   205  2013-01-02     USA    y   196  2013-01-02      MX    x   147  2013-01-02      MX    y   16>>> df.set_index(['date', 'Country', 'Type']).unstack('Type').reset_index()            date Country  ValType                        x   y0     2013-01-01      MX   11  141     2013-01-01     USA   23  132     2013-01-02      MX   14  163     2013-01-02     USA   20  19

A little more generally, and removing the strange hierarchical columns in the result:

>>> cols = [c for c in df.columns if c not in {'Type', 'Val'}]>>> df2 = df.set_index(cols + ['Type']).unstack('Type')>>> df2                    ValType                  x   ydate       Country2013-01-01 MX        11  14           USA       23  132013-01-02 MX        14  16           USA       20  19>>> df2.columns = df2.columns.levels[1]>>> df2.columns.name = None>>> df2                     x   ydate       Country2013-01-01 MX       11  14           USA      23  132013-01-02 MX       14  16           USA      20  19>>> df2.reset_index()         date Country   x   y0  2013-01-01      MX  11  141  2013-01-01     USA  23  132  2013-01-02      MX  14  163  2013-01-02     USA  20  19


I cooked up my own pivot based solution to the same problem before finding Dougal's answer, thought I would post it for posterity since I find it more readable:

>>> pd.__version__'0.15.0'>>> df         date Country Type  Val0  2013-01-01     USA    x   231  2013-01-01     USA    y   132  2013-01-01      MX    x   113  2013-01-01      MX    y   144  2013-01-02     USA    x   205  2013-01-02     USA    y   196  2013-01-02      MX    x   147  2013-01-02      MX    y   16>>> pt=df.pivot_table(values='Val',...                   columns='Type',...                   index=['date','Country'],...                   )>>> ptType                 x   ydate       Country        2013-01-01 MX       11  14           USA      23  132013-01-02 MX       14  16           USA      20  19

And then carry on with Dougal's cleanups:

>>> pt.columns.name=None>>> pt.reset_index()         date Country   x   y0  2013-01-01      MX  11  141  2013-01-01     USA  23  132  2013-01-02      MX  14  163  2013-01-02     USA  20  19

Note that DataFrame.to_csv() produces your requested output:

>>> print(pt.to_csv())date,Country,x,y2013-01-01,MX,11,142013-01-01,USA,23,132013-01-02,MX,14,162013-01-02,USA,20,19


Let's store your original dataframe in dfThen at least in version 0.18.1 you can do:

df.pivot_table(values="Val", index=['date', 'Country'], columns='Type')

gives the right answer:

Type                 x   ydate       Country2013-01-01 MX       11  14           USA      23  132013-01-02 MX       14  16           USA      20  19