Pandas DataFrame: transforming frame using unique values of a column
Probably not the most elegant way possible, but using unstack:
>>> df date Country Type Val0 2013-01-01 USA x 231 2013-01-01 USA y 132 2013-01-01 MX x 113 2013-01-01 MX y 144 2013-01-02 USA x 205 2013-01-02 USA y 196 2013-01-02 MX x 147 2013-01-02 MX y 16>>> df.set_index(['date', 'Country', 'Type']).unstack('Type').reset_index() date Country ValType x y0 2013-01-01 MX 11 141 2013-01-01 USA 23 132 2013-01-02 MX 14 163 2013-01-02 USA 20 19
A little more generally, and removing the strange hierarchical columns in the result:
>>> cols = [c for c in df.columns if c not in {'Type', 'Val'}]>>> df2 = df.set_index(cols + ['Type']).unstack('Type')>>> df2 ValType x ydate Country2013-01-01 MX 11 14 USA 23 132013-01-02 MX 14 16 USA 20 19>>> df2.columns = df2.columns.levels[1]>>> df2.columns.name = None>>> df2 x ydate Country2013-01-01 MX 11 14 USA 23 132013-01-02 MX 14 16 USA 20 19>>> df2.reset_index() date Country x y0 2013-01-01 MX 11 141 2013-01-01 USA 23 132 2013-01-02 MX 14 163 2013-01-02 USA 20 19
I cooked up my own pivot based solution to the same problem before finding Dougal's answer, thought I would post it for posterity since I find it more readable:
>>> pd.__version__'0.15.0'>>> df date Country Type Val0 2013-01-01 USA x 231 2013-01-01 USA y 132 2013-01-01 MX x 113 2013-01-01 MX y 144 2013-01-02 USA x 205 2013-01-02 USA y 196 2013-01-02 MX x 147 2013-01-02 MX y 16>>> pt=df.pivot_table(values='Val',... columns='Type',... index=['date','Country'],... )>>> ptType x ydate Country 2013-01-01 MX 11 14 USA 23 132013-01-02 MX 14 16 USA 20 19
And then carry on with Dougal's cleanups:
>>> pt.columns.name=None>>> pt.reset_index() date Country x y0 2013-01-01 MX 11 141 2013-01-01 USA 23 132 2013-01-02 MX 14 163 2013-01-02 USA 20 19
Note that DataFrame.to_csv()
produces your requested output:
>>> print(pt.to_csv())date,Country,x,y2013-01-01,MX,11,142013-01-01,USA,23,132013-01-02,MX,14,162013-01-02,USA,20,19