how to unstack (or pivot?) in pandas
We want values (e.g. 'GERMANY'
) to become column names, and column names (e.g. 'HOUR1'
) to become values -- a swap of sorts.
The stack
method turns column names into index values, and the unstack
method turns index values into column names.
So by shifting the values into the index, we can use stack
and unstack
to perform the swap.
import pandas as pddatelisttemp = pd.date_range('1/1/2014', periods=3, freq='D')s = list(datelisttemp)*3s.sort()df = pd.DataFrame({'BORDER':['GERMANY','FRANCE','ITALY','GERMANY','FRANCE','ITALY','GERMANY','FRANCE','ITALY' ], 'HOUR1':[2 ,2 ,2 ,4 ,4 ,4 ,6 ,6, 6],'HOUR2':[3 ,3 ,3, 5 ,5 ,5, 7, 7, 7], 'HOUR3':[8 ,8 ,8, 12 ,12 ,12, 99, 99, 99]}, index=s)df = df.set_index(['BORDER'], append=True)df.columns.name = 'HOUR'df = df.unstack('BORDER')df = df.stack('HOUR')df = df.reset_index('HOUR')df['HOUR'] = df['HOUR'].str.replace('HOUR', '').astype('int')print(df)
yields
BORDER HOUR FRANCE GERMANY ITALY2014-01-01 1 2 2 22014-01-01 2 3 3 32014-01-01 3 8 8 82014-01-02 1 4 4 42014-01-02 2 5 5 52014-01-02 3 12 12 122014-01-03 1 6 6 62014-01-03 2 7 7 72014-01-03 3 99 99 99
Using your df2
:
>>> df2.pivot_table(values='value', index=['DATE', 'variable'], columns="BORDER")BORDER FRANCE GERMANY ITALYDATE variable 2014-01-01 HOUR1 2 2 2 HOUR2 3 3 3 HOUR3 8 8 82014-01-02 HOUR1 4 4 4 HOUR2 5 5 5 HOUR3 12 12 122014-01-03 HOUR1 6 6 6 HOUR2 7 7 7 HOUR3 99 99 99[9 rows x 3 columns]
There is still a bit of cleanup to do if you want to convert the index level "variable" into a column called "HOUR" and strip out the text "HOUR" from the values, but I think that is the basic format you want.