Cross tab on one column where third column is matched
You can solve this using merge and crosstab:
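    import pandas as pd

    # Expose the row position as an 'index' column, then self-join on id_match
    # and drop the pairs where a row was matched with itself.
    u = df.reset_index()
    v = u.merge(u, on='id_match').query('index_x != index_y')
    # Sum the combined time of each matched pair, per pair of demographics.
    r = pd.crosstab(v.demographic_x, v.demographic_y,
                    v.time_x.astype(int) + v.time_y.astype(int), aggfunc='sum')
    print(r)

    demographic_y     A     B    C
    demographic_x
    A               NaN  52.0  NaN
    B              52.0   NaN  NaN
    C               NaN   NaN  4.0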
If you need the NaNs filled in with zeros, you can use fillna:
    r.fillna(0, downcast='infer')

    demographic_y   A   B  C
    demographic_x
    A               0  52  0
    B              52   0  0
    C               0   0  4
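For anyone who wants to run this end to end, here is a minimal sketch. The sample frame is made up for illustration (the question's actual data is not shown here); it only assumes the three columns the code above uses: id_match, demographic and time. The final step uses fillna(0).astype(int) rather than downcast='infer', on the assumption that you are on a recent pandas release where the downcast argument is deprecated.

```python
import pandas as pd

# Hypothetical sample data (an assumption, not the asker's frame):
# rows sharing an id_match are considered matched to each other.
df = pd.DataFrame({
    'id_match':    [1, 1, 2, 2],
    'demographic': ['A', 'B', 'C', 'C'],
    'time':        ['26', '26', '1', '1'],  # strings, hence astype(int) below
})

# reset_index adds an 'index' column holding each row's position.
u = df.reset_index()

# Self-join: every row is paired with every row that has the same id_match,
# including itself. The query removes those self-pairs.
v = u.merge(u, on='id_match').query('index_x != index_y')

# Cross-tabulate the two demographics, summing the combined time per pair.
r = pd.crosstab(
    v.demographic_x,
    v.demographic_y,
    values=v.time_x.astype(int) + v.time_y.astype(int),
    aggfunc='sum',
)

# Replace missing combinations with 0 and cast back to integers.
filled = r.fillna(0).astype(int)
print(filled)
```

Note that each matched pair appears twice in v, once in each direction. That is why the table is symmetric, and why a pair within the same demographic (C with C here) contributes its combined time twice to the same cell.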