
Pandas DataFrame.groupby() to dictionary with multiple columns for value


Customize the function you use in apply so it returns a list of lists for each group:

df.groupby('Column1')[['Column2', 'Column3']].apply(lambda g: g.values.tolist()).to_dict()
# {0: [[23, 1]],
#  1: [[5, 2], [2, 3], [19, 5]],
#  2: [[56, 1], [22, 2]],
#  3: [[2, 4], [14, 5]],
#  4: [[59, 1]],
#  5: [[44, 1], [1, 2], [87, 3]]}
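As a self-contained sketch: the question's original DataFrame isn't shown here, so the sample data below is an assumption reconstructed from the output above.

```python
import pandas as pd

# Sample data reconstructed from the expected output; the question's
# actual DataFrame is an assumption here.
df = pd.DataFrame({
    'Column1': [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    'Column2': [23, 5, 2, 19, 56, 22, 2, 14, 59, 44, 1, 87],
    'Column3': [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

# Each group becomes one dict entry: key = the Column1 value,
# value = that group's [Column2, Column3] rows as a list of lists.
result = df.groupby('Column1')[['Column2', 'Column3']].apply(
    lambda g: g.values.tolist()
).to_dict()
print(result[1])  # [[5, 2], [2, 3], [19, 5]]
```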

If you need a list of tuples explicitly, use list(map(tuple, ...)) to convert:

df.groupby('Column1')[['Column2', 'Column3']].apply(lambda g: list(map(tuple, g.values.tolist()))).to_dict()
# {0: [(23, 1)],
#  1: [(5, 2), (2, 3), (19, 5)],
#  2: [(56, 1), (22, 2)],
#  3: [(2, 4), (14, 5)],
#  4: [(59, 1)],
#  5: [(44, 1), (1, 2), (87, 3)]}
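An equivalent way to get tuples directly is itertuples(index=False, name=None), which yields a plain tuple per row and skips the list(map(tuple, ...)) step. A sketch against the same assumed sample data (reconstructed from the output above):

```python
import pandas as pd

# Assumed sample data, reconstructed from the output shown above.
df = pd.DataFrame({
    'Column1': [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    'Column2': [23, 5, 2, 19, 56, 22, 2, 14, 59, 44, 1, 87],
    'Column3': [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

# name=None makes itertuples yield plain tuples instead of namedtuples,
# so no extra conversion step is needed.
result = df.groupby('Column1')[['Column2', 'Column3']].apply(
    lambda g: list(g.itertuples(index=False, name=None))
).to_dict()
print(result[5])
```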


One way is to create a new tup column and then build the dictionary from it.

df['tup'] = list(zip(df['Column2'], df['Column3']))
df.groupby('Column1')['tup'].apply(list).to_dict()
# {0: [(23, 1)],
#  1: [(5, 2), (2, 3), (19, 5)],
#  2: [(56, 1), (22, 2)],
#  3: [(2, 4), (14, 5)],
#  4: [(59, 1)],
#  5: [(44, 1), (1, 2), (87, 3)]}

@Psidom's solution is more efficient, but if performance isn't an issue, use whichever makes more sense to you:

df = pd.concat([df]*10000)

def jp(df):
    df['tup'] = list(zip(df['Column2'], df['Column3']))
    return df.groupby('Column1')['tup'].apply(list).to_dict()

def psi(df):
    return df.groupby('Column1')[['Column2', 'Column3']].apply(lambda g: list(map(tuple, g.values.tolist()))).to_dict()

%timeit jp(df)   # 110ms
%timeit psi(df)  # 80ms
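%timeit is IPython magic; in a plain script, the standard-library timeit module gives comparable numbers. A sketch on the same assumed sample data (absolute timings will vary by machine; the copy() inside jp is added so repeated runs don't keep a stale tup column on the shared frame):

```python
import timeit

import pandas as pd

# Assumed sample data, reconstructed from the outputs above.
df = pd.DataFrame({
    'Column1': [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    'Column2': [23, 5, 2, 19, 56, 22, 2, 14, 59, 44, 1, 87],
    'Column3': [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})
df = pd.concat([df] * 10000, ignore_index=True)

def jp(df):
    df = df.copy()  # avoid mutating the shared frame across timing runs
    df['tup'] = list(zip(df['Column2'], df['Column3']))
    return df.groupby('Column1')['tup'].apply(list).to_dict()

def psi(df):
    return df.groupby('Column1')[['Column2', 'Column3']].apply(
        lambda g: list(map(tuple, g.values.tolist()))
    ).to_dict()

print('jp :', timeit.timeit(lambda: jp(df), number=5))
print('psi:', timeit.timeit(lambda: psi(df), number=5))
```

Both functions should return the same dictionary, which makes a handy sanity check before comparing timings.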


I'd rather use a defaultdict:

from collections import defaultdict

d = defaultdict(list)
for row in df.values.tolist():
    d[row[0]].append(tuple(row[1:]))

dict(d)
# {0: [(23, 1)],
#  1: [(5, 2), (2, 3), (19, 5)],
#  2: [(56, 1), (22, 2)],
#  3: [(2, 4), (14, 5)],
#  4: [(59, 1)],
#  5: [(44, 1), (1, 2), (87, 3)]}
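Note that df.values.tolist() relies on Column1 being the first column. A sketch on the same assumed sample data (reconstructed from the outputs above) that selects the columns by name is more robust to column order:

```python
from collections import defaultdict

import pandas as pd

# Assumed sample data, reconstructed from the outputs above.
df = pd.DataFrame({
    'Column1': [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    'Column2': [23, 5, 2, 19, 56, 22, 2, 14, 59, 44, 1, 87],
    'Column3': [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

d = defaultdict(list)
# Selecting the columns by name keeps this correct even if the
# DataFrame's column order changes.
for c1, c2, c3 in df[['Column1', 'Column2', 'Column3']].itertuples(index=False, name=None):
    d[c1].append((c2, c3))

result = dict(d)
```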