How to create a dictionary of key : column_name and value : unique values in column in python from a dataframe


With a DataFrame like this:

    import pandas as pd

    df = pd.DataFrame(
        [["Women", "Slip on", 7, "Black", "Clarks"],
         ["Women", "Slip on", 8, "Brown", "Clarcks"],
         ["Women", "Slip on", 7, "Blue", "Clarks"]],
        columns=["Category", "Sub Category", "Size", "Color", "Brand"])
    print(df)

Output:

      Category Sub Category  Size  Color    Brand
    0    Women      Slip on     7  Black   Clarks
    1    Women      Slip on     8  Brown  Clarcks
    2    Women      Slip on     7   Blue   Clarks

You can build your new dict by converting the columns of the DataFrame you need into lists, like this:

    new_dict = {"color_list": list(df["Color"]), "size_list": list(df["Size"])}
    # OR:
    # new_dict = {"color_list": [k for k in df["Color"]], "size_list": [k for k in df["Size"]]}
    print(new_dict)

Output:

{'color_list': ['Black', 'Brown', 'Blue'], 'size_list': [7, 8, 7]}

To keep only the unique values, you can use set (note that a set does not preserve the original order), like this:

    new_dict = {"color_list": list(set(df["Color"])), "size_list": list(set(df["Size"]))}
    print(new_dict)

Output:

{'color_list': ['Brown', 'Blue', 'Black'], 'size_list': [8, 7]}
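If the original order matters, one possible alternative to the set-based version above is to deduplicate with dict.fromkeys, which keeps the first occurrence of each value. This is only a sketch: the plain lists stand in for list(df["Color"]) and list(df["Size"]), and the helper name unique_in_order is made up for illustration.

```python
# Plain lists standing in for list(df["Color"]) and list(df["Size"]).
colors = ["Black", "Brown", "Blue"]
sizes = [7, 8, 7]


def unique_in_order(values):
    """Return the distinct items of `values`, keeping first-seen order.

    Hypothetical helper, not part of pandas.
    """
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(values))


new_dict = {
    "color_list": unique_in_order(colors),
    "size_list": unique_in_order(sizes),
}
print(new_dict)
# {'color_list': ['Black', 'Brown', 'Blue'], 'size_list': [7, 8]}
```

Unlike list(set(...)), the result is the same on every run, which makes it easier to compare outputs.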

Or, as @Ami Tavory points out in his answer, to get the unique values of every column in your DataFrame at once, you can simply do this:

    new_dict = {k: list(df[k].unique()) for k in df.columns}
    print(new_dict)

Output:

{'Brand': ['Clarks', 'Clarcks'], 'Category': ['Women'], 'Color': ['Black', 'Brown', 'Blue'], 'Size': [7, 8], 'Sub Category': ['Slip on']}
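The comprehension above can also be wrapped in a small reusable function. The sketch below is a variation rather than the exact code from the answer: it uses Series.drop_duplicates() followed by Series.tolist() so that the values come back as plain Python scalars, and the function name unique_values_by_column is made up for illustration.

```python
import pandas as pd


def unique_values_by_column(frame):
    """Map each column name to a list of that column's distinct values.

    Hypothetical helper; values keep the order in which they first appear.
    """
    return {col: frame[col].drop_duplicates().tolist() for col in frame.columns}


df = pd.DataFrame(
    [["Women", "Slip on", 7, "Black", "Clarks"],
     ["Women", "Slip on", 8, "Brown", "Clarcks"],
     ["Women", "Slip on", 7, "Blue", "Clarks"]],
    columns=["Category", "Sub Category", "Size", "Color", "Brand"])

result = unique_values_by_column(df)
print(result)
```

On Python 3.7 and later the keys appear in column order, so the printed dict may be ordered differently from the output shown above.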


I am trying to create a dictionary of key:value pairs where key is the column name of a dataframe and value will be a list containing all the unique values in that column.

You could use a simple dictionary comprehension for that.

Say you start with

    import pandas as pd

    df = pd.DataFrame({'a': [1, 2, 1], 'b': [1, 4, 5]})

Then the following comprehension solves it:

    >>> {c: list(df[c].unique()) for c in df.columns}
    {'a': [1, 2], 'b': [1, 4, 5]}


If I understand your question correctly, you may need a set instead of a list. In this piece of your code, you could wrap the column values in set() to keep only the unique values:

    for col in col_list[1:]:
        list_name = ''.join([str(col), '_list'])
        # Apply set() to the column values, not to the name string.
        _list = set(footwear_data[col])

Sample usage:

    >>> a_list = [7, 8, 7, 9, 10, 9]
    >>> set(a_list)
    {8, 9, 10, 7}
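Because a set has no defined order, the values can print in an order like the one above. If a predictable list is needed, one option (a minimal sketch, assuming the values are mutually comparable) is to sort the set:

```python
a_list = [7, 8, 7, 9, 10, 9]

# set() drops the duplicates; sorted() returns them as an ordered list.
unique_sorted = sorted(set(a_list))
print(unique_sorted)  # [7, 8, 9, 10]
```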