Count unique values per group with Pandas [duplicate]

python pandas group-by unique pandas-groupby

You need nunique:

    df = df.groupby('domain')['ID'].nunique()
    print(df)

    domain
    'facebook.com'    1
    'google.com'      1
    'twitter.com'     2
    'vk.com'          3
    Name: ID, dtype: int64
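For a self-contained run, the sketch below builds a small sample frame first. The rows are an assumption, chosen only so that the counts agree with the output above, and the result is stored under a new, hypothetical name (`unique_ids_per_domain`) so that `df` is not overwritten.

```python
import pandas as pd

# Hypothetical sample data, chosen to be consistent with the counts shown above.
df = pd.DataFrame({
    "ID": [123, 123, 123, 456, 456, 456, 456, 789, 789],
    "domain": [
        "'vk.com'", "'vk.com'", "'twitter.com'",
        "'vk.com'", "'facebook.com'", "'vk.com'", "'google.com'",
        "'twitter.com'", "'vk.com'",
    ],
})

# Number of distinct IDs seen for each domain (a Series indexed by domain).
unique_ids_per_domain = df.groupby("domain")["ID"].nunique()
print(unique_ids_per_domain)
```

Keeping the original `df` intact makes it easier to try the other approaches on the same data afterwards.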

If you need to strip ' characters:

    df = df.ID.groupby([df.domain.str.strip("'")]).nunique()
    print(df)

    domain
    facebook.com    1
    google.com      1
    twitter.com     2
    vk.com          3
    Name: ID, dtype: int64

Or as Jon Clements commented:

df.groupby(df.domain.str.strip("'"))['ID'].nunique()
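A minimal sketch of the quote-stripping variant, assuming a tiny made-up frame (the rows and the names `clean_domain` and `counts` are illustrative only). Grouping by a Series rather than a column name leaves the original `domain` column unchanged.

```python
import pandas as pd

# Hypothetical rows; the domain values carry literal single quotes.
df = pd.DataFrame({
    "ID": [123, 123, 456, 789],
    "domain": ["'vk.com'", "'twitter.com'", "'vk.com'", "'twitter.com'"],
})

# str.strip("'") removes leading and trailing quote characters only.
clean_domain = df["domain"].str.strip("'")

# Group by the cleaned values; the group key keeps the Series name "domain".
counts = df.groupby(clean_domain)["ID"].nunique()
print(counts)
```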

You can retain the column name like this:

    df = df.groupby(by='domain', as_index=False).agg({'ID': pd.Series.nunique})
    print(df)

        domain  ID
    0       fb   1
    1      ggl   1
    2  twitter   2
    3       vk   3

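The difference is that nunique() on the grouped column returns a Series indexed by domain, whereas agg() with as_index=False returns a DataFrame that keeps domain as a regular column.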
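The return types can be checked directly. This is a small sketch on made-up rows (the data and variable names are assumptions); it also shows `reset_index()` as another way to turn the Series result into a DataFrame.

```python
import pandas as pd

# Hypothetical rows for illustration.
df = pd.DataFrame({
    "ID": [1, 1, 2, 3],
    "domain": ["fb", "vk", "vk", "vk"],
})

# Series: the domain values become the index.
as_series = df.groupby("domain")["ID"].nunique()

# DataFrame: as_index=False keeps domain as an ordinary column.
as_frame = df.groupby("domain", as_index=False).agg({"ID": "nunique"})

# Alternative: convert the Series result into a DataFrame afterwards.
also_frame = as_series.reset_index()

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(list(as_frame.columns))
```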


In general, to count how often each distinct value occurs in a single column, you can use Series.value_counts:

    df.domain.value_counts()
    # 'vk.com'          5
    # 'twitter.com'     2
    # 'facebook.com'    1
    # 'google.com'      1
    # Name: domain, dtype: int64

To see how many unique values a column contains, use Series.nunique:

    df.domain.nunique()
    # 4

To get the distinct values themselves, you can use unique or drop_duplicates. The slight difference between the two is that unique returns a NumPy array while drop_duplicates returns a pandas.Series:

    df.domain.unique()
    # array(["'vk.com'", "'twitter.com'", "'facebook.com'", "'google.com'"], dtype=object)

    df.domain.drop_duplicates()
    # 0          'vk.com'
    # 2     'twitter.com'
    # 4    'facebook.com'
    # 6      'google.com'
    # Name: domain, dtype: object
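The type difference can be confirmed with a short sketch. The values are made up, and `dtype=object` is set explicitly as an assumption so the column behaves like the object column in the output above regardless of the pandas version's default string type.

```python
import numpy as np
import pandas as pd

# Hypothetical values; dtype=object mirrors the object column shown above.
domains = pd.Series(
    ["vk.com", "vk.com", "twitter.com", "vk.com", "facebook.com"],
    name="domain",
    dtype=object,
)

as_array = domains.unique()            # NumPy array, order of first appearance
as_series = domains.drop_duplicates()  # Series, keeps the original index labels

print(type(as_array).__name__)
print(type(as_series).__name__)
print(as_series.index.tolist())
```

Because `drop_duplicates` keeps the index labels of the first occurrences, it is the handier choice when you need to trace values back to their rows.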

As for this specific problem, since you want to count distinct values with respect to another variable, besides the groupby method given in the other answers here, you can also simply drop duplicates first and then call value_counts():

    import pandas as pd

    df.drop_duplicates().domain.value_counts()
    # 'vk.com'          3
    # 'twitter.com'     2
    # 'facebook.com'    1
    # 'google.com'      1
    # Name: domain, dtype: int64
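As a rough check that the two approaches agree, the sketch below runs both on the same sample frame. The rows are an assumption chosen to match the counts above. Note that `drop_duplicates()` without arguments compares all columns, so on a frame with extra columns you would likely want `subset=["ID", "domain"]`, as used here.

```python
import pandas as pd

# Hypothetical sample data, consistent with the counts shown above.
df = pd.DataFrame({
    "ID": [123, 123, 123, 456, 456, 456, 456, 789, 789],
    "domain": [
        "'vk.com'", "'vk.com'", "'twitter.com'",
        "'vk.com'", "'facebook.com'", "'vk.com'", "'google.com'",
        "'twitter.com'", "'vk.com'",
    ],
})

# Approach 1: keep one row per (ID, domain) pair, then count rows per domain.
via_dedup = df.drop_duplicates(subset=["ID", "domain"])["domain"].value_counts()

# Approach 2: count distinct IDs within each domain group.
via_groupby = df.groupby("domain")["ID"].nunique()

# The two results hold the same counts; only the ordering differs.
print(via_dedup.to_dict() == via_groupby.to_dict())
```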


    >>> df.domain.value_counts()
    vk.com          5
    twitter.com     2
    google.com      1
    facebook.com    1
    Name: domain, dtype: int64
