Python: Removing Rows on Count condition

python pandas dataframe indexing counter

Here you go, using filter:

    df.groupby('city').filter(lambda x: len(x) > 3)
    Out[1743]:
      city
    0  NYC
    1  NYC
    2  NYC
    3  NYC

A second solution uses transform:

    # .copy() avoids a SettingWithCopyWarning if you later modify sub_df
    sub_df = df[df.groupby('city').city.transform('count') > 3].copy()
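To check that filter and transform pick the same rows, here is a minimal sketch. The example frame is made up for illustration (the asker's data is not shown), and the names by_filter, group_sizes and by_transform are just placeholders.

```python
import pandas as pd

# Hypothetical example data: NYC appears 5 times, SYD and SEL once each.
df = pd.DataFrame({'city': ['NYC', 'NYC', 'SYD', 'NYC', 'SEL', 'NYC', 'NYC']})

# filter: keep every row of each group that has more than 3 rows
by_filter = df.groupby('city').filter(lambda x: len(x) > 3)

# transform: broadcast each group's count back onto its rows, then mask
group_sizes = df.groupby('city')['city'].transform('count')
by_transform = df[group_sizes > 3].copy()

print(group_sizes.tolist())            # [5, 5, 1, 5, 1, 5, 5]
print(by_filter.equals(by_transform))  # True
```

On large frames the transform version is usually the faster of the two, since it avoids calling a Python lambda once per group.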


This is one way using pd.Series.value_counts.

    counts = df['city'].value_counts()
    res = df[~df['city'].isin(counts[counts < 5].index)]

counts is a pd.Series object. counts < 5 returns a Boolean series. We filter the counts series by the Boolean counts < 5 series (that's what the square brackets achieve). We then take the index of the resultant series to find the cities that occur fewer than 5 times. isin flags the rows of df whose city is one of those, and ~, the negation operator, inverts that mask so only the remaining rows are kept.
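Broken into separate steps, the one-liner might look like the sketch below. The example data and the intermediate names (is_rare, rare_cities, keep_mask) are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical data: NYC appears 5 times, SYD and SEL once each.
df = pd.DataFrame({'city': ['NYC', 'NYC', 'SYD', 'NYC', 'SEL', 'NYC', 'NYC']})

counts = df['city'].value_counts()          # Series: city -> number of rows
is_rare = counts < 5                        # Boolean Series on the same index
rare_cities = counts[is_rare].index         # labels whose count is below 5
keep_mask = ~df['city'].isin(rare_cities)   # True for rows in the other cities
res = df[keep_mask]

print(sorted(rare_cities))    # ['SEL', 'SYD']
print(res['city'].tolist())   # ['NYC', 'NYC', 'NYC', 'NYC', 'NYC']
```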

Remember a series is a mapping between index and value. The index of a series does not necessarily contain unique values, but this is guaranteed with the output of value_counts.
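A quick way to see the difference, as a small sketch with made-up values:

```python
import pandas as pd

# A Series may carry duplicate index labels ...
s = pd.Series([1, 2, 3], index=['a', 'a', 'b'])
print(s.index.is_unique)        # False

# ... but value_counts() yields exactly one label per distinct value.
cities = pd.Series(['NYC', 'SYD', 'NYC'])
counts = cities.value_counts()
print(counts.index.is_unique)   # True
print(counts['NYC'])            # 2
```

That uniqueness is why counts[counts < 5].index can safely be treated as a set of city names.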


I think you're looking for value_counts()

    # Import the great and powerful pandas
    import pandas as pd

    # Create some example data
    df = pd.DataFrame({
        'city': ['NYC', 'NYC', 'SYD', 'NYC', 'SEL', 'NYC', 'NYC']
    })

    # Get the count of each value
    value_counts = df['city'].value_counts()

    # Select the values where the count is 3 or fewer (or use another threshold, such as 5, if you like)
    to_remove = value_counts[value_counts <= 3].index

    # Keep rows where the city column is not in to_remove
    df = df[~df.city.isin(to_remove)]
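If the threshold changes often, the same idea can be wrapped in a small function. This is only a sketch: drop_rare and its parameters are hypothetical names, not part of pandas.

```python
import pandas as pd

def drop_rare(df, column, min_count):
    """Return the rows whose value in `column` occurs at least `min_count` times.

    Hypothetical helper for illustration; not a pandas API.
    """
    counts = df[column].value_counts()
    frequent = counts[counts >= min_count].index
    # .copy() so the result can be modified without a SettingWithCopyWarning
    return df[df[column].isin(frequent)].copy()

df = pd.DataFrame({'city': ['NYC', 'NYC', 'SYD', 'NYC', 'SEL', 'NYC', 'NYC']})
kept = drop_rare(df, 'city', min_count=4)
print(kept['city'].tolist())   # ['NYC', 'NYC', 'NYC', 'NYC', 'NYC']
```

Note that value_counts ignores missing values by default, so rows with NaN in the column are dropped by this version as well.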
