How to delete rows from a pandas DataFrame based on a conditional expression [duplicate]

python pandas

To directly answer this question's original title, "How to delete rows from a pandas DataFrame based on a conditional expression" (which I understand is not necessarily the OP's problem, but could help other users coming across this question), one way to do this is to use the drop method:

df = df.drop(some labels)
df = df.drop(df[<some boolean condition>].index)

Example

To remove all rows where column 'score' is < 50:

df = df.drop(df[df.score < 50].index)

In place version (as pointed out in comments)

df.drop(df[df.score < 50].index, inplace=True)
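For anyone who wants to try this end to end, here is a minimal sketch. The sample data and the names `to_drop`, `kept` and `df_inplace` are made up for illustration; only the `score` column comes from the example above.

```python
import pandas as pd

# Hypothetical sample data; only the 'score' column name comes from the answer.
df = pd.DataFrame({"name": ["a", "b", "c", "d"], "score": [10, 50, 75, 30]})

# Index labels of the rows to remove (scores below 50).
to_drop = df[df.score < 50].index

# drop returns a new DataFrame; df itself is left unchanged.
kept = df.drop(to_drop)

# In-place variant, applied to a copy so both results can be compared.
df_inplace = df.copy()
df_inplace.drop(to_drop, inplace=True)
```

Note that drop works on index labels, so this assumes the index is unique. With duplicate labels, every row sharing a label with a matching row would be dropped as well.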

Multiple conditions

(see Boolean Indexing)

The operators are: | for or, & for and, and ~ for not. Each condition must be grouped using parentheses.

To remove all rows where column 'score' is < 50 and > 20:

df = df.drop(df[(df.score < 50) & (df.score > 20)].index)
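A small sketch of the combined condition, using made-up scores. The names `in_range`, `dropped` and `kept` are only for illustration. It also shows that keeping the negated mask gives the same rows as dropping the matches, assuming a unique index.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"score": [10, 20, 35, 50, 80]})

# Each comparison is parenthesized because & binds more tightly than < and >.
in_range = (df.score < 50) & (df.score > 20)

# Drop the rows where the combined condition holds.
dropped = df.drop(df[in_range].index)

# Same result: keep the rows where the condition does NOT hold.
kept = df[~in_range]
```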


When you do len(df['column name']) you are just getting one number, namely the number of rows in the DataFrame (i.e., the length of the column itself). If you want to apply len to each element in the column, use df['column name'].map(len). So try

df[df['column name'].map(len) < 2]
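To make the difference concrete, here is a short sketch with a made-up string column (the variable names are illustrative). It assumes the column has no missing values, since len fails on NaN; for string columns, `.str.len()` is an alternative that tolerates them.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"column name": ["a", "bc", "", "def"]})

# len on the column gives a single number: the row count.
total_rows = len(df["column name"])

# map(len) applies len to each element and returns a Series of lengths.
per_element = df["column name"].map(len)

# Keep only the rows whose element is shorter than 2 characters.
short = df[per_element < 2]
```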


You can assign the DataFrame to a filtered version of itself:

df = df[df.score > 50]

In the timings below, this is several times faster than drop:

%%timeit
test = pd.DataFrame({'x': np.random.randn(int(1e6))})
test = test[test.x < 0]
# 54.5 ms ± 2.02 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
test = pd.DataFrame({'x': np.random.randn(int(1e6))})
test.drop(test[test.x > 0].index, inplace=True)
# 201 ms ± 17.9 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
test = pd.DataFrame({'x': np.random.randn(int(1e6))})
test = test.drop(test[test.x > 0].index)
# 194 ms ± 7.03 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
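A minimal sketch of the filtering approach outside of a notebook, with made-up data and illustrative names. Two things to watch: the filtered frame keeps the original index labels, and `score > 50` also removes rows where the score is exactly 50, which dropping `score < 50` would keep.

```python
import pandas as pd

# Hypothetical sample data; note the row with score exactly 50.
df = pd.DataFrame({"score": [10, 60, 50, 90]})

# Keep only the rows that satisfy the condition.
filtered = df[df.score > 50]

# The original labels survive filtering; renumber from 0 if that matters.
renumbered = filtered.reset_index(drop=True)
```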
