Understanding inplace=True

When inplace=True is passed, the data is renamed in place (it returns nothing), so you'd use:

df.an_operation(inplace=True)

When inplace=False is passed (this is the default value, so isn't necessary), performs the operation and returns a copy of the object, so you'd use:

df = df.an_operation(inplace=False)

python pandas in-place

In pandas, is inplace = True considered harmful, or not?

TLDR; Yes, yes it is.

inplace, contrary to what the name implies, often does not prevent copies from being created, and (almost) never offers any performance benefits
inplace does not work with method chaining
inplace can lead to SettingWithCopyWarning if used on a DataFrame column, and may prevent the operation from going though, leading to hard-to-debug errors in code

The pain points above are common pitfalls for beginners, so removing this option will simplify the API.

I don't advise setting this parameter as it serves little purpose. See this GitHub issue which proposes the inplace argument be deprecated api-wide.

It is a common misconception that using inplace=True will lead to more efficient or optimized code. In reality, there are absolutely no performance benefits to using inplace=True. Both the in-place and out-of-place versions create a copy of the data anyway, with the in-place version automatically assigning the copy back.

inplace=True is a common pitfall for beginners. For example, it can trigger the SettingWithCopyWarning:

df = pd.DataFrame({'a': [3, 2, 1], 'b': ['x', 'y', 'z']})df2 = df[df['a'] > 1]df2['b'].replace({'x': 'abc'}, inplace=True)# SettingWithCopyWarning: # A value is trying to be set on a copy of a slice from a DataFrame

Calling a function on a DataFrame column with inplace=True may or may not work. This is especially true when chained indexing is involved.

As if the problems described above aren't enough, inplace=True also hinders method chaining. Contrast the working of

result = df.some_function1().reset_index().some_function2()

As opposed to

temp = df.some_function1()temp.reset_index(inplace=True)result = temp.some_function2()

The former lends itself to better code organization and readability.

Another supporting claim is that the API for set_axis was recently changed such that inplace default value was switched from True to False. See GH27600. Great job devs!

python pandas in-place

The way I use it is

# Have to assign back to dataframe (because it is a new copy)df = df.some_operation(inplace=False)

# No need to assign back to dataframe (because it is on the same copy)df.some_operation(inplace=True)

CONCLUSION:

 if inplace is False      Assign to a new variable; else      No need to assign

CodeHunter

Understanding inplace=True

In pandas, is inplace = True considered harmful, or not?

TLDR; Yes, yes it is.

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last