specifying "skip NA" when calculating mean of the column in a data frame created by Pandas

python r pandas na

That's a trick question, since you don't need to do that. Pandas automatically excludes NaN values in aggregation functions. Consider my df:

        b   c   d  e
    a               
    2   2   6   1  3
    2   4   8 NaN  7
    2   4   4   6  3
    3   5 NaN   2  6
    4 NaN NaN   4  1
    5   6   2   1  8
    7   3   2   4  7
    9   6   1 NaN  1
    9 NaN NaN   9  3
    9   3   4   6  1

The internal count() function ignores NaN values, and so does mean(). The only case where we get NaN is when every value in a group is NaN. Then we are taking the mean of an empty set, which turns out to be NaN:

    In[335]: df.groupby('a').mean()
    Out[333]: 
              b    c    d         e
    a                              
    2  3.333333  6.0  3.5  4.333333
    3  5.000000  NaN  2.0  6.000000
    4       NaN  NaN  4.0  1.000000
    5  6.000000  2.0  1.0  8.000000
    7  3.000000  2.0  4.0  7.000000
    9  4.500000  2.5  7.5  1.666667
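For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame by hand and runs the same groupby. The values are transcribed from the printed table, and the variable names `grouped`, `counts` and `means` are just my own choices.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame; column 'a' becomes the index, as in the printout.
df = pd.DataFrame(
    {
        "a": [2, 2, 2, 3, 4, 5, 7, 9, 9, 9],
        "b": [2, 4, 4, 5, np.nan, 6, 3, 6, np.nan, 3],
        "c": [6, 8, 4, np.nan, np.nan, 2, 2, 1, np.nan, 4],
        "d": [1, np.nan, 6, 2, 4, 1, 4, np.nan, 9, 6],
        "e": [3, 7, 3, 6, 1, 8, 7, 1, 3, 1],
    }
).set_index("a")

grouped = df.groupby("a")   # group on the index level named 'a'
counts = grouped.count()    # number of non-NaN values per group and column
means = grouped.mean()      # NaN values are skipped by default

print(counts)
print(means)
```

Group 4 has a count of 0 in columns b and c, which is exactly where the mean comes out as NaN.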

Aggregate functions work in the same way:

    In[340]: df.groupby('a')['b'].agg({'foo': np.mean})
    Out[338]: 
            foo
    a          
    2  3.333333
    3  5.000000
    4       NaN
    5  6.000000
    7  3.000000
    9  4.500000
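A caveat if you try this on a current install: as far as I know, passing a renaming dict like `{'foo': np.mean}` to `agg` on a single grouped column was removed in pandas 1.0 and now raises an error. The sketch below uses named aggregation instead, which I believe is the intended replacement. The frame is a hand-built copy of columns a and b from the example.

```python
import numpy as np
import pandas as pd

# Only the 'a' and 'b' columns of the example frame are needed here.
df = pd.DataFrame(
    {
        "a": [2, 2, 2, 3, 4, 5, 7, 9, 9, 9],
        "b": [2, 4, 4, 5, np.nan, 6, 3, 6, np.nan, 3],
    }
).set_index("a")

# Named aggregation: the keyword 'foo' becomes the output column name.
result = df.groupby("a")["b"].agg(foo="mean")
print(result)
```

NaN handling is unchanged: group 4 still yields NaN because it has no non-NaN values in b.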

Addendum: Note that the standard DataFrame.mean API lets you control whether NaN values are included through its skipna parameter; the default is to exclude them.
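A quick illustration of that switch on a toy frame (`toy` and its columns are made up for this example, not taken from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame with a single missing value in 'x'.
toy = pd.DataFrame({"x": [1.0, 2.0, np.nan], "y": [4.0, 5.0, 6.0]})

default_mean = toy.mean()             # skipna=True: NaN in 'x' is ignored
strict_mean = toy.mean(skipna=False)  # any NaN in a column makes its mean NaN

print(default_mean)
print(strict_mean)
```

With the default, x averages to 1.5 over its two valid values. With `skipna=False`, x becomes NaN while y is unaffected.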


What foobar said is true with regard to the default behavior, but there is a very easy way to specify skipna. Here is an example that speaks for itself:

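    def custom_mean(df):
        # Called once per group with that group's column values
        return df.mean(skipna=False)

    # `group` is an existing GroupBy object, e.g. the result of df.groupby(...)
    group.agg({"your_col_name_to_be_aggregated": custom_mean})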

That's it! You can customize your own aggregation the way you want, and I'd expect this to be fairly efficient, but I did not dig into it.
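To make that concrete, here is a self-contained sketch applying the same idea to columns a and b of the example frame from the first answer. The data is retyped by hand, and `series` and `result` are names I picked.

```python
import numpy as np
import pandas as pd

# Columns 'a' and 'b' of the example frame, retyped by hand.
df = pd.DataFrame(
    {
        "a": [2, 2, 2, 3, 4, 5, 7, 9, 9, 9],
        "b": [2, 4, 4, 5, np.nan, 6, 3, 6, np.nan, 3],
    }
)

def custom_mean(series):
    # Propagate NaN instead of skipping it.
    return series.mean(skipna=False)

result = df.groupby("a").agg({"b": custom_mean})
print(result)
```

Compared with the default mean, group 9 now comes out as NaN, because one of its three b values is missing. Groups without any NaN are unchanged.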

It was also discussed here, but I thought I'd help spread the good news! The answer was found in the official docs.
