Pandas Dataframe: Replacing NaN with row average

As noted in the comments, the axis argument to fillna is not implemented when the fill value is a Series, so the following raises NotImplementedError:

df.fillna(df.mean(axis=1), axis=1)

Note: the axis would be critical here, since by default fillna matches a Series against the column labels, and you don't want to fill your nth column with the nth row average.
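To make that pitfall concrete, here is a minimal sketch. The frame with integer column labels and the names `df_int` and `wrong` are illustrative assumptions, not part of the original answer. Because the row-mean Series is indexed 0, 1, 2 and the columns are also labelled 0, 1, 2, a plain fillna fills the NaN in row 1 with the average of row 2:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with integer column labels 0, 1, 2, so the index of
# the row-mean Series happens to match the column labels.
df_int = pd.DataFrame([[1, 4, 7], [2, 5, np.nan], [3, 6, 9]])

row_means = df_int.mean(axis=1)  # index 0, 1, 2 -> 4.0, 3.5, 6.0

# fillna aligns the Series index with *column* labels, so column 2 is
# filled with row_means[2] rather than with the mean of row 1.
wrong = df_int.fillna(row_means)

print(wrong.iloc[1, 2])  # 6.0, not the row average 3.5
```

With string column labels such as c1, c2, c3 nothing would match at all, and the NaN would simply stay in place.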

For now you'll need to iterate through:

In [11]: m = df.mean(axis=1)
         for i, col in enumerate(df):
             # using i allows for duplicate columns
             # inplace *may* not always work here, so IMO the next line is preferred
             # df.iloc[:, i].fillna(m, inplace=True)
             df.iloc[:, i] = df.iloc[:, i].fillna(m)

In [12]: df
Out[12]:
   c1  c2   c3
0   1   4  7.0
1   2   5  3.5
2   3   6  9.0
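For anyone who wants to run the loop outside an interactive session, the following is a self-contained sketch. The construction of `df` is an assumption reconstructed from the output shown above, since the answer does not include its setup code:

```python
import numpy as np
import pandas as pd

# Example frame assumed from the output shown in the answer.
df = pd.DataFrame({"c1": [1, 2, 3], "c2": [4, 5, 6], "c3": [7, np.nan, 9]})

m = df.mean(axis=1)  # row averages, indexed like the rows of df

for i in range(df.shape[1]):
    # Positional access, so the loop does not depend on column labels.
    # The Series m aligns with each column on the row index.
    df.iloc[:, i] = df.iloc[:, i].fillna(m)

print(df)
```

Each column is a Series indexed by row label, so filling it with `m` picks the average of the matching row.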

An alternative is to fillna the transpose and then transpose back, which may be more efficient:

df.T.fillna(df.mean(axis=1)).T
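A runnable sketch of the transpose approach, again assuming the three-column example frame (the name `filled` is illustrative):

```python
import numpy as np
import pandas as pd

# Example frame assumed to match the one used in the answers.
df = pd.DataFrame({"c1": [1, 2, 3], "c2": [4, 5, 6], "c3": [7, np.nan, 9]})

# After transposing, the original rows become columns, so the default
# column-wise alignment of fillna pairs each row with its own mean.
filled = df.T.fillna(df.mean(axis=1)).T

print(filled)
```

One thing to keep in mind: transposing a frame whose columns have different dtypes upcasts to a common dtype, so here the integer columns come back as floats.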

python pandas dataframe missing-data

As an alternative, you could also use an apply with a lambda expression like this:

df.apply(lambda row: row.fillna(row.mean()), axis=1)

which also yields

    c1   c2   c3
0  1.0  4.0  7.0
1  2.0  5.0  3.5
2  3.0  6.0  9.0
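The row-wise apply also handles rows with more than one missing value, since each row is filled with its own mean. A small sketch, using a hypothetical frame `df2` that is not from the original answers:

```python
import numpy as np
import pandas as pd

# Hypothetical frame where row 1 has two missing values.
df2 = pd.DataFrame({
    "c1": [1, np.nan, 3],
    "c2": [4, 5, 6],
    "c3": [7, np.nan, 9],
})

# row.mean() skips NaN by default, so row 1 averages to 5.0 and both
# of its gaps are filled with that value.
filled = df2.apply(lambda row: row.fillna(row.mean()), axis=1)

print(filled.loc[1].tolist())  # [5.0, 5.0, 5.0]
```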


I'll propose an alternative that involves casting to NumPy arrays. Performance-wise, I think this is more efficient and probably scales better than the other solutions proposed so far.

The idea is to use an indicator matrix (df.isna().values, which is True (1) if the element is N/A and False (0) otherwise) and broadcast-multiply it by the row averages. We end up with a matrix of exactly the same shape as the original df, which contains the row average where the original element was N/A, and 0 otherwise.

We add this matrix to the original df, making sure to fillna with 0 so that, in effect, we have filled the N/A's with the respective row averages.

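import numpy as np
import pandas as pd

# setup code
df = pd.DataFrame()
df['c1'] = [1, 2, 3]
df['c2'] = [4, 5, 6]
df['c3'] = [7, np.nan, 9]

# fillna row-wise: column vector of row averages, shape (n_rows, 1)
row_avgs = df.mean(axis=1).values.reshape(-1, 1)
# zero out the NaNs, then add the row average only where a NaN was
df = df.fillna(0) + df.isna().values * row_avgs
df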

giving

    c1   c2   c3
0  1.0  4.0  7.0
1  2.0  5.0  3.5
2  3.0  6.0  9.0
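A closely related variant, offered here as a sketch rather than as the answer's own method, swaps the multiply-and-add step for `np.where`, which selects the row average where a value is missing and the original value elsewhere. The names `values` and `filled` are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"c1": [1, 2, 3], "c2": [4, 5, 6], "c3": [7, np.nan, 9]})

# Column vector of row averages, shape (n_rows, 1), so that it
# broadcasts across the columns of each row.
row_avgs = df.mean(axis=1).to_numpy().reshape(-1, 1)

# Take the row average where the value is missing, else keep the value.
values = np.where(df.isna().to_numpy(), row_avgs, df.to_numpy())
filled = pd.DataFrame(values, index=df.index, columns=df.columns)

print(filled)
```

This keeps the same broadcasting idea but avoids the intermediate fillna(0), at the cost of rebuilding the DataFrame from a plain array.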
