Randomly insert NA values in a pandas dataframe

python pandas numpy missing-data

Here's a way to clear exactly 10% of cells (or rather, as close to 10% as can be achieved with the existing data frame's size).

import random

import numpy as np

ix = [(row, col) for row in range(df.shape[0]) for col in range(df.shape[1])]
for row, col in random.sample(ix, int(round(.1*len(ix)))):
    df.iat[row, col] = np.nan
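For larger frames, the same "exact fraction" idea can be sketched without building a Python list of every (row, col) pair. This is only a sketch: the helper name clear_exact_fraction, its seed parameter and the demo frame are made up for illustration, and it returns a copy instead of modifying df in place.

```python
import numpy as np
import pandas as pd


def clear_exact_fraction(df, frac=0.1, seed=None):
    """Return a copy of df with round(frac * df.size) cells set to NaN.

    Hypothetical helper, not part of pandas.
    """
    rng = np.random.default_rng(seed)
    n_cells = df.size
    n_clear = int(round(frac * n_cells))
    # Draw distinct flat cell positions, then mark them in a boolean mask.
    flat_positions = rng.choice(n_cells, size=n_clear, replace=False)
    mask = np.zeros(n_cells, dtype=bool)
    mask[flat_positions] = True
    # DataFrame.mask replaces the cells where the condition is True with NaN.
    return df.mask(mask.reshape(df.shape))


# Illustrative data: 40 rows x 5 columns = 200 cells.
demo = pd.DataFrame(np.arange(200, dtype=float).reshape(40, 5), columns=list("abcde"))
cleared = clear_exact_fraction(demo, frac=0.1, seed=0)
n_missing = int(cleared.isna().sum().sum())
print(n_missing)  # 20, i.e. exactly 10% of the 200 cells
```

Because the positions are drawn without replacement, the number of cleared cells is always round(frac * df.size), whatever the seed.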

Here's a way to clear cells independently with a per-cell probability of 10%.

df = df.mask(np.random.random(df.shape) < .1)
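To see what "independently with a per-cell probability" means in practice, here is a small self-contained sketch. The demo frame, the seed and the variable names are assumptions chosen for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # seeded only so the demo is repeatable
demo = pd.DataFrame(rng.normal(size=(1000, 4)), columns=list("abcd"))

# Each cell is cleared independently with probability 0.1.
masked = demo.mask(rng.random(demo.shape) < 0.1)

# The realized share of missing cells is close to 0.1,
# but usually not exactly 0.1.
missing_share = masked.isna().to_numpy().mean()
print(f"share of missing cells: {missing_share:.3f}")
```

Unlike the exact-count approach, the number of NaNs varies from run to run here, which is usually fine when you only need roughly 10% missingness.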

I think you can simply iterate over the data frame's columns and, for each column, assign NaN to the rows selected by the pandas.DataFrame.sample() method.

The code is as follows.

for col in df.columns:
    df.loc[df.sample(frac=0.1).index, col] = np.nan
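A quick way to check what this loop does is to run it on a toy frame and count the NaNs per column. The frame below is invented for the demo, and the sketch assumes the index labels are unique; with duplicate labels, .loc would also blank every other row sharing a sampled label.

```python
import numpy as np
import pandas as pd

# Toy frame: 100 rows, default (unique) RangeIndex, float columns.
demo = pd.DataFrame(np.ones((100, 3)), columns=["x", "y", "z"])

for col in demo.columns:
    # A fresh 10% sample of row labels is drawn for every column,
    # so different columns lose different rows.
    rows = demo.sample(frac=0.1).index
    demo.loc[rows, col] = np.nan

per_column = demo.isna().sum()
print(per_column)  # each column has 10 missing values
```

So this variant gives an exact count per column (10% of the rows each), rather than an exact count over the whole frame.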

To add to @Jaroslav Bezděk's answer, here is a slightly modified version of the code. I am assuming that you want to insert the NaNs only into numeric columns.

# select only numeric columns to apply the missingness to
cols_list = df.select_dtypes('number').columns.tolist()

# randomly remove cases from the dataframe
for col in df[cols_list]:
    df.loc[df.sample(frac=0.05).index, col] = np.nan
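As a sanity check, the numeric-only version can be tried on a small mixed-type frame. The column names and values below are made up, and the numeric columns are deliberately float so that NaN fits without any dtype change.

```python
import numpy as np
import pandas as pd

# Made-up frame with one text column and two float columns.
demo = pd.DataFrame({
    "name": [f"item{i}" for i in range(100)],
    "price": np.linspace(1.0, 100.0, 100),
    "qty": np.arange(100, dtype=float),
})

# Only numeric columns are candidates for missingness.
numeric_cols = demo.select_dtypes("number").columns.tolist()

for col in numeric_cols:
    # 5% of the 100 rows, drawn separately for each numeric column.
    demo.loc[demo.sample(frac=0.05).index, col] = np.nan

missing_per_column = demo.isna().sum()
print(missing_per_column)  # name: 0, price: 5, qty: 5
```

The text column is left untouched because it never appears in numeric_cols.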

Note: if you use pd.np.nan, you get the following warning: <ipython-input-5-e9827aa92133>:9: FutureWarning: The pandas.np module is deprecated and will be removed from pandas in a future version. Import numpy directly instead.
