Pandas replace with default value


You can use map instead of replace because it is faster. Values that are not in the dictionary come back as NaN, so fill them with 3 using fillna and then cast to int with astype:

    df['col'] = df.col.map({'Mr': 0, 'Mrs': 1, 'Miss': 2}).fillna(3).astype(int)
    print(df)

       col
    0    0
    1    2
    2    0
    3    1
    4    3
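The original DataFrame is not shown in this answer, so here is a minimal self-contained sketch of the same map, fillna, astype chain. The sample values (including 'Col.' as the unmapped entry) and the name `mapping` are assumptions, chosen only so that the result matches the output printed above.

```python
import pandas as pd

# Assumed sample data: the last value has no entry in the mapping.
df = pd.DataFrame({'col': ['Mr', 'Miss', 'Mr', 'Mrs', 'Col.']})

mapping = {'Mr': 0, 'Mrs': 1, 'Miss': 2}

# map() yields NaN for keys missing from the dict, which makes the column float;
# fillna(3) supplies the default and astype(int) restores an integer dtype.
df['col'] = df['col'].map(mapping).fillna(3).astype(int)
print(df)
```

Note that fillna also turns values that were already NaN in the input into the default, since after map they cannot be told apart from unmapped values.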

Another solution uses numpy.where with an isin condition:

    import numpy as np

    d = {'Mr': 0, 'Mrs': 1, 'Miss': 2}
    df['col'] = np.where(df.col.isin(d.keys()), df.col.map(d), 3).astype(int)
    print(df)

       col
    0    0
    1    2
    2    0
    3    1
    4    3
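The numpy.where variant can be wrapped in a small helper. This is only a sketch: `encode_with_default` is a hypothetical name, and the sample values are assumed rather than taken from the question.

```python
import numpy as np
import pandas as pd


def encode_with_default(series, mapping, default):
    """Map values through `mapping`; anything not in it becomes `default`."""
    # Boolean mask of the rows whose value is a key of the mapping.
    known = series.isin(list(mapping))
    # Where the mask is True take the mapped value, otherwise the default.
    return np.where(known, series.map(mapping), default).astype(int)


# Assumed sample data: 'Col.' is not a key of the mapping.
df = pd.DataFrame({'col': ['Mr', 'Miss', 'Mr', 'Mrs', 'Col.']})
df['col'] = encode_with_default(df['col'], {'Mr': 0, 'Mrs': 1, 'Miss': 2}, 3)
print(df)
```

Because the mask decides which rows receive the default, the NaN values that map produces for unknown keys never reach the final cast to int.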

Solution with replace:

    d = {'Mr': 0, 'Mrs': 1, 'Miss': 2}
    df['col'] = np.where(df.col.isin(d.keys()), df.col.replace(d), 3)
    print(df)

       col
    0    0
    1    2
    2    0
    3    1
    4    3

Timings:

    df = pd.concat([df]*10000).reset_index(drop=True)
    d = {'Mr': 0, 'Mrs': 1, 'Miss': 2}
    df['col0'] = df.col.map(d).fillna(3).astype(int)
    df['col1'] = np.where(df.col.isin(d.keys()), df.col.replace(d), 3)
    df['col2'] = np.where(df.col.isin(d.keys()), df.col.map(d), 3).astype(int)
    print(df)

    In [447]: %timeit df['col0'] = df.col.map(d).fillna(3).astype(int)
    100 loops, best of 3: 4.93 ms per loop

    In [448]: %timeit df['col1'] = np.where(df.col.isin(d.keys()), df.col.replace(d), 3)
    100 loops, best of 3: 14.3 ms per loop

    In [449]: %timeit df['col2'] = np.where(df.col.isin(d.keys()), df.col.map(d), 3).astype(int)
    100 loops, best of 3: 7.68 ms per loop

    In [450]: %timeit df['col3'] = df.col.map(lambda L: d.get(L, 3))
    10 loops, best of 3: 36.2 ms per loop
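The %timeit magic only works in IPython. A rough equivalent with the standard timeit module might look like the sketch below; the sample data and the names `base`, `big`, `candidates` and `timings` are assumptions, the replace variant is left out, and the absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed sample data, tiled to 50,000 rows as in the timings above.
base = pd.DataFrame({'col': ['Mr', 'Miss', 'Mr', 'Mrs', 'Col.']})
big = pd.concat([base] * 10000).reset_index(drop=True)
d = {'Mr': 0, 'Mrs': 1, 'Miss': 2}

candidates = {
    'map + fillna': lambda: big['col'].map(d).fillna(3).astype(int),
    'np.where + map': lambda: np.where(
        big['col'].isin(list(d)), big['col'].map(d), 3
    ).astype(int),
    'map + dict.get': lambda: big['col'].map(lambda v: d.get(v, 3)),
}

timings = {}
for name, func in candidates.items():
    # Best of 3 repeats of 10 calls each, reported per call in milliseconds.
    best = min(timeit.repeat(func, number=10, repeat=3)) / 10
    timings[name] = best * 1000
    print(f'{name}: {timings[name]:.2f} ms per call')
```

Taking the minimum over the repeats mirrors the "best of 3" figure that %timeit reported in the output above.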


To add to the answer by @jezrael: the most straightforward solution is to use a defaultdict instead of a dict. This is especially useful when you do not want missing values (NaN) to be replaced with your default value, which is what na_action='ignore' takes care of.

    from collections import defaultdict

    df['col'] = df.col.map(defaultdict(lambda: 3, Mr=0, Mrs=1, Miss=2), na_action='ignore')

The first argument of defaultdict is a function that returns the default value.
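To make the effect of na_action='ignore' concrete, here is a small sketch on an assumed Series that contains both an unmapped title ('Dr') and a genuinely missing value. The names `s`, `lookup`, `kept` and `filled` are illustrative only.

```python
from collections import defaultdict

import numpy as np
import pandas as pd

# Assumed sample data: 'Dr' is unmapped, NaN is a real missing value.
s = pd.Series(['Mr', 'Miss', np.nan, 'Dr'], dtype=object)

# The factory lambda supplies 3 for any key that is not in the mapping.
lookup = defaultdict(lambda: 3, Mr=0, Mrs=1, Miss=2)

# With na_action='ignore', NaN is passed through untouched.
kept = s.map(lookup, na_action='ignore')

# Without it, NaN is looked up like any other key and receives the default.
filled = s.map(lookup)

print(kept)
print(filled)
```

One thing to be aware of: a defaultdict stores every missing key it is asked for, so `lookup` grows as new unmapped values are seen.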