Conditional If Statement: If value in row contains string ... set another column equal to string
Assuming you are using pandas, you can use numpy.where, which is a vectorized version of if/else, with the condition constructed by str.contains:
    import numpy as np  # pd.np was deprecated in pandas 1.0 and removed in 2.0

    # Nested np.where: the first matching condition wins, "task" is the fallback
    df['Activity_2'] = np.where(
        df.Activity.str.contains("email"), "email",
        np.where(df.Activity.str.contains("conference"), "conference",
                 np.where(df.Activity.str.contains("call"), "call", "task")))

    df
    #             Activity  Activity_2
    # 0      email personA       email
    # 1  attend conference  conference
    # 2         send email       email
    # 3           call Sam        call
    # 4        random text        task
    # 5        random text        task
    # 6       lwantto call        call
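If the list of keywords grows, the nested calls get hard to read. One possible alternative is numpy.select, which takes a list of conditions and a parallel list of choices and picks the first condition that matches. This is only a sketch: the sample frame is rebuilt from the output shown above, and the `keywords` list and `regex=False` (plain substring matching) are my own additions.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the output shown in the answer above
df = pd.DataFrame({"Activity": [
    "email personA", "attend conference", "send email", "call Sam",
    "random text", "random text", "lwantto call",
]})

# Hypothetical keyword list; order defines priority (first match wins)
keywords = ["email", "conference", "call"]

# One boolean Series per keyword; regex=False treats each keyword as a literal substring
conditions = [df["Activity"].str.contains(kw, regex=False) for kw in keywords]

# np.select picks the choice for the first True condition, else the default
df["Activity_2"] = np.select(conditions, keywords, default="task")
print(df)
```

As with the nested np.where, a row that contains several keywords gets the one that appears first in the list.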
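This also works, though rows that match none of the keywords are left as NaN rather than set to "task" (assuming Activity_2 does not already exist):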
    df.loc[df['Activity'].str.contains('email'), 'Activity_2'] = 'email'
    df.loc[df['Activity'].str.contains('conference'), 'Activity_2'] = 'conference'
    df.loc[df['Activity'].str.contains('call'), 'Activity_2'] = 'call'
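Two details of the .loc approach are easy to miss: there is no fallback value, and a later assignment overwrites an earlier one, so a row containing both "email" and "call" ends up as "call" here, whereas the nested np.where version gives "email". A minimal sketch that seeds the default first and assigns in reverse priority to get the same precedence (the sample rows and the loop are my own illustration, not part of the answer above):

```python
import pandas as pd

# Illustrative data; the third row contains two keywords on purpose
df = pd.DataFrame({"Activity": [
    "email personA", "attend conference", "email about call", "random text",
]})

# Seed the fallback value so unmatched rows are not left as NaN
df["Activity_2"] = "task"

# Assign lowest priority first: later assignments overwrite earlier ones,
# so "email" (assigned last) wins over "call" when both are present
for kw in ["call", "conference", "email"]:
    df.loc[df["Activity"].str.contains(kw, regex=False), "Activity_2"] = kw

print(df)
```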
The np.where solution above misbehaves if the Activity column contains NaN values: str.contains returns NaN for those rows, and np.where can treat that NaN as a match. In that case I recommend the following code, which worked for me:
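    import numpy as np  # pd.np was deprecated in pandas 1.0 and removed in 2.0

    # Replace NaN with the placeholder "0", then label those rows "None".
    # Caveat: any Activity that itself contains the character "0" is also labelled "None".
    temp = df.Activity.fillna("0")
    df['Activity_2'] = np.where(
        temp.str.contains("0"), "None",
        np.where(temp.str.contains("email"), "email",
                 np.where(temp.str.contains("conference"), "conference",
                          np.where(temp.str.contains("call"), "call", "task"))))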
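A variant that avoids the placeholder string, sketched under the assumption that missing rows should still be labelled "None": test for NaN directly with isna() and pass na=False to str.contains so missing values count as "no match". The sample rows, including the one containing a "0", are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data: one missing value and one activity that contains a "0"
df = pd.DataFrame({"Activity": ["email personA", np.nan, "call 0800", "random text"]})

act = df["Activity"]

# isna() flags the missing rows; na=False makes str.contains return False for them
df["Activity_2"] = np.where(
    act.isna(), "None",
    np.where(act.str.contains("email", na=False), "email",
             np.where(act.str.contains("conference", na=False), "conference",
                      np.where(act.str.contains("call", na=False), "call", "task"))))

print(df)
```

With this version "call 0800" is labelled "call", because nothing depends on a sentinel string.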