IF ELSE using Numpy and Pandas IF ELSE using Numpy and Pandas numpy numpy

IF ELSE using Numpy and Pandas


You almost had it, you needed to pass a raw string (prepend with r) to contains so it treats it as a regex:

In [115]:df['NewColumn'] = np.where(df.ICD.str.contains(r'313.2|414.2').astype(int), 1, np.where(((df.ICD.str.contains(r'313.2|414.2').astype(int))&(df['Date1']>df['Date2'])), 2, 0))dfOut[115]:   PatientID      Date1      Date2                ICD  NewColumn0       1234 2010-12-14 2010-12-12  313.2,414.2,228.1          11       3213 2010-08-02 2012-09-05        232.1,221.0          0

You get 1 returned because it short circuits on the first condition because that is met, if you want to get 2 returned then you need to rearrange the order of evaluation:

In [122]:df['NewColumn'] = np.where( (df.ICD.str.contains(r'313.2|414.2').astype(int)) & ( df['Date1'] > df['Date2'] ), 2 ,                            np.where( df.ICD.str.contains(r'313.2|414.2').astype(int), 1, 0 ) )dfOut[122]:   PatientID      Date1      Date2                ICD  NewColumn0       1234 2010-12-14 2010-12-12  313.2,414.2,228.1          21       3213 2010-08-02 2012-09-05        232.1,221.0          0


It is much easier to use the pandas functionality itself. Using numpy to do something that pandas already does is a good way to get unexpected behaviour.

Assuming you want to check for a cell value containing 313.2 only (so 2313.25 returns False).

df['ICD'].astype(str) == '313.2'

returns a Series Object of True or False next to each index entry.

so

 boolean =(df['ICD'].astype(str) == '313.2')| (df['ICD'].astype(str) == '414.2')if(boolean.any()):    #do something    return 1 boolean2 =((df['ICD'].astype(str) == '313.2')| (df['ICD'].astype(str) == '414.2'))&(df['Date1']>df['Date2'])if(boolean2.any()):     return 2

etc

Pandas also has the function isin() which can simplify things further.

The docs are here: http://pandas.pydata.org/pandas-docs/stable/indexing.html

Also, you do not return two because of the order you evaluate the conditional statements.In any circumstance where condition 2 evaluates as true, condition 1 must evaluate to be true also. So as you test condition 1 too, it always returns 1 or passes.

In short, you need to test condition 2 first, as there is no circumstance where 1 can be false and 2 can be true.