Update row values where certain condition is met in pandas
I think you can use loc if you need to update two columns to the same value:
    df1.loc[df1['stream'] == 2, ['feat','another_feat']] = 'aaaa'
    print(df1)
       stream        feat another_feat
    a       1  some_value   some_value
    b       2        aaaa         aaaa
    c       2        aaaa         aaaa
    d       3  some_value   some_value
If you need to update the columns separately, one option is:
    df1.loc[df1['stream'] == 2, 'feat'] = 10
    print(df1)
       stream        feat another_feat
    a       1  some_value   some_value
    b       2          10   some_value
    c       2          10   some_value
    d       3  some_value   some_value
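The snippets above assume df1 already exists. Here is a minimal self-contained sketch of both loc updates. The construction of df1 is my reconstruction from the printed output, not something the original shows, and the second update writes the string 'bbbb' instead of 10 so the column keeps a single type:

```python
import pandas as pd

# df1 rebuilt from the printed output (an assumption, not the asker's code)
df1 = pd.DataFrame(
    {
        "stream": [1, 2, 2, 3],
        "feat": ["some_value"] * 4,
        "another_feat": ["some_value"] * 4,
    },
    index=list("abcd"),
)

# The boolean mask picks the rows; the column list picks what to overwrite
mask = df1["stream"] == 2
df1.loc[mask, ["feat", "another_feat"]] = "aaaa"

# A single column label restricts the update to that column
df1.loc[mask, "feat"] = "bbbb"

print(df1)
```

Rows a and d are left untouched because the mask is False for them.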
Another common option is to use numpy.where:
    df1['feat'] = np.where(df1['stream'] == 2, 10, 20)
    print(df1)
       stream  feat another_feat
    a       1    20   some_value
    b       2    10   some_value
    c       2    10   some_value
    d       3    20   some_value
EDIT: If you need to divide all columns except stream in the rows where the condition is True, use:
    print(df1)
       stream  feat  another_feat
    a       1     4             5
    b       2     4             5
    c       2     2             9
    d       3     1             7

    # select all columns except stream
    cols = [col for col in df1.columns if col != 'stream']
    print(cols)
    ['feat', 'another_feat']

    df1.loc[df1['stream'] == 2, cols] = df1 / 2
    print(df1)
       stream  feat  another_feat
    a       1   4.0           5.0
    b       2   2.0           2.5
    c       2   1.0           4.5
    d       3   1.0           7.0
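The assignment above divides the whole frame and relies on index alignment to keep only the matching block. A narrower variant, sketched below, divides just the selected rows and columns. The up-front cast to float is my own addition, on the assumption that newer pandas versions may complain when fractional values are written into integer columns:

```python
import pandas as pd

# Same sample data as in the printed output above
df1 = pd.DataFrame(
    {"stream": [1, 2, 2, 3], "feat": [4, 4, 2, 1], "another_feat": [5, 5, 9, 7]},
    index=list("abcd"),
)

cols = [col for col in df1.columns if col != "stream"]
mask = df1["stream"] == 2

# Cast the target columns to float first so halved values fit the dtype
df1 = df1.astype({col: float for col in cols})

# Divide only the selected rows and columns
df1.loc[mask, cols] = df1.loc[mask, cols] / 2

print(df1)
```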
When working with multiple conditions, you can nest several numpy.where calls or use numpy.select:
    df0 = pd.DataFrame({'Col':[5,0,-6]})

    df0['New Col1'] = np.where((df0['Col'] > 0), 'Increasing',
                               np.where((df0['Col'] < 0), 'Decreasing', 'No Change'))

    df0['New Col2'] = np.select([df0['Col'] > 0, df0['Col'] < 0],
                                ['Increasing', 'Decreasing'],
                                default='No Change')

    print(df0)
       Col    New Col1    New Col2
    0    5  Increasing  Increasing
    1    0   No Change   No Change
    2   -6  Decreasing  Decreasing
You can do the same with .ix, like this:
    In [1]: df = pd.DataFrame(np.random.randn(5,4), columns=list('abcd'))

    In [2]: df
    Out[2]:
              a         b         c         d
    0 -0.323772  0.839542  0.173414 -1.341793
    1 -1.001287  0.676910  0.465536  0.229544
    2  0.963484 -0.905302 -0.435821  1.934512
    3  0.266113 -0.034305 -0.110272 -0.720599
    4 -0.522134 -0.913792  1.862832  0.314315

    In [3]: df.ix[df.a>0, ['b','c']] = 0

    In [4]: df
    Out[4]:
              a         b         c         d
    0 -0.323772  0.839542  0.173414 -1.341793
    1 -1.001287  0.676910  0.465536  0.229544
    2  0.963484  0.000000  0.000000  1.934512
    3  0.266113  0.000000  0.000000 -0.720599
    4 -0.522134 -0.913792  1.862832  0.314315
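Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so the session above only runs on older versions. For a boolean row mask plus column labels, .loc should behave the same way. A minimal sketch, using hypothetical fixed values instead of random data so the result is reproducible:

```python
import pandas as pd

# Hypothetical fixed values standing in for the random frame above
df = pd.DataFrame(
    {
        "a": [-0.3, -1.0, 0.9, 0.2, -0.5],
        "b": [0.8, 0.6, -0.9, -0.1, -0.9],
        "c": [0.1, 0.4, -0.4, -0.1, 1.8],
        "d": [-1.3, 0.2, 1.9, -0.7, 0.3],
    }
)

# Zero out b and c in the rows where a is positive (rows 2 and 3 here)
df.loc[df["a"] > 0, ["b", "c"]] = 0

print(df)
```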
EDIT
After the extra information, the following will return all columns except a, for the rows where some condition is met, with halved values:
    >> condition = df.a > 0
    >> df[condition][[i for i in df.columns.values if i not in ['a']]].apply(lambda x: x/2)
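The expression above returns a new frame and leaves df unchanged. The same selection can probably be written as a single .loc call instead of two chained indexing steps. A sketch under that assumption, where the sample values and the name halved are hypothetical:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame(
    {"a": [-1.0, 2.0, 3.0], "b": [4.0, 6.0, 8.0], "c": [1.0, 3.0, 5.0]}
)

condition = df["a"] > 0

# Rows where the condition holds, every column except a, halved
halved = df.loc[condition, df.columns != "a"] / 2

print(halved)
```

To write the halved values back into df, assign the result to the same .loc selection.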