I want to multiply two columns in a pandas DataFrame and add the result into a new column


I think an elegant solution is to use the where method (also see the API docs):

    In [37]: values = df.Prices * df.Amount

    In [38]: df['Values'] = values.where(df.Action == 'Sell', other=-values)

    In [39]: df
    Out[39]:
       Prices  Amount Action  Values
    0       3      57   Sell     171
    1      89      42   Sell    3738
    2      45      70    Buy   -3150
    3       6      43   Sell     258
    4      60      47   Sell    2820
    5      19      16    Buy    -304
    6      56      89   Sell    4984
    7       3      28    Buy     -84
    8      56      69   Sell    3864
    9      90      49    Buy   -4410

Furthermore, this should be the fastest solution, since it works on whole columns at once rather than row by row.
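For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample output in this answer; the construction code itself is my own assumption about how the data was created.

```python
import pandas as pd

# Sample data copied from the output shown in this answer.
df = pd.DataFrame({
    "Prices": [3, 89, 45, 6, 60, 19, 56, 3, 56, 90],
    "Amount": [57, 42, 70, 43, 47, 16, 89, 28, 69, 49],
    "Action": ["Sell", "Sell", "Buy", "Sell", "Sell",
               "Buy", "Sell", "Buy", "Sell", "Buy"],
})

# Unsigned value of every order, computed column-wise.
values = df["Prices"] * df["Amount"]

# Series.where keeps each entry where the condition is True
# and takes the matching entry from `other` where it is False.
df["Values"] = values.where(df["Action"] == "Sell", other=-values)

print(df)
```

Note that where keeps the original value when the condition holds, so the "Sell" rows stay positive and the "Buy" rows pick up the negated value.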


You can use the DataFrame apply method:

    order_df['Value'] = order_df.apply(lambda row: (row['Prices']*row['Amount']
                                                   if row['Action']=='Sell'
                                                   else -row['Prices']*row['Amount']),
                                       axis=1)

It is usually faster to use these methods than to iterate with for loops.
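If the lambda feels cramped, the same row-wise idea can be written with a named function. This is only a sketch: signed_value is a hypothetical helper name of mine, and the three rows are taken from the sample data elsewhere in this thread.

```python
import pandas as pd

# First three rows of the sample data used in this thread.
order_df = pd.DataFrame({
    "Prices": [3, 89, 45],
    "Amount": [57, 42, 70],
    "Action": ["Sell", "Sell", "Buy"],
})

def signed_value(row):
    """Return Prices * Amount, negated for anything other than 'Sell'."""
    value = row["Prices"] * row["Amount"]
    return value if row["Action"] == "Sell" else -value

# axis=1 passes each row to the function as a Series indexed by column name.
order_df["Value"] = order_df.apply(signed_value, axis=1)

print(order_df)
```

With axis=1 the function is called once per row in Python, so it is convenient for arbitrary logic but does not get the speed of column-wise arithmetic.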


If we're willing to sacrifice the succinctness of Hayden's solution, we could also do something like this:

    In [22]: orders_df['C'] = orders_df.Action.apply(
                   lambda x: (1 if x == 'Sell' else -1))

    In [23]: orders_df   # New column C represents the sign of the transaction
    Out[23]:
       Prices  Amount Action  C
    0       3      57   Sell  1
    1      89      42   Sell  1
    2      45      70    Buy -1
    3       6      43   Sell  1
    4      60      47   Sell  1
    5      19      16    Buy -1
    6      56      89   Sell  1
    7       3      28    Buy -1
    8      56      69   Sell  1
    9      90      49    Buy -1

Now we have eliminated the need for the if statement in the final computation. Using Series.apply(), we also do away with the for loop. As Hayden noted, vectorized operations are generally faster.

    In [24]: orders_df['Value'] = orders_df.Prices * orders_df.Amount * orders_df.C

    In [25]: orders_df   # The resulting dataframe
    Out[25]:
       Prices  Amount Action  C  Value
    0       3      57   Sell  1    171
    1      89      42   Sell  1   3738
    2      45      70    Buy -1  -3150
    3       6      43   Sell  1    258
    4      60      47   Sell  1   2820
    5      19      16    Buy -1   -304
    6      56      89   Sell  1   4984
    7       3      28    Buy -1    -84
    8      56      69   Sell  1   3864
    9      90      49    Buy -1  -4410

This solution takes two lines of code instead of one, but is a bit easier to read. I suspect that the computational costs are similar as well.
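Since none of the answers here include measurements, one way to check the speed claims rather than guess is to time the three approaches on a larger frame. The sketch below is only that: the random data, the frame size, and the function names with_where, with_row_apply and with_sign_column are all my own assumptions, and the timings will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data (an assumption, not from the thread): 10,000 random orders.
rng = np.random.default_rng(0)
n = 10_000
orders_df = pd.DataFrame({
    "Prices": rng.integers(1, 100, size=n),
    "Amount": rng.integers(1, 100, size=n),
    "Action": rng.choice(["Buy", "Sell"], size=n),
})

def with_where(df):
    # Column-wise product, then flip the sign of non-"Sell" rows.
    values = df["Prices"] * df["Amount"]
    return values.where(df["Action"] == "Sell", other=-values)

def with_row_apply(df):
    # Row-by-row version using DataFrame.apply with axis=1.
    return df.apply(
        lambda row: row["Prices"] * row["Amount"]
        if row["Action"] == "Sell"
        else -row["Prices"] * row["Amount"],
        axis=1,
    )

def with_sign_column(df):
    # Build a +1/-1 sign per row, then multiply whole columns.
    sign = df["Action"].apply(lambda x: 1 if x == "Sell" else -1)
    return df["Prices"] * df["Amount"] * sign

# All three approaches should agree before comparing their speed.
expected = with_where(orders_df)
assert (with_row_apply(orders_df) == expected).all()
assert (with_sign_column(orders_df) == expected).all()

# Average wall-clock time per call over a few repetitions.
timings = {}
for fn in (with_where, with_row_apply, with_sign_column):
    timings[fn.__name__] = timeit.timeit(lambda: fn(orders_df), number=3) / 3
    print(f"{fn.__name__}: {timings[fn.__name__] * 1000:.2f} ms per call")
```

I would expect the row-wise apply to come out clearly slowest, because it calls a Python function once per row, but it is worth running this on your own data before drawing conclusions.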