I want to multiply two columns in a pandas DataFrame and add the result into a new column
I think an elegant solution is to use the where method (also see the API docs):
In [37]: values = df.Prices * df.Amount

In [38]: df['Values'] = values.where(df.Action == 'Sell', other=-values)

In [39]: df
Out[39]:
   Prices  Amount Action  Values
0       3      57   Sell     171
1      89      42   Sell    3738
2      45      70    Buy   -3150
3       6      43   Sell     258
4      60      47   Sell    2820
5      19      16    Buy    -304
6      56      89   Sell    4984
7       3      28    Buy     -84
8      56      69   Sell    3864
9      90      49    Buy   -4410
Furthermore, this should be the fastest solution, since both the multiplication and the sign flip are vectorized.
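For anyone who wants to try this without the original data at hand, here is a minimal self-contained sketch. The DataFrame is rebuilt from the values shown in the output above; the bracket-style column access is just a stylistic choice, not something the answer requires.

```python
import pandas as pd

# Sample data copied from the output shown in this answer.
df = pd.DataFrame({
    "Prices": [3, 89, 45, 6, 60, 19, 56, 3, 56, 90],
    "Amount": [57, 42, 70, 43, 47, 16, 89, 28, 69, 49],
    "Action": ["Sell", "Sell", "Buy", "Sell", "Sell",
               "Buy", "Sell", "Buy", "Sell", "Buy"],
})

# Element-wise product of the two columns.
values = df["Prices"] * df["Amount"]

# where() keeps the value wherever the condition is True and
# substitutes `other` everywhere else, so every non-"Sell" row is negated.
df["Values"] = values.where(df["Action"] == "Sell", other=-values)

print(df)
```

Note that every row whose Action is not 'Sell' gets a negative value, so this assumes the column holds only 'Sell' and 'Buy'.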
You can use the DataFrame apply method:
order_df['Value'] = order_df.apply(
    lambda row: (row['Prices'] * row['Amount']
                 if row['Action'] == 'Sell'
                 else -row['Prices'] * row['Amount']),
    axis=1)
It is usually faster to use these methods than to iterate over the rows with a for loop.
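If you want to check the speed claim on your own data, something like the following rough sketch may help. The helper names signed_value, with_loop and with_apply are made up for this illustration, and the four-row frame is only a stand-in; timings on such a small frame say little, so treat the printed numbers as a starting point rather than a benchmark.

```python
import timeit

import pandas as pd

# Small stand-in frame; replace it with your own data for meaningful timings.
orders = pd.DataFrame({
    "Prices": [3, 89, 45, 6],
    "Amount": [57, 42, 70, 43],
    "Action": ["Sell", "Sell", "Buy", "Sell"],
})


def signed_value(row):
    """Hypothetical helper: positive for 'Sell', negative otherwise."""
    value = row["Prices"] * row["Amount"]
    return value if row["Action"] == "Sell" else -value


def with_loop(frame):
    # Explicit Python-level loop over the rows.
    return [signed_value(row) for _, row in frame.iterrows()]


def with_apply(frame):
    # Same per-row function, driven by DataFrame.apply.
    return frame.apply(signed_value, axis=1).tolist()


loop_result = with_loop(orders)
apply_result = with_apply(orders)

# Both approaches should agree before their timings are compared.
assert loop_result == apply_result

loop_time = timeit.timeit(lambda: with_loop(orders), number=100)
apply_time = timeit.timeit(lambda: with_apply(orders), number=100)
print(f"for loop: {loop_time:.4f}s  apply: {apply_time:.4f}s")
```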
If we're willing to sacrifice the succinctness of Hayden's solution, we could also do something like this:
In [22]: orders_df['C'] = orders_df.Action.apply(
   ....:     lambda x: (1 if x == 'Sell' else -1))

In [23]: orders_df   # New column C represents the sign of the transaction
Out[23]:
   Prices  Amount Action  C
0       3      57   Sell  1
1      89      42   Sell  1
2      45      70    Buy -1
3       6      43   Sell  1
4      60      47   Sell  1
5      19      16    Buy -1
6      56      89   Sell  1
7       3      28    Buy -1
8      56      69   Sell  1
9      90      49    Buy -1
Now we have eliminated the need for the if statement in the final calculation. By using apply() on the Action column, we also do away with the for loop. As Hayden noted, vectorized operations are generally faster.
In [24]: orders_df['Value'] = orders_df.Prices * orders_df.Amount * orders_df.C

In [25]: orders_df   # The resulting dataframe
Out[25]:
   Prices  Amount Action  C  Value
0       3      57   Sell  1    171
1      89      42   Sell  1   3738
2      45      70    Buy -1  -3150
3       6      43   Sell  1    258
4      60      47   Sell  1   2820
5      19      16    Buy -1   -304
6      56      89   Sell  1   4984
7       3      28    Buy -1    -84
8      56      69   Sell  1   3864
9      90      49    Buy -1  -4410
This solution takes two lines of code instead of one, but is a bit easier to read. I suspect that the computational costs are similar as well.
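As a variation on the same two-step idea, the sign column could also be built with a dictionary lookup through Series.map instead of apply with a lambda. This is a swapped-in technique rather than what the answer above uses, and the sketch assumes Action contains only 'Sell' and 'Buy'; any other label would map to NaN.

```python
import pandas as pd

# Sample data copied from the output shown in this answer.
orders_df = pd.DataFrame({
    "Prices": [3, 89, 45, 6, 60, 19, 56, 3, 56, 90],
    "Amount": [57, 42, 70, 43, 47, 16, 89, 28, 69, 49],
    "Action": ["Sell", "Sell", "Buy", "Sell", "Sell",
               "Buy", "Sell", "Buy", "Sell", "Buy"],
})

# Step 1: translate each action into its sign via a lookup table.
# Assumption: only these two labels occur; anything else becomes NaN.
orders_df["C"] = orders_df["Action"].map({"Sell": 1, "Buy": -1})

# Step 2: plain vectorized multiplication of the three columns.
orders_df["Value"] = orders_df["Prices"] * orders_df["Amount"] * orders_df["C"]

print(orders_df)
```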