How to apply a function to two columns of Pandas dataframe


Here's an example using apply on the dataframe, which I am calling with axis = 1.

The difference is that, instead of trying to pass two values to the function f, you rewrite f to accept a single pandas Series object (one row) and then index that Series to get the values needed.

In [49]: df
Out[49]:
          0         1
0  1.000000  0.000000
1 -0.494375  0.570994
2  1.000000  0.000000
3  1.876360 -0.229738
4  1.000000  0.000000

In [50]: def f(x):
   ....:     return x[0] + x[1]
   ....:

In [51]: df.apply(f, axis=1)  # passes a Series object, row-wise
Out[51]:
0    1.000000
1    0.076619
2    1.000000
3    1.646622
4    1.000000

Depending on your use case, it is sometimes helpful to create a pandas group object, and then use apply on the group.
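
As a rough sketch of the group-based approach, assuming a hypothetical grouping column named key alongside col_1 and col_2 (none of these names or values come from the original question), the function receives a sub-DataFrame per group and can read both columns from it:

```python
import pandas as pd

# Hypothetical data: 'key', 'col_1' and 'col_2' are illustrative names only.
df = pd.DataFrame({
    'key': ['a', 'a', 'b', 'b'],
    'col_1': [1, 2, 3, 4],
    'col_2': [10, 20, 30, 40],
})

def f(group):
    # 'group' is a sub-DataFrame holding the rows for one key,
    # so both columns are available at once.
    return (group['col_1'] * group['col_2']).sum()

# Select the two columns before apply so the grouping column is not passed to f.
result = df.groupby('key')[['col_1', 'col_2']].apply(f)
print(result)
```

Here result is a Series indexed by key, with one value per group.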


There is a clean, one-line way of doing this in Pandas:

df['col_3'] = df.apply(lambda x: f(x.col_1, x.col_2), axis=1)

This allows f to be a user-defined function with multiple input values, and uses (safe) column names rather than (unsafe) numeric indices to access the columns.

Example with data (based on original question):

import pandas as pd

df = pd.DataFrame({'ID':['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2':[1, 4, 5]})
mylist = ['a', 'b', 'c', 'd', 'e', 'f']

def get_sublist(sta,end):
    return mylist[sta:end+1]

df['col_3'] = df.apply(lambda x: get_sublist(x.col_1, x.col_2), axis=1)

Output of print(df):

  ID  col_1  col_2      col_3
0  1      0      1     [a, b]
1  2      2      4  [c, d, e]
2  3      3      5  [d, e, f]

If your column names contain spaces or share a name with an existing dataframe attribute, you can index with square brackets:

df['col_3'] = df.apply(lambda x: f(x['col 1'], x['col 2']), axis=1)
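
To see why the bracket form matters, here is a small sketch with made-up data in which one column is named count (which clashes with the Series.count method) and another contains a space. The function f and the values are assumptions for illustration only:

```python
import pandas as pd

# Hypothetical data: 'count' collides with Series.count, 'col 2' has a space.
df = pd.DataFrame({'count': [1, 2, 3], 'col 2': [10, 20, 30]})

def f(a, b):
    return a + b

# x.count would refer to the Series.count method rather than the 'count'
# column, and 'col 2' cannot be written as an attribute at all,
# so both values are read with square brackets.
df['col_3'] = df.apply(lambda x: f(x['count'], x['col 2']), axis=1)
print(df)
```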


A simple solution is:

df['col_3'] = df[['col_1','col_2']].apply(lambda x: f(*x), axis=1)
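
A minimal sketch of this unpacking form, reusing the sample data and get_sublist function from the example earlier on this page (substituted here for f), might look like this:

```python
import pandas as pd

df = pd.DataFrame({'ID': ['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})
mylist = ['a', 'b', 'c', 'd', 'e', 'f']

def get_sublist(sta, end):
    return mylist[sta:end + 1]

# Selecting the two columns first means each row Series holds exactly two
# values, which *x unpacks into the positional arguments sta and end.
df['col_3'] = df[['col_1', 'col_2']].apply(lambda x: get_sublist(*x), axis=1)
print(df)
```

Note that the arguments are passed by position, so the order of the names in df[['col_1','col_2']] determines which column becomes the first argument.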

