pandas apply function that returns multiple values to rows in pandas dataframe


Return a Series from the function and pandas will assemble the results into a DataFrame.

    def myfunc(a, b, c):
        # do something
        return pd.Series([e, f, g])

This has the bonus that you can label each of the resulting columns through the index of the returned Series. If you return a DataFrame instead, it just inserts multiple rows for the group.
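As a minimal sketch of the labelling point, assuming a made-up DataFrame with columns a, b and c and a hypothetical row function (none of these names come from the question), giving the returned Series an index names the resulting columns:

```python
import pandas as pd

# Hypothetical sample data; the column names a, b, c are assumptions.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

def row_stats(row):
    # Illustrative computation only: return three labelled values per row.
    return pd.Series(
        [row["a"] + row["b"], row["b"] * row["c"], row["a"] - row["c"]],
        index=["sum_ab", "prod_bc", "diff_ac"],
    )

# Each returned Series becomes one row; its index becomes the column names.
result = df.apply(row_stats, axis=1)

# Optionally attach the new columns to the original frame.
labelled = pd.concat([df, result], axis=1)
```

Here result has the columns sum_ab, prod_bc and diff_ac, so no renaming step is needed afterwards.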


Based on the excellent answer by @U2EF1, I've created a handy function that applies a specified function that returns tuples to a dataframe field, and expands the result back to the dataframe.

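    import pandas as pd

    def apply_and_concat(dataframe, field, func, column_names):
        # Apply func to every cell of dataframe[field], expand each returned
        # tuple into labelled columns, and append them to the original frame.
        return pd.concat((
            dataframe,
            dataframe[field].apply(
                lambda cell: pd.Series(func(cell), index=column_names))), axis=1)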

Usage:

    df = pd.DataFrame([1, 2, 3], index=['a', 'b', 'c'], columns=['A'])
    print(df)
    #    A
    # a  1
    # b  2
    # c  3

    def func(x):
        return x*x, x*x*x

    print(apply_and_concat(df, 'A', func, ['x^2', 'x^3']))
    #    A  x^2  x^3
    # a  1    1    1
    # b  2    4    8
    # c  3    9   27

Hope it helps someone.


I tried returning a tuple (I was using functions such as scipy.stats.pearsonr, which return that kind of structure), but apply returned a one-dimensional Series instead of the DataFrame I expected. Building a Series manually inside the function was slower, so I fixed it with the result_type argument, as explained in the official API documentation:

Returning a Series inside the function is similar to passing result_type='expand'. The resulting column names will be the Series index.

So you could edit your code this way:

    def myfunc(row):
        # With axis=1, apply passes each row to the function as a single Series.
        # do something that computes e, f, g
        return (e, f, g)

    df.apply(myfunc, axis=1, result_type='expand')
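For comparison, here is a small runnable sketch of the same idea. The sample data and the arithmetic inside myfunc are placeholders I made up for illustration. It shows what apply gives back with and without result_type='expand':

```python
import pandas as pd

# Hypothetical sample data; the column names a, b, c are assumptions.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

def myfunc(row):
    # Placeholder computation standing in for "do something".
    e = row["a"] + row["b"]
    f = row["b"] + row["c"]
    g = row["a"] * row["c"]
    return (e, f, g)

# Without result_type, each tuple stays one object: a Series of tuples.
as_series = df.apply(myfunc, axis=1)

# With result_type='expand', each tuple element gets its own column (0, 1, 2).
expanded = df.apply(myfunc, axis=1, result_type="expand")

# Tuples carry no labels, so name the columns afterwards if needed.
expanded.columns = ["e", "f", "g"]
```

The expanded columns are numbered 0, 1, 2 by default, which is why the last line assigns names explicitly.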