In Python Pandas, how to do something like R dplyr's mutate_each

With pandas, this can be accomplished, though it takes a few more steps.

First, let's prepare the data:

    import pandas as pd
    import numpy as np
    from sklearn.datasets import load_iris

    iris_data = load_iris()
    # shorten the names, e.g. 'sepal length (cm)' -> 'sepl'
    iris = pd.DataFrame(iris_data.data,
                        columns=[c[0:3] + c[6] for c in iris_data.feature_names])
    iris['Species'] = iris_data.target_names[iris_data.target]

Now we can imitate the mutate_each pipeline:

    # calculate the aggregates
    # (the columns were shortened to 'sepl', 'sepw', ... above, so match on 'sep')
    sepal_cols = iris.columns[iris.columns.str.startswith('sep')]
    pivot = iris.groupby("Species")[sepal_cols].aggregate(['min', 'max', 'mean'])
    # name the aggregates, e.g. ('sepl', 'min') -> 'seplmin'
    pivot.columns = pivot.columns.get_level_values(0) + pivot.columns.get_level_values(1)
    # merge aggregates with the original dataframe
    new_iris = iris.merge(pivot, left_on='Species', right_index=True)

Here pivot is a small table with one row per species:

                seplmin  seplmax  seplmean  sepwmin  sepwmax  sepwmean
    Species
    setosa          4.3      5.8     5.006      2.3      4.4     3.418
    versicolor      4.9      7.0     5.936      2.0      3.4     2.770
    virginica       4.9      7.9     6.588      2.2      3.8     2.974

new_iris is then a 150x11 table that combines all columns from iris and pivot, equivalent to what the dplyr pipeline produces.
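As a possible alternative to the aggregate-then-merge steps above, groupby(...).transform returns per-group statistics already aligned to the original rows, which is closer in spirit to a grouped mutate. The following is only a minimal sketch: it uses four iris rows (values copied from the output shown in this thread) instead of the full dataset so that it runs without scikit-learn, and it assumes the same shortened column names ('sepl', 'sepw', ...) as above.

```python
import pandas as pd

# Four iris rows for illustration only, with the shortened column names
# used earlier in this answer.
iris = pd.DataFrame({
    "sepl": [5.1, 4.9, 6.7, 6.3],
    "sepw": [3.5, 3.0, 3.0, 2.5],
    "petl": [1.4, 1.4, 5.2, 5.0],
    "petw": [0.2, 0.2, 2.3, 1.9],
    "Species": ["setosa", "setosa", "virginica", "virginica"],
})

# Select the sepal columns and group them by species.
sepal_cols = [c for c in iris.columns if c.startswith("sep")]
grouped = iris.groupby("Species")[sepal_cols]

new_iris = iris.copy()
for func in ["min", "max", "mean"]:
    # transform broadcasts each group's statistic back onto its rows,
    # so the result has the same index as iris and can be joined directly.
    stats = grouped.transform(func).add_suffix(func)
    new_iris = new_iris.join(stats)

print(new_iris)
```

The new columns come out grouped by function (seplmin, sepwmin, seplmax, ...) rather than by variable, so reorder them afterwards if the layout matters.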


mutate_each is superseded by mutate and across.

You can try this in Python:

    >>> from datar.all import f, group_by, starts_with, mutate, across, max, min, mean
    >>> from datar.datasets import iris
    >>>
    >>> iris >> \
    ...    group_by(f.Species) >> \
    ...    mutate(across(starts_with("Sepal"), [min, max, mean]))
         Sepal_Length  Sepal_Width  Petal_Length  Petal_Width    Species  Sepal_Length_1  Sepal_Length_2  Sepal_Length_3  Sepal_Width_1  Sepal_Width_2  Sepal_Width_3
            <float64>    <float64>     <float64>    <float64>   <object>       <float64>       <float64>       <float64>      <float64>      <float64>      <float64>
    0             5.1          3.5           1.4          0.2     setosa             4.3             5.8           5.006            2.3            4.4          3.428
    1             4.9          3.0           1.4          0.2     setosa             4.3             5.8           5.006            2.3            4.4          3.428
    2             4.7          3.2           1.3          0.2     setosa             4.3             5.8           5.006            2.3            4.4          3.428
    3             4.6          3.1           1.5          0.2     setosa             4.3             5.8           5.006            2.3            4.4          3.428
    ..            ...          ...           ...          ...        ...             ...             ...             ...            ...            ...            ...
    4             5.0          3.6           1.4          0.2     setosa             4.3             5.8           5.006            2.3            4.4          3.428
    145           6.7          3.0           5.2          2.3  virginica             4.9             7.9           6.588            2.2            3.8          2.974
    146           6.3          2.5           5.0          1.9  virginica             4.9             7.9           6.588            2.2            3.8          2.974
    147           6.5          3.0           5.2          2.0  virginica             4.9             7.9           6.588            2.2            3.8          2.974
    148           6.2          3.4           5.4          2.3  virginica             4.9             7.9           6.588            2.2            3.8          2.974
    149           5.9          3.0           5.1          1.8  virginica             4.9             7.9           6.588            2.2            3.8          2.974

    [Groups: Species (n=3)]
    [150 rows x 11 columns]

I am the author of the datar package.
