Create complicated conditional column (geometric mean) Python

python pandas

This calculates the geometric mean of EnteroCount for each site and checks if it is greater than 30 (despite the column name, True means the threshold is exceeded):

>>> df['geo_mean_acceptable'] = (
...     df.groupby('Site').EnteroCount
...       .transform(lambda group: group.prod() ** (1 / float(len(group))) > 30)
...       .astype(bool))

And this gets the geometric mean of each site:

>>> df.groupby('Site').EnteroCount.apply(lambda group: group.product() ** (1 / float(len(group))))
Site
A     68.016702
B    121.981006
C    180.000000
Name: EnteroCount, dtype: float64
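The two snippets above differ mainly in the shape of their result: apply returns one value per site, while transform repeats each site's value on every row of that site, which is what allows it to be assigned back as a column. A minimal sketch of that difference follows. The sample values are reconstructed from the outputs shown in this answer (the row order is an assumption), and geo_mean is a hypothetical helper name, not something from the original post.

```python
import pandas as pd

# Sample values reconstructed from the outputs in this answer;
# the row order within each site is an assumption.
df = pd.DataFrame({
    'Site': ['A'] * 5 + ['B'] * 3 + ['C'],
    'EnteroCount': [1733, 150, 70, 20, 4, 1500, 55, 22, 180],
})


def geo_mean(group):
    # Hypothetical helper: n-th root of the product of the group's values.
    return group.prod() ** (1 / len(group))


# apply: one geometric mean per site (index is the site label).
per_site = df.groupby('Site').EnteroCount.apply(geo_mean)

# transform: the site's geometric mean broadcast to every row of that site
# (index matches df, so it can be stored as a new column).
per_row = df.groupby('Site').EnteroCount.transform(geo_mean)

print(len(per_site), len(per_row))
```

Because per_row shares its index with df, something like df['geo_mean'] = per_row should line up row by row without any merge.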

Using the geometric mean function from scipy:

>>> from scipy.stats.mstats import gmean
>>> df.groupby('Site').EnteroCount.apply(gmean)
Site
A     68.016702
B    121.981006
C    180.000000
Name: EnteroCount, dtype: float64
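If scipy is not available, or if a group is large enough that the raw product could overflow, the same quantity can be computed through logarithms, since the geometric mean equals exp(mean(log(x))) for positive values. The sketch below assumes every count is strictly positive. log_gmean is a hypothetical helper name, and the sample data is reconstructed from the outputs shown in this answer.

```python
import numpy as np
import pandas as pd

# Sample values reconstructed from the outputs in this answer;
# the row order within each site is an assumption.
df = pd.DataFrame({
    'Site': ['A'] * 5 + ['B'] * 3 + ['C'],
    'EnteroCount': [1733, 150, 70, 20, 4, 1500, 55, 22, 180],
})


def log_gmean(values):
    # Hypothetical helper. For strictly positive x, exp(mean(log(x))) equals
    # the n-th root of the product, without forming a huge intermediate product.
    return float(np.exp(np.log(values).mean()))


result = df.groupby('Site').EnteroCount.apply(log_gmean)
print(result)
```

On this data the result should agree with the product-based and gmean-based versions above to within floating-point rounding.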

Because the five highest values in a group give the highest geometric mean of any five samples from that group, it is enough to compute the mean over those five:

df.groupby('Site').EnteroCount.apply(lambda group: gmean(group.nlargest(5)))

You can see how it selects the five largest values per group (or all of them when a group has fewer than five), which are then passed to gmean:

>>> df.groupby('Site').EnteroCount.apply(lambda group: group.nlargest(5).values.tolist())
Site
A    [1733, 150, 70, 20, 4]
B            [1500, 55, 22]
C                     [180]
Name: EnteroCount, dtype: object

Summary

import numpy as np

df['swim'] = np.where(
    (df.groupby('Site').EnteroCount.transform('max') > 110) |
    (df.groupby('Site').EnteroCount.transform(lambda group: gmean(group.nlargest(5))) > 30),
    'unacceptable', 'acceptable')
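To see the two conditions working together, here is a self-contained sketch of the same rule. Sites A to C use values reconstructed from the outputs shown in this answer, while site D is a hypothetical addition so that at least one site passes both checks. geo_mean is a hypothetical NumPy-only stand-in for gmean, used so the example does not depend on scipy.

```python
import numpy as np
import pandas as pd


def geo_mean(values):
    # Hypothetical stand-in for scipy's gmean; assumes strictly positive values.
    return float(np.exp(np.log(values).mean()))


# Sites A-C: reconstructed from the outputs in this answer (row order assumed).
# Site D: hypothetical low-count site added for illustration.
df = pd.DataFrame({
    'Site': ['A'] * 5 + ['B'] * 3 + ['C'] + ['D'] * 3,
    'EnteroCount': [1733, 150, 70, 20, 4, 1500, 55, 22, 180, 10, 20, 5],
})

by_site = df.groupby('Site').EnteroCount

# Condition 1: any single sample at the site exceeds 110.
single_sample_high = by_site.transform('max') > 110

# Condition 2: geometric mean of the site's five largest samples exceeds 30.
geo_mean_high = by_site.transform(lambda g: geo_mean(g.nlargest(5))) > 30

df['swim'] = np.where(single_sample_high | geo_mean_high,
                      'unacceptable', 'acceptable')

print(df.drop_duplicates('Site')[['Site', 'swim']])
```

Splitting the two conditions into named boolean Series is a readability choice only; the logic is intended to match the one-expression version above.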
