Pandas equivalent to SQL window functions

For the first SQL:

    SELECT state_name,
           state_population,
           SUM(state_population) OVER() AS national_population
    FROM population
    ORDER BY state_name

Pandas:

    df.assign(national_population=df.state_population.sum()).sort_values('state_name')

For the second SQL:

    SELECT state_name,
           state_population,
           region,
           SUM(state_population) OVER(PARTITION BY region) AS regional_population
    FROM population
    ORDER BY state_name

Pandas:

    df.assign(regional_population=df.groupby('region')['state_population'].transform('sum')) \
      .sort_values('state_name')

DEMO:

    In [238]: df
    Out[238]:
       region state_name  state_population
    0       1        aaa               100
    1       1        bbb               110
    2       2        ccc               200
    3       2        ddd               100
    4       2        eee               100
    5       3        xxx                55

national_population:

    In [246]: df.assign(national_population=df.state_population.sum()).sort_values('state_name')
    Out[246]:
       region state_name  state_population  national_population
    0       1        aaa               100                  665
    1       1        bbb               110                  665
    2       2        ccc               200                  665
    3       2        ddd               100                  665
    4       2        eee               100                  665
    5       3        xxx                55                  665

regional_population:

    In [239]: df.assign(regional_population=df.groupby('region')['state_population'].transform('sum')) \
         ...:   .sort_values('state_name')
    Out[239]:
       region state_name  state_population  regional_population
    0       1        aaa               100                  210
    1       1        bbb               110                  210
    2       2        ccc               200                  400
    3       2        ddd               100                  400
    4       2        eee               100                  400
    5       3        xxx                55                   55
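
To reproduce the demo end to end, something like the following should work. It is only a sketch: the DataFrame constructor call is an assumption reconstructed from the printed output above (the original frame may have been loaded differently), and the name `result` is just for illustration. It computes both window-style columns in a single `assign` call.

```python
import pandas as pd

# Assumed reconstruction of the demo frame from the printed output above.
df = pd.DataFrame({
    "region": [1, 1, 2, 2, 2, 3],
    "state_name": ["aaa", "bbb", "ccc", "ddd", "eee", "xxx"],
    "state_population": [100, 110, 200, 100, 100, 55],
})

result = (
    df.assign(
        # SUM(...) OVER(): one scalar total, broadcast to every row.
        national_population=df["state_population"].sum(),
        # SUM(...) OVER(PARTITION BY region): per-group total, aligned
        # back to the original rows by transform().
        regional_population=df.groupby("region")["state_population"].transform("sum"),
    )
    .sort_values("state_name")  # ORDER BY state_name
)

print(result)
```

`transform('sum')` is what makes this behave like a window function rather than a `GROUP BY`: it returns one value per input row instead of one row per group, so the result can be assigned straight back as a column.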
