Pandas equivalent to SQL window functions
For the first SQL:
SELECT state_name, state_population, SUM(state_population) OVER() AS national_populationFROM population ORDER BY state_name
Pandas:
df.assign(national_population=df.state_population.sum()).sort_values('state_name')
For the second SQL:
SELECT state_name, state_population, region, SUM(state_population) OVER(PARTITION BY region) AS regional_populationFROM population ORDER BY state_name
Pandas:
df.assign(regional_population=df.groupby('region')['state_population'].transform('sum')) \ .sort_values('state_name')
DEMO:
In [238]: dfOut[238]: region state_name state_population0 1 aaa 1001 1 bbb 1102 2 ccc 2003 2 ddd 1004 2 eee 1005 3 xxx 55
national_population:
In [246]: df.assign(national_population=df.state_population.sum()).sort_values('state_name')Out[246]: region state_name state_population national_population0 1 aaa 100 6651 1 bbb 110 6652 2 ccc 200 6653 2 ddd 100 6654 2 eee 100 6655 3 xxx 55 665
regional_population:
In [239]: df.assign(regional_population=df.groupby('region')['state_population'].transform('sum')) \ ...: .sort_values('state_name')Out[239]: region state_name state_population regional_population0 1 aaa 100 2101 1 bbb 110 2102 2 ccc 200 4003 2 ddd 100 4004 2 eee 100 4005 3 xxx 55 55