SQL Server - conditional aggregation with correlation
There is an easier solution:
SELECT c.Customer, c."User", c."Revenue", 1.0 * Revenue/ NULLIF(c2.sum_total, 0) AS percentage, 1.0 * c2.sum_running / NULLIF(c2.sum_total, 0) AS running_percentageFROM t c CROSS APPLY (SELECT SUM(c2.Revenue) AS sum_total, SUM(CASE WHEN c2.Revenue >= x.Revenue THEN c2.Revenue ELSE 0 END) as sum_running FROM t c2 CROSS JOIN (SELECT c.REVENUE) x WHERE c."User" = c2."User" ) c2ORDER BY "User", Revenue DESC;
I am not sure why or if this limitation is in the SQL '92 standard. I did have it pretty well memorized 20 or so years ago, but I don't recall that particular limitation.
I should note:
- At the time of the SQL 92 standard, lateral joins were not really on the radar. Sybase definitely had no such concept.
- Other databases do have problems with outer references. In particular, they often limit the scoping to one level deep.
- The SQL Standard itself tends to highly political (that is, vendor-driven) rather than driven by actual database user requirements. Well, over time, it does move in the right direction.
There is no such limitation in the SQL standard for LATERAL
. CROSS APPLY
is a vendor-specific extension from Microsoft (Oracle adopted it later for compatibility) and its limitations are obviously not owed to the ISO/IEC SQL standard, since the MS feature pre-dates the standard.
LATERAL
in standard SQL is basically just a modifier for joins to allow lateral references in the join tree. There is no limit to the number of columns that can be referenced.
I wouldn't see a reason for the odd restriction to begin with. Maybe it's because CROSS APPLY
was originally intended to allow table-valued functions, which was later extended to allow sub-SELECT
s.
The Postgres manual explains LATERAL
like this:
The
LATERAL
key word can precede a sub-SELECT FROM
item. This allows the sub-SELECT
to refer to columns ofFROM
items that appear before it in theFROM
list. (WithoutLATERAL
, each sub-SELECT
is evaluated independently and so cannot cross-reference any otherFROM
item.)
The Postgres version of your query (without the more elegant window functions) can be simpler:
SELECT c.* , round(revenue / c2.sum_total, 2) END AS percentage , round(c2.running_sum / c2.sum_total, 2) END AS running_percentage FROM t c, LATERAL ( SELECT NULLIF(SUM(revenue), 0)::numeric AS sum_total -- NULLIF, cast once , SUM(revenue) FILTER (WHERE revenue >= c.revenue) AS running_sum FROM t WHERE "User" = c."User" ) c2ORDER BY c."User", c.revenue DESC;
Postgres 9.4+ has the more elegant aggregate
FILTER
for conditional aggregates.NULLIF
makes sense, I only suggest a minor simplification.Cast
sum_total
tonumeric
once.Round result to match your desired result.