The SQL OVER() clause - when and why is it useful? The SQL OVER() clause - when and why is it useful? sql-server sql-server

The SQL OVER() clause - when and why is it useful?


You can use GROUP BY SalesOrderID. The difference is, with GROUP BY you can only have the aggregated values for the columns that are not included in GROUP BY.

In contrast, using windowed aggregate functions instead of GROUP BY, you can retrieve both aggregated and non-aggregated values. That is, although you are not doing that in your example query, you could retrieve both individual OrderQty values and their sums, counts, averages etc. over groups of same SalesOrderIDs.

Here's a practical example of why windowed aggregates are great. Suppose you need to calculate what percent of a total every value is. Without windowed aggregates you'd have to first derive a list of aggregated values and then join it back to the original rowset, i.e. like this:

SELECT  orig.[Partition],  orig.Value,  orig.Value * 100.0 / agg.TotalValue AS ValuePercentFROM OriginalRowset orig  INNER JOIN (    SELECT      [Partition],      SUM(Value) AS TotalValue    FROM OriginalRowset    GROUP BY [Partition]  ) agg ON orig.[Partition] = agg.[Partition]

Now look how you can do the same with a windowed aggregate:

SELECT  [Partition],  Value,  Value * 100.0 / SUM(Value) OVER (PARTITION BY [Partition]) AS ValuePercentFROM OriginalRowset orig

Much easier and cleaner, isn't it?


The OVER clause is powerful in that you can have aggregates over different ranges ("windowing"), whether you use a GROUP BY or not

Example: get count per SalesOrderID and count of all

SELECT    SalesOrderID, ProductID, OrderQty    ,COUNT(OrderQty) AS 'Count'    ,COUNT(*) OVER () AS 'CountAll'FROM Sales.SalesOrderDetail WHERE     SalesOrderID IN(43659,43664)GROUP BY     SalesOrderID, ProductID, OrderQty

Get different COUNTs, no GROUP BY

SELECT    SalesOrderID, ProductID, OrderQty    ,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'CountQtyPerOrder'    ,COUNT(OrderQty) OVER(PARTITION BY ProductID) AS 'CountQtyPerProduct',    ,COUNT(*) OVER () AS 'CountAllAgain'FROM Sales.SalesOrderDetail WHERE     SalesOrderID IN(43659,43664)


If you only wanted to GROUP BY the SalesOrderID then you wouldn't be able to include the ProductID and OrderQty columns in the SELECT clause.

The PARTITION BY clause let's you break up your aggregate functions. One obvious and useful example would be if you wanted to generate line numbers for order lines on an order:

SELECT    O.order_id,    O.order_date,    ROW_NUMBER() OVER(PARTITION BY O.order_id) AS line_item_no,    OL.product_idFROM    Orders OINNER JOIN Order_Lines OL ON OL.order_id = O.order_id

(My syntax might be off slightly)

You would then get back something like:

order_id    order_date    line_item_no    product_id--------    ----------    ------------    ----------    1       2011-05-02         1              5    1       2011-05-02         2              4    1       2011-05-02         3              7    2       2011-05-12         1              8    2       2011-05-12         2              1