When should I use CROSS APPLY over INNER JOIN?
Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?
See the article in my blog for detailed performance comparison:
CROSS APPLY
works better on things that have no simple JOIN
condition.
This one selects 3
last records from t2
for each record from t1
:
SELECT t1.*, t2o.*FROM t1CROSS APPLY ( SELECT TOP 3 * FROM t2 WHERE t2.t1_id = t1.id ORDER BY t2.rank DESC ) t2o
It cannot be easily formulated with an INNER JOIN
condition.
You could probably do something like that using CTE
's and window function:
WITH t2o AS ( SELECT t2.*, ROW_NUMBER() OVER (PARTITION BY t1_id ORDER BY rank) AS rn FROM t2 )SELECT t1.*, t2o.*FROM t1INNER JOIN t2oON t2o.t1_id = t1.id AND t2o.rn <= 3
, but this is less readable and probably less efficient.
Update:
Just checked.
master
is a table of about 20,000,000
records with a PRIMARY KEY
on id
.
This query:
WITH q AS ( SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rn FROM master ), t AS ( SELECT 1 AS id UNION ALL SELECT 2 )SELECT *FROM tJOIN qON q.rn <= t.id
runs for almost 30
seconds, while this one:
WITH t AS ( SELECT 1 AS id UNION ALL SELECT 2 )SELECT *FROM tCROSS APPLY ( SELECT TOP (t.id) m.* FROM master m ORDER BY id ) q
is instant.
Consider you have two tables.
MASTER TABLE
x------x--------------------x| Id | Name |x------x--------------------x| 1 | A || 2 | B || 3 | C |x------x--------------------x
DETAILS TABLE
x------x--------------------x-------x| Id | PERIOD | QTY |x------x--------------------x-------x| 1 | 2014-01-13 | 10 || 1 | 2014-01-11 | 15 || 1 | 2014-01-12 | 20 || 2 | 2014-01-06 | 30 || 2 | 2014-01-08 | 40 |x------x--------------------x-------x
There are many situations where we need to replace INNER JOIN
with CROSS APPLY
.
1. Join two tables based on TOP n
results
Consider if we need to select Id
and Name
from Master
and last two dates for each Id
from Details table
.
SELECT M.ID,M.NAME,D.PERIOD,D.QTYFROM MASTER MINNER JOIN( SELECT TOP 2 ID, PERIOD,QTY FROM DETAILS D ORDER BY CAST(PERIOD AS DATE)DESC)DON M.ID=D.ID
The above query generates the following result.
x------x---------x--------------x-------x| Id | Name | PERIOD | QTY |x------x---------x--------------x-------x| 1 | A | 2014-01-13 | 10 || 1 | A | 2014-01-12 | 20 |x------x---------x--------------x-------x
See, it generated results for last two dates with last two date's Id
and then joined these records only in the outer query on Id
, which is wrong. This should be returning both Ids
1 and 2 but it returned only 1 because 1 has the last two dates. To accomplish this, we need to use CROSS APPLY
.
SELECT M.ID,M.NAME,D.PERIOD,D.QTYFROM MASTER MCROSS APPLY( SELECT TOP 2 ID, PERIOD,QTY FROM DETAILS D WHERE M.ID=D.ID ORDER BY CAST(PERIOD AS DATE)DESC)D
and forms the following result.
x------x---------x--------------x-------x| Id | Name | PERIOD | QTY |x------x---------x--------------x-------x| 1 | A | 2014-01-13 | 10 || 1 | A | 2014-01-12 | 20 || 2 | B | 2014-01-08 | 40 || 2 | B | 2014-01-06 | 30 |x------x---------x--------------x-------x
Here's how it works. The query inside CROSS APPLY
can reference the outer table, where INNER JOIN
cannot do this (it throws compile error). When finding the last two dates, joining is done inside CROSS APPLY
i.e., WHERE M.ID=D.ID
.
2. When we need INNER JOIN
functionality using functions.
CROSS APPLY
can be used as a replacement with INNER JOIN
when we need to get result from Master
table and a function
.
SELECT M.ID,M.NAME,C.PERIOD,C.QTYFROM MASTER MCROSS APPLY dbo.FnGetQty(M.ID) C
And here is the function
CREATE FUNCTION FnGetQty ( @Id INT )RETURNS TABLE ASRETURN ( SELECT ID,PERIOD,QTY FROM DETAILS WHERE ID=@Id)
which generated the following result
x------x---------x--------------x-------x| Id | Name | PERIOD | QTY |x------x---------x--------------x-------x| 1 | A | 2014-01-13 | 10 || 1 | A | 2014-01-11 | 15 || 1 | A | 2014-01-12 | 20 || 2 | B | 2014-01-06 | 30 || 2 | B | 2014-01-08 | 40 |x------x---------x--------------x-------x
ADDITIONAL ADVANTAGE OF CROSS APPLY
APPLY
can be used as a replacement for UNPIVOT
. Either CROSS APPLY
or OUTER APPLY
can be used here, which are interchangeable.
Consider you have the below table(named MYTABLE
).
x------x-------------x--------------x| Id | FROMDATE | TODATE |x------x-------------x--------------x| 1 | 2014-01-11 | 2014-01-13 | | 1 | 2014-02-23 | 2014-02-27 | | 2 | 2014-05-06 | 2014-05-30 | | 3 | NULL | NULL |x------x-------------x--------------x
The query is below.
SELECT DISTINCT ID,DATESFROM MYTABLE CROSS APPLY(VALUES (FROMDATE),(TODATE))COLUMNNAMES(DATES)
which brings you the result
x------x-------------x | Id | DATES | x------x-------------x | 1 | 2014-01-11 | | 1 | 2014-01-13 | | 1 | 2014-02-23 | | 1 | 2014-02-27 | | 2 | 2014-05-06 | | 2 | 2014-05-30 | | 3 | NULL | x------x-------------x
cross apply
sometimes enables you to do things that you cannot do with inner join
.
Example (a syntax error):
select F.* from sys.objects O inner join dbo.myTableFun(O.name) F on F.schema_id= O.schema_id
This is a syntax error, because, when used with inner join
, table functions can only take variables or constants as parameters. (I.e., the table function parameter cannot depend on another table's column.)
However:
select F.* from sys.objects O cross apply ( select * from dbo.myTableFun(O.name) ) F where F.schema_id= O.schema_id
This is legal.
Edit:Or alternatively, shorter syntax: (by ErikE)
select F.* from sys.objects O cross apply dbo.myTableFun(O.name) Fwhere F.schema_id= O.schema_id
Edit:
Note:Informix 12.10 xC2+ has Lateral Derived Tables and Postgresql (9.3+) has Lateral Subqueries which can be used to a similar effect.