UNION with WHERE clause
In my experience, Oracle is very good at pushing simple predicates around. The following test was made on Oracle 11.2. I'm fairly certain it produces the same execution plan on all releases of 10g as well.
(Please people, feel free to leave a comment if you run an earlier version and tried the following)
create table table1(a number, b number);create table table2(a number, b number);explain plan forselect * from (select a,b from table1 union select a,b from table2 ) where a > 1;select * from table(dbms_xplan.display(format=>'basic +predicate'));PLAN_TABLE_OUTPUT---------------------------------------| Id | Operation | Name |---------------------------------------| 0 | SELECT STATEMENT | || 1 | VIEW | || 2 | SORT UNIQUE | || 3 | UNION-ALL | ||* 4 | TABLE ACCESS FULL| TABLE1 ||* 5 | TABLE ACCESS FULL| TABLE2 |---------------------------------------Predicate Information (identified by operation id):--------------------------------------------------- 4 - filter("A">1) 5 - filter("A">1)
As you can see at steps (4,5), the predicate is pushed down and applied before the sort (union).
I couldn't get the optimizer to push down an entire sub query such as
where a = (select max(a) from empty_table)
or a join. With proper PK/FK constraints in place it might be possible, but clearly there are limitations :)
Just a caution
If you tried
SELECT colA, colB FROM tableA WHERE colA > 1UNIONSELECT colX, colA FROM tableB WHERE colA > 1
compared to:
SELECT * FROM (SELECT colA, colB FROM tableA UNION SELECT colX, colA FROM tableB) WHERE colA > 1
Then in the second query, the colA in the where clause will actually have the colX from tableB, making it a very different query. If columns are being aliased in this way, it can get confusing.
NOTE: While my advice was true many years ago, Oracle's optimizer has improved so that the location of the where definitely no longer matters here. However preferring UNION ALL
vs UNION
will always be true, and portable SQL should avoid depending on optimizations that may not be in all databases.
Short answer, you want the WHERE
before the UNION
and you want to use UNION ALL
if at all possible. If you are using UNION ALL
then check the EXPLAIN output, Oracle might be smart enough to optimize the WHERE
condition if it is left after.
The reason is the following. The definition of a UNION
says that if there are duplicates in the two data sets, they have to be removed. Therefore there is an implicit GROUP BY
in that operation, which tends to be slow. Worse yet, Oracle's optimizer (at least as of 3 years ago, and I don't think it has changed) doesn't try to push conditions through a GROUP BY
(implicit or explicit). Therefore Oracle has to construct larger data sets than necessary, group them, and only then gets to filter. Thus prefiltering wherever possible is officially a Good Idea. (This is, incidentally, why it is important to put conditions in the WHERE
whenever possible instead of leaving them in a HAVING
clause.)
Furthermore if you happen to know that there won't be duplicates between the two data sets, then use UNION ALL
. That is like UNION
in that it concatenates datasets, but it doesn't try to deduplicate data. This saves an expensive grouping operation. In my experience it is quite common to be able to take advantage of this operation.
Since UNION ALL
does not have an implicit GROUP BY
in it, it is possible that Oracle's optimizer knows how to push conditions through it. I don't have Oracle sitting around to test, so you will need to test that yourself.