How to tune a range / interval query in Oracle?
Your attempt is good, but misses a few crucial issues.
Let's start slowly. I'm assuming an index on COL1
and I actually don't mind if COL2
is included there as well.
Due to the constraints you have on your data (especially non-overlapping) you actually just want the row before the row where COL1
is <=
some value....[--take a break--] it you order by COL1
This is a classic Top-N query:
select * FROM ( select * from A where col1 <= :some_value order by col1 desc ) where rownum <= 1;
Please note that you must use ORDER BY
to get a definite sort order. As WHERE
is applied after ORDER BY
you must now also wrap the top-n filter in an outer query.
That's almost done, the only reason why we actually need to filter on COL2
too is to filter out records that don't fall into the range at all. E.g. if some_value is 5 and you are having this data:
COL1 | COL2 1 | 2 3 | 4 <-- you get this row 6 | 10
This row would be correct as result, if COL2
would be 5, but unfortunately, in this case the correct result of your query is [empty set]. That's the only reason we need to filter for COL2
like this:
select * FROM ( select * FROM ( select * from A where col1 <= :some_value order by col1 desc ) where rownum <= 1 ) WHERE col2 >= :some_value;
Your approach had several problems:
- missing
ORDER BY
- dangerous in connection withrownum
filter! - applying the Top-N clause (
rownum
filter) too early. What if there is no result? Database reads index until the end, therownum
(STOPKEY) never kicks in. - An optimizer glitch. With the
between
predicate, my 11g installation doesn't come to the idea to read the index in descending order, so it was actually reading it from the beginning (0) upwards until it found aCOL2
value that matched --OR-- theCOL1
run out of the range.
.
COL1 | COL2 1 | 2 ^ 3 | 4 | (2) go up until first match. +----- your intention was to start here 6 | 10
What was actually happening was:
COL1 | COL2 1 | 2 +----- start at the beginning of the index 3 | 4 | Go down until first match. V 6 | 10
Look at the execution plan of my query:
------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | 26 | 4 (0)| 00:00:01 ||* 1 | VIEW | | 1 | 26 | 4 (0)| 00:00:01 ||* 2 | COUNT STOPKEY | | | | | || 3 | VIEW | | 2 | 52 | 4 (0)| 00:00:01 || 4 | TABLE ACCESS BY INDEX ROWID | A | 50000 | 585K| 4 (0)| 00:00:01 ||* 5 | INDEX RANGE SCAN DESCENDING| SIMPLE | 2 | | 3 (0)| 00:00:01 |------------------------------------------------------------------------------------------
Note the INDEX RANGE SCAN **DESCENDING**
.
Finally, why didn't I include COL2
in the index? It's a single row-top-n query. You can save at most a single table access (irrespective of what the Rows estimation above says!) If you expect to find a row in most cases, you'll need to go to the table anyways for the other columns (probably) so you would not save ANYTHING, just consume space. Including the COL2
will only improve performance if you query doesn't return anything at all!
Related:
- How to use index efficienty in mysql query I answered a very similar question about this years ago. Same solution.
- Use The Index, Lukas!
I think, because the ranges do not intersect, you can define col1 as primary key and execute the query like this:
SELECT * FROM a JOIN (SELECT MAX (col1) AS col1 FROM a WHERE col1 <= :somevalue) b ON a.col1 = b.col1;
If there are gaps between the ranges you wil have to add:
Where col2 >= :somevalue
as last line.
Execution Plan:
SELECT STATEMENT NESTED LOOPS VIEW SORT AGGREGATE FIRST ROW INDEX RANGE SCAN (MIN/MAX) PKU1 TABLE ACCESS BY INDEX A INDEX UNIQUE SCAN PKU1