
Optimize SQL that uses between clause


You may want to try something like this

    SELECT A.ID,
           (SELECT B.ID
            FROM B
            WHERE A.EventTime BETWEEN B.start_time AND B.end_time
            LIMIT 1) AS B_ID
    FROM A

If you have an index on the (start_time, end_time) columns of B, this should work quite well.
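As a rough sketch of that setup, the snippet below uses SQLite through Python's built-in sqlite3 module as a stand-in for the real database. The column types, the sample rows, and the index name idx_b_start_end are all assumptions made for illustration. It creates the composite index on B and runs the correlated subquery; an event that falls outside every range comes back with a NULL B_ID.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; the schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER PRIMARY KEY, EventTime TEXT);
    CREATE TABLE B (ID INTEGER PRIMARY KEY, start_time TEXT, end_time TEXT);

    -- Composite index on the range columns (index name is hypothetical).
    CREATE INDEX idx_b_start_end ON B (start_time, end_time);

    INSERT INTO B VALUES
        (10, '2009-02-17 09:00', '2009-02-17 09:59'),
        (20, '2009-02-17 10:00', '2009-02-17 10:59');
    INSERT INTO A VALUES
        (1, '2009-02-17 09:30'),
        (2, '2009-02-17 10:15'),
        (3, '2009-02-17 12:00');  -- falls outside every range in B
""")

# One row per event in A, with at most one matching range from B.
rows = conn.execute("""
    SELECT A.ID,
           (SELECT B.ID
            FROM B
            WHERE A.EventTime BETWEEN B.start_time AND B.end_time
            LIMIT 1) AS B_ID
    FROM A
    ORDER BY A.ID
""").fetchall()

print(rows)
```

Whether the optimizer actually uses the index depends on the engine and the data, so it is worth checking the plan with EXPLAIN on the real tables.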


I'm not sure this can be optimized fully. I tried it on MySQL 5.1.30. I also added an index on {B.start_time, B.end_time} as suggested by other folks. Then I got a report from EXPLAIN, but the best I could get is a Range Access Method:

    EXPLAIN SELECT A.id, B.id FROM A JOIN B ON A.event_time BETWEEN B.start_time AND B.end_time;
    +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------------------+
    | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                          |
    +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------------------+
    |  1 | SIMPLE      | A     | ALL  | event_time    | NULL | NULL    | NULL |    8 |                                                |
    |  1 | SIMPLE      | B     | ALL  | start_time    | NULL | NULL    | NULL |   96 | Range checked for each record (index map: 0x4) |
    +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------------------+

See the note on the far right. The optimizer thinks it might be able to use the index on {B.start_time, B.end_time}, but it ended up deciding not to use that index. Your results may vary, because your real data distribution is more representative than my test data.

Compare with the index usage if you compare A.event_time to a constant range:

    EXPLAIN SELECT A.id FROM A
    WHERE A.event_time BETWEEN '2009-02-17 09:00' and '2009-02-17 10:00';
    +----+-------------+-------+-------+---------------+------------+---------+------+------+-------------+
    | id | select_type | table | type  | possible_keys | key        | key_len | ref  | rows | Extra       |
    +----+-------------+-------+-------+---------------+------------+---------+------+------+-------------+
    |  1 | SIMPLE      | A     | range | event_time    | event_time | 8       | NULL |    1 | Using where |
    +----+-------------+-------+-------+---------------+------------+---------+------+------+-------------+

And compare with the dependent sub-query form given by @Luke and @Kibbee, which seems to make use of indexes more effectively:

    EXPLAIN SELECT A.id AS id_from_a,
        (
            SELECT B.id
            FROM B
            WHERE A.id BETWEEN B.start_time AND B.end_time
            LIMIT 0, 1
        ) AS id_from_b
    FROM A;
    +----+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+
    | id | select_type        | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
    +----+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+
    |  1 | PRIMARY            | A     | index | NULL          | PRIMARY | 8       | NULL |    8 | Using index |
    |  2 | DEPENDENT SUBQUERY | B     | ALL   | start_time    | NULL    | NULL    | NULL |  384 | Using where |
    +----+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+

Weirdly, EXPLAIN lists possible_keys as NULL (i.e. no indexes could be used) but then decides to use the primary key after all. Could be an idiosyncrasy of MySQL's EXPLAIN report?


I wouldn't normally recommend a query like this, but...

Since you've specified that table A only has about 980 rows and that each row maps to exactly one row in table B, you could do the following, and it will most likely be a lot faster than a Cartesian join:

    SELECT A.id AS id_from_a,
        (
            SELECT B.id
            FROM B
            WHERE A.event_time BETWEEN B.start_time AND B.end_time
            LIMIT 0, 1
        ) AS id_from_b
    FROM A
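To sanity-check that the subquery form is a safe substitute for the join, here is a small sketch that runs both on the same data and compares the results. It uses SQLite via Python's sqlite3 module rather than MySQL, and the tables and rows are made up. It assumes, as stated above, that the ranges in B do not overlap and every event in A falls inside exactly one of them.

```python
import sqlite3

# Hypothetical data: non-overlapping ranges in B, one matching range per event in A.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, event_time TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, start_time TEXT, end_time TEXT);
    INSERT INTO B VALUES
        (1, '2009-02-17 08:00', '2009-02-17 08:59'),
        (2, '2009-02-17 09:00', '2009-02-17 09:59'),
        (3, '2009-02-17 10:00', '2009-02-17 10:59');
    INSERT INTO A VALUES
        (101, '2009-02-17 08:10'),
        (102, '2009-02-17 09:45'),
        (103, '2009-02-17 10:30');
""")

# Join form: pairs every event with every range that contains it.
join_rows = conn.execute("""
    SELECT A.id, B.id
    FROM A JOIN B ON A.event_time BETWEEN B.start_time AND B.end_time
    ORDER BY A.id
""").fetchall()

# Dependent-subquery form: stops at the first matching range per event.
subquery_rows = conn.execute("""
    SELECT A.id AS id_from_a,
        (
            SELECT B.id
            FROM B
            WHERE A.event_time BETWEEN B.start_time AND B.end_time
            LIMIT 0, 1
        ) AS id_from_b
    FROM A
    ORDER BY A.id
""").fetchall()

print(join_rows == subquery_rows)
```

If the ranges in B could overlap, or some events matched no range, the two forms would diverge: the join returns every matching pair and drops unmatched events, while the subquery returns one arbitrary match per event and NULL for events with none.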