Optimize SQL that uses between clause
You may want to try something like this:

SELECT A.ID,
  (
    SELECT B.ID FROM B
    WHERE A.EventTime BETWEEN B.start_time AND B.end_time
    LIMIT 1
  ) AS B_ID
FROM A
If you have an index on the start_time, end_time fields of B, then this should work quite well.
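For reference, a composite index like that could be created roughly as follows. This is only a sketch: the index name idx_b_start_end is made up here, and the column names are taken from the query above.

```sql
-- Hypothetical index name; columns follow the query above.
CREATE INDEX idx_b_start_end ON B (start_time, end_time);
```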
I'm not sure this can be optimized fully. I tried it on MySQL 5.1.30. I also added an index on {B.start_time, B.end_time} as suggested by other folks. Then I got a report from EXPLAIN, but the best I could get is a Range Access Method:
EXPLAIN SELECT A.id, B.id
FROM A JOIN B
  ON A.event_time BETWEEN B.start_time AND B.end_time;

+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                          |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------------------+
|  1 | SIMPLE      | A     | ALL  | event_time    | NULL | NULL    | NULL |    8 |                                                |
|  1 | SIMPLE      | B     | ALL  | start_time    | NULL | NULL    | NULL |   96 | Range checked for each record (index map: 0x4) |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------------------+
See the note on the far right. The optimizer thinks it might be able to use the index on {B.start_time, B.end_time}
but it ended up deciding not to use that index. Your results may vary, because your data distribution is more representative.
Compare with the index usage if you compare A.event_time to a constant range:
EXPLAIN SELECT A.id FROM A
WHERE A.event_time BETWEEN '2009-02-17 09:00' AND '2009-02-17 10:00';

+----+-------------+-------+-------+---------------+------------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key        | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+------------+---------+------+------+-------------+
|  1 | SIMPLE      | A     | range | event_time    | event_time | 8       | NULL |    1 | Using where |
+----+-------------+-------+-------+---------------+------------+---------+------+------+-------------+
And compare with the dependent sub-query form given by @Luke and @Kibbee, which seems to make use of indexes more effectively:
EXPLAIN SELECT A.id AS id_from_a,
  (
    SELECT B.id FROM B
    WHERE A.id BETWEEN B.start_time AND B.end_time
    LIMIT 0, 1
  ) AS id_from_b
FROM A;

+----+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type        | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+
|  1 | PRIMARY            | A     | index | NULL          | PRIMARY | 8       | NULL |    8 | Using index |
|  2 | DEPENDENT SUBQUERY | B     | ALL   | start_time    | NULL    | NULL    | NULL |  384 | Using where |
+----+--------------------+-------+-------+---------------+---------+---------+------+------+-------------+
Weirdly, EXPLAIN lists possible_keys as NULL (i.e. no indexes could be used) but then decides to use the primary key after all. Could be an idiosyncrasy of MySQL's EXPLAIN report?
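For anyone wanting to repeat these EXPLAIN runs, a schema along these lines would be consistent with the output above. This is a guess, not the tables actually used: the BIGINT and DATETIME types are inferred from key_len = 8, and the index names event_time and start_time are taken from the possible_keys column.

```sql
-- Hypothetical schema; types are assumptions inferred from the EXPLAIN output.
CREATE TABLE A (
  id         BIGINT   NOT NULL PRIMARY KEY,
  event_time DATETIME NOT NULL,
  KEY event_time (event_time)
);

CREATE TABLE B (
  id         BIGINT   NOT NULL PRIMARY KEY,
  start_time DATETIME NOT NULL,
  end_time   DATETIME NOT NULL,
  -- Composite index on {start_time, end_time}, as described in the text.
  KEY start_time (start_time, end_time)
);
```

The row counts and the plan you get will depend on the data you load, so treat any EXPLAIN result on toy data as indicative only.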
I wouldn't normally recommend a query like this, but...
Since you've specified that table A only has about 980 rows and that each row maps to exactly one row in table B, you could do the following, and it will most likely be a lot faster than a Cartesian join:
SELECT A.id AS id_from_a,
  (
    SELECT B.id FROM B
    WHERE A.event_time BETWEEN B.start_time AND B.end_time
    LIMIT 0, 1
  ) AS id_from_b
FROM A