Mysql: inner join on primary key for 2 IDs gives "Range checked for each record" Mysql: inner join on primary key for 2 IDs gives "Range checked for each record" mysql mysql

Mysql: inner join on primary key for 2 IDs gives "Range checked for each record"


In general, MySQL can use only one index per table reference in a query (there's an index-merge algorithm, but this doesn't work as often as you might think).

Your join condition has an OR between two comparisons to indexed columns, and the optimizer can't choose which is the better one to use before the data in the table is examined row-by-row.

A common workaround is to do a UNION between simpler queries, instead of the OR condition.

mysql> EXPLAIN     SELECT * FROM test_message AS m     INNER JOIN test_user AS u ON u.id = m.sender_id   UNION    SELECT * FROM test_message AS m     INNER JOIN test_user AS u ON u.id = m.receiver_id;+----+--------------+------------+--------+---------------+---------+---------+--------------------+------+-----------------+| id | select_type  | table      | type   | possible_keys | key     | key_len | ref                | rows | Extra           |+----+--------------+------------+--------+---------------+---------+---------+--------------------+------+-----------------+|  1 | PRIMARY      | m          | ALL    | idx_sender    | NULL    | NULL    | NULL               |    1 | NULL            ||  1 | PRIMARY      | u          | eq_ref | PRIMARY       | PRIMARY | 4       | test.m.sender_id   |    1 | NULL            ||  2 | UNION        | m          | ALL    | idx_receiver  | NULL    | NULL    | NULL               |    1 | NULL            ||  2 | UNION        | u          | eq_ref | PRIMARY       | PRIMARY | 4       | test.m.receiver_id |    1 | NULL            || NULL | UNION RESULT | <union1,2> | ALL    | NULL          | NULL    | NULL    | NULL               | NULL | Using temporary |+----+--------------+------------+--------+---------------+---------+---------+--------------------+------+-----------------+

This does use proper index lookups in both sub-queries, but it has to use a temporary table to finish the UNION afterwards. Ultimately, it might be a wash for performance. Depends on how many rows of data need to be examined, and how many rows are produced as the result.


The problem is well known also in other (I think all) RDBMS, the optimizer will use only one rule for each join.

If the join condition is complex or if it can't recognize a known pattern to solve it, no optimization will be applied and it will go for full table scan.

In your case the OR condition in main join, seems simple but it is not, because you are asking to check each user id against two different columns (not constant values) at a time.

To solve it you must split your join condition in more sub-queries so optimizer can use the better rule for each one.

@Bill Karwin has proposed the common solution and it helps well to understand the problem.

A (slightly) better way to solve this problem is to move the union up one level and join on the derived table:

EXPLAIN SELECT *FROM test_user AS uINNER JOIN (    select id, sender_id as msg_id    from test_message     union all     select id, receiver_id     from test_message     ) AS m ON u.id = m.msg_id;

It will not use TEMPORARY tables and does only one full table scan on test_users instead of two

id  select_type table           partitions  type    possible_keys   key             key_len ref         rows    filtered    Extra1   PRIMARY     u               NULL        ALL     PRIMARY         NULL            NULL    NULL        1       100.00      NULL1   PRIMARY     <derived2>      NULL        ref     <auto_key0>     <auto_key0>     4       test.u.id   2       100.00      NULL2   DERIVED     test_message    NULL        index   NULL            idx_sender      4       NULL        1       100.00      "Using index"3   UNION       test_message    NULL        index   NULL            idx_receiver    4       NULL        1       100.00      "Using index"