Mysql: inner join on primary key for 2 IDs gives "Range checked for each record"
In general, MySQL can use only one index per table reference in a query (there's an index-merge algorithm, but this doesn't work as often as you might think).
Your join condition has an OR
between two comparisons to indexed columns, and the optimizer can't choose which is the better one to use before the data in the table is examined row-by-row.
A common workaround is to do a UNION
between simpler queries, instead of the OR
condition.
mysql> EXPLAIN SELECT * FROM test_message AS m INNER JOIN test_user AS u ON u.id = m.sender_id UNION SELECT * FROM test_message AS m INNER JOIN test_user AS u ON u.id = m.receiver_id;+----+--------------+------------+--------+---------------+---------+---------+--------------------+------+-----------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+--------------+------------+--------+---------------+---------+---------+--------------------+------+-----------------+| 1 | PRIMARY | m | ALL | idx_sender | NULL | NULL | NULL | 1 | NULL || 1 | PRIMARY | u | eq_ref | PRIMARY | PRIMARY | 4 | test.m.sender_id | 1 | NULL || 2 | UNION | m | ALL | idx_receiver | NULL | NULL | NULL | 1 | NULL || 2 | UNION | u | eq_ref | PRIMARY | PRIMARY | 4 | test.m.receiver_id | 1 | NULL || NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | Using temporary |+----+--------------+------------+--------+---------------+---------+---------+--------------------+------+-----------------+
This does use proper index lookups in both sub-queries, but it has to use a temporary table to finish the UNION
afterwards. Ultimately, it might be a wash for performance. Depends on how many rows of data need to be examined, and how many rows are produced as the result.
The problem is well known also in other (I think all) RDBMS, the optimizer will use only one rule for each join.
If the join condition is complex or if it can't recognize a known pattern to solve it, no optimization will be applied and it will go for full table scan.
In your case the OR
condition in main join, seems simple but it is not, because you are asking to check each user id against two different columns (not constant values) at a time.
To solve it you must split your join condition in more sub-queries so optimizer can use the better rule for each one.
@Bill Karwin has proposed the common solution and it helps well to understand the problem.
A (slightly) better way to solve this problem is to move the union up one level and join on the derived table:
EXPLAIN SELECT *FROM test_user AS uINNER JOIN ( select id, sender_id as msg_id from test_message union all select id, receiver_id from test_message ) AS m ON u.id = m.msg_id;
It will not use TEMPORARY
tables and does only one full table scan on test_users
instead of two
id select_type table partitions type possible_keys key key_len ref rows filtered Extra1 PRIMARY u NULL ALL PRIMARY NULL NULL NULL 1 100.00 NULL1 PRIMARY <derived2> NULL ref <auto_key0> <auto_key0> 4 test.u.id 2 100.00 NULL2 DERIVED test_message NULL index NULL idx_sender 4 NULL 1 100.00 "Using index"3 UNION test_message NULL index NULL idx_receiver 4 NULL 1 100.00 "Using index"