| Bug #22367 | Optimizer uses ref join type instead of eq_ref for simple join on strings | ||
|---|---|---|---|
| Submitted: | 14 Sep 2006 20:52 | Modified: | 26 Oct 2006 3:31 |
| Reporter: | jocelyn fournier (Silver Quality Contributor) | Email Updates: | |
| Status: | Closed | ||
| Category: | Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | 5.0.24a, 5.1 BK | OS: | Linux (Linux) |
| Assigned to: | Georgi Kodinov | Target Version: | |
[15 Sep 2006 12:59]
Sveta Smirnova
Thank you for the report. According to http://dev.mysql.com/doc/refman/5.0/en/explain.html and quotes from it: ----<START QUOTE>---- eq_ref ... It is used when all parts of an index are used by the join and the index is a PRIMARY KEY or UNIQUE index. ... ref ... ref is used if the join uses only a leftmost prefix of the key or if the key is not a PRIMARY KEY or UNIQUE index ----<END QUOTE>---- I mark this bug as "Verified". MySQL 4.1 is not affected: <skip filling tables> mysql> EXPLAIN SELECT a.id FROM a LEFT JOIN b ON b.login=a.createdby WHERE -> a.id=3; +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ | 1 | SIMPLE | a | const | PRIMARY | PRIMARY | 4 | const | 1 | | | 1 | SIMPLE | b | const | login | login | 20 | const | 1 | Using index | +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ 2 rows in set (0.00 sec)
[27 Sep 2006 8:04]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/12590 ChangeSet@1.2280, 2006-09-27 11:04:07+03:00, gkodinov@macbook.gmz +6 -0 Bug #22367: Optimizer uses ref join type instead of eq_ref for simple join on strings MySQL is setting the flag HA_END_SPACE_KEYS for all the keys that reference text or varchar columns with collation different than binary. This was done to handle correctly the situation where a lookup on such a key may return more than 1 row because of the presence of many rows that differ only by the amount of trailing space in the table's string column. Inserting such values however appears to violate the unique checks on INSERT/UPDATE. Thus that flag must not be set as it will prevent the optimizer from choosing a faster access method. This fix removes the setting of the HA_END_SPACE_KEYS flag.
[27 Sep 2006 14:13]
jocelyn fournier
Hi, -Note 1003 select `test`.`t8`.`pseudo` AS `pseudo`,(select `test`.`t8`.`email` AS `email` from `test`.`t8` where (`test`.`t8`.`pseudo` = (select `test`.`t8`.`pseudo` AS `pseudo` from `test`.`t8` where (`test`.`t8`.`pseudo` = _latin1'joce')))) AS `(SELECT email FROM t8 WHERE pseudo=(SELECT pseudo FROM t8 WHERE pseudo='joce'))` from `test`.`t8` where (`test`.`t8`.`pseudo` = (select `test`.`t8`.`pseudo` AS `pseudo` from `test`.`t8` where (`test`.`t8`.`pseudo` = _latin1'joce'))) +Note 1003 select `test`.`t8`.`pseudo` AS `pseudo`,(select `test`.`t8`.`email` AS `email` from `test`.`t8` where 1) AS `(SELECT email FROM t8 WHERE pseudo=(SELECT pseudo FROM t8 WHERE pseudo='joce'))` from `test`.`t8` where 1 => Any idea why the EXTENDED SELECT returns WHERE 1 whereas before it was returning properly the WHERE part of the query ? Regards, Jocelyn
[5 Oct 2006 16:12]
Sergei Golubchik
Yes (though it's just an idea, as you requested, I didn't check the code to verify it).
If you look at the output of the EXPLAIN - the table itself, not the warning - you'll see
that optimizer was using 'ref' join, and after the fix it was changed to 'const'. The
manual says:
"
* `const'
The table has at most one matching row, which is read at the
start of the query. Because there is only one row, values
from the column in this row can be regarded as constants by
the rest of the optimizer. `const' tables are very fast
because they are read only once.
"
but actually there's more to it, for 'const' tables, MySQL read the matching row very
early in the execution, in fact, it reads it during optimizing, many optimizations are
done _after_ this one row is read. MySQL reads this row and basically substitutes column
references with plain constants - values from this row. In the remaining of optimizations
these values are treated exactly like constants (e.g. they're subject to constant
propagation).
The above was a fact, now goes an idea.
I suspect that after reading const tables and replacing appropriate columns with
constants WHERE clause is optimized away. the warning is generated from the transformed
query after WHERE is optimized away, so you see no WHERE clause. On the other hand, when
prining an expression tree every column prints its original name, even if it was
transformed to a constant. So, the printed query looks illogical.
[5 Oct 2006 16:35]
jocelyn fournier
Hi Sergei, Thanks for the explanation ! Do you want me to open an other bug report for this issue ? Thanks, Jocelyn
[6 Oct 2006 16:57]
Sergei Golubchik
No need to, I suspect it's the same as Bug#22331 (and Bug#21939 is also related)
[16 Oct 2006 15:11]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13749 ChangeSet@1.2280, 2006-10-16 18:09:58+03:00, gkodinov@macbook.gmz +6 -0 Bug #22367: Optimizer uses ref join type instead of eq_ref for simple join on strings MySQL is setting the flag HA_END_SPACE_KEYS for all the keys that reference text or varchar columns with collation different than binary. This was done to handle correctly the situation where a lookup on such a key may return more than 1 row because of the presence of many rows that differ only by the amount of trailing space in the table's string column. Inserting such values however appears to violate the unique checks on INSERT/UPDATE. Thus that flag must not be set as it will prevent the optimizer from choosing a faster access method. This fix removes the setting of the HA_END_SPACE_KEYS flag.
[21 Oct 2006 9:13]
Georgi Kodinov
Pushed in 5.0.27/5.1.13-beta
[26 Oct 2006 3:31]
Paul DuBois
Noted in 5.0.30, 5.1.13 changelogs.

Description: Hi, When doing a simple join with strings, I figured out MySQL was using ref join type instead of eq_ref. Is this expected ? How to repeat: CREATE TABLE a (`id` int(10) unsigned NOT NULL auto_increment,`createdby` varchar(20) NOT NULL default '',PRIMARY KEY (`id`)); CREATE TABLE b (`id` int(10) unsigned NOT NULL auto_increment,`login` varchar(20) NOT NULL default '',PRIMARY KEY (`id`),UNIQUE KEY (login)); INSERT INTO a VALUES (1,'a'),(2,'b'),(3,'c'); INSERT INTO b VALUES (1,'a'),(2,'b'),(3,'c'); EXPLAIN SELECT a.id FROM a LEFT JOIN b ON b.login=a.createdby WHERE a.id=3; +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ | 1 | SIMPLE | a | const | PRIMARY | PRIMARY | 4 | const | 1 | | | 1 | SIMPLE | b | ref | login | login | 22 | const | 1 | Using index | +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ 2 rows in set (0.00 sec) I would have expect MySQL to use an eq_ref optimisation for table b since it has a unique key on 'login'. BTW, using EXPLAIN EXTENDED, it reports something strange : mysql> SHOW WARNINGS; +-------+------+-------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+-------------------------------------------------------------------------------------------------------------------------------+ | Note | 1003 | select `test`.`a`.`id` AS `id` from `test`.`a` left join `test`.`b` on((`test`.`b`.`login` = `test`.`a`.`createdby`)) where 1 | +-------+------+-------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) Why do we have WHERE 1 here, instead of where `test`.`a`.`id` = 3 ? Regards, Jocelyn