| Bug #22367 | Optimizer uses ref join type instead of eq_ref for simple join on strings | ||
|---|---|---|---|
| Submitted: | 14 Sep 2006 20:52 | Modified: | 26 Oct 2006 3:31 | 
| Reporter: | jocelyn fournier (Silver Quality Contributor) | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) | 
| Version: | 5.0.24a, 5.1 BK | OS: | Linux (Linux) | 
| Assigned to: | Georgi Kodinov | CPU Architecture: | Any | 
   [14 Sep 2006 20:52]
   jocelyn fournier        
  
 
   [15 Sep 2006 12:59]
   Sveta Smirnova        
  Thank you for the report. According to http://dev.mysql.com/doc/refman/5.0/en/explain.html and quotes from it: ----<START QUOTE>---- eq_ref ... It is used when all parts of an index are used by the join and the index is a PRIMARY KEY or UNIQUE index. ... ref ... ref is used if the join uses only a leftmost prefix of the key or if the key is not a PRIMARY KEY or UNIQUE index ----<END QUOTE>---- I mark this bug as "Verified". MySQL 4.1 is not affected: <skip filling tables> mysql> EXPLAIN SELECT a.id FROM a LEFT JOIN b ON b.login=a.createdby WHERE -> a.id=3; +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ | 1 | SIMPLE | a | const | PRIMARY | PRIMARY | 4 | const | 1 | | | 1 | SIMPLE | b | const | login | login | 20 | const | 1 | Using index | +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ 2 rows in set (0.00 sec)
   [27 Sep 2006 8:04]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/12590 ChangeSet@1.2280, 2006-09-27 11:04:07+03:00, gkodinov@macbook.gmz +6 -0 Bug #22367: Optimizer uses ref join type instead of eq_ref for simple join on strings MySQL is setting the flag HA_END_SPACE_KEYS for all the keys that reference text or varchar columns with collation different than binary. This was done to handle correctly the situation where a lookup on such a key may return more than 1 row because of the presence of many rows that differ only by the amount of trailing space in the table's string column. Inserting such values however appears to violate the unique checks on INSERT/UPDATE. Thus that flag must not be set as it will prevent the optimizer from choosing a faster access method. This fix removes the setting of the HA_END_SPACE_KEYS flag.
   [27 Sep 2006 14:13]
   jocelyn fournier        
  Hi, -Note 1003 select `test`.`t8`.`pseudo` AS `pseudo`,(select `test`.`t8`.`email` AS `email` from `test`.`t8` where (`test`.`t8`.`pseudo` = (select `test`.`t8`.`pseudo` AS `pseudo` from `test`.`t8` where (`test`.`t8`.`pseudo` = _latin1'joce')))) AS `(SELECT email FROM t8 WHERE pseudo=(SELECT pseudo FROM t8 WHERE pseudo='joce'))` from `test`.`t8` where (`test`.`t8`.`pseudo` = (select `test`.`t8`.`pseudo` AS `pseudo` from `test`.`t8` where (`test`.`t8`.`pseudo` = _latin1'joce'))) +Note 1003 select `test`.`t8`.`pseudo` AS `pseudo`,(select `test`.`t8`.`email` AS `email` from `test`.`t8` where 1) AS `(SELECT email FROM t8 WHERE pseudo=(SELECT pseudo FROM t8 WHERE pseudo='joce'))` from `test`.`t8` where 1 => Any idea why the EXTENDED SELECT returns WHERE 1 whereas before it was returning properly the WHERE part of the query ? Regards, Jocelyn
   [5 Oct 2006 16:12]
   Sergei Golubchik        
  Yes (though it's just an idea, as you requested, I didn't check the code to verify it).
If you look at the output of the EXPLAIN - the table itself, not the warning - you'll see that optimizer was using 'ref' join, and after the fix it was changed to 'const'. The manual says:
"
   * `const'
     The table has at most one matching row, which is read at the
     start of the query. Because there is only one row, values
     from the column in this row can be regarded as constants by
     the rest of the optimizer.  `const' tables are very fast
     because they are read only once.
"
but actually there's more to it, for 'const' tables, MySQL read the matching row very early in the execution, in fact, it reads it during optimizing, many optimizations are done _after_ this one row is read. MySQL reads this row and basically substitutes column references with plain constants - values from this row. In the remaining of optimizations these values are treated exactly like constants (e.g. they're subject to constant propagation).
The above was a fact, now goes an idea.
I suspect that after reading const tables and replacing appropriate columns with constants WHERE clause is optimized away. the warning is generated from the transformed query after WHERE is optimized away, so you see no WHERE clause. On the other hand, when prining an expression tree every column prints its original name, even if it was transformed to a constant. So, the printed query looks illogical.
 
   [5 Oct 2006 16:35]
   jocelyn fournier        
  Hi Sergei, Thanks for the explanation ! Do you want me to open an other bug report for this issue ? Thanks, Jocelyn
   [6 Oct 2006 16:57]
   Sergei Golubchik        
  No need to, I suspect it's the same as Bug#22331 (and Bug#21939 is also related)
   [16 Oct 2006 15:11]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13749 ChangeSet@1.2280, 2006-10-16 18:09:58+03:00, gkodinov@macbook.gmz +6 -0 Bug #22367: Optimizer uses ref join type instead of eq_ref for simple join on strings MySQL is setting the flag HA_END_SPACE_KEYS for all the keys that reference text or varchar columns with collation different than binary. This was done to handle correctly the situation where a lookup on such a key may return more than 1 row because of the presence of many rows that differ only by the amount of trailing space in the table's string column. Inserting such values however appears to violate the unique checks on INSERT/UPDATE. Thus that flag must not be set as it will prevent the optimizer from choosing a faster access method. This fix removes the setting of the HA_END_SPACE_KEYS flag.
   [21 Oct 2006 9:13]
   Georgi Kodinov        
  Pushed in 5.0.27/5.1.13-beta
   [26 Oct 2006 3:31]
   Paul DuBois        
  Noted in 5.0.30, 5.1.13 changelogs.

