Bug #41350 | String expression collation determined only by the first equality operand | ||
---|---|---|---|
Submitted: | 10 Dec 2008 8:00 | Modified: | 10 Dec 2008 8:30 |
Reporter: | Ilan Hazan | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.0.46-log | OS: | Linux |
Assigned to: | CPU Architecture: | Any | |
Tags: | casting, type conversion |
[10 Dec 2008 8:00]
Ilan Hazan
[10 Dec 2008 8:30]
Valeriy Kravchuk
Thank you for a problem report. I do not see anything in the manual (including http://dev.mysql.com/doc/refman/5.0/en/charset-collate-tricky.html) that explains how collation is determined in cases like this (utf8_*_ci vs. utf8_bin), so this is at least a valid documentation request.
[19 Dec 2008 7:07]
Alexander Barkov
A full script repeating the same problem: DROP TABLE IF EXISTS t1, t2; CREATE TABLE t1 ( entry_id int(11) NOT NULL auto_increment PRIMARY KEY, outsource_key varchar(255) character set utf8 collate utf8_bin NOT NULL, UNIQUE KEY outsource_key (outsource_key) ) ENGINE=MyISAM DEFAULT CHARSET=utf8; INSERT INTO t1 VALUES (1,'key1'),(2,'key2'),(10006,'key10006'); CREATE TABLE t2 ( p_id char(10) NOT NULL PRIMARY KEY, NAME char(89) default NULL, KEY NAME (NAME) ) ENGINE=MyISAM DEFAULT CHARSET=utf8; INSERT INTO t2 VALUES ('key1','name1'),('key2','name2'),('key10006','name10006'); EXPLAIN SELECT t1.entry_id, t1.outsource_key, t2.* FROM t1 JOIN t2 WHERE (t1.outsource_key = t2.p_id ) AND t1.entry_id =10006; EXPLAIN SELECT t1.entry_id, t1.outsource_key, t2.* FROM t1 JOIN t2 WHERE (t2.p_id = t1.outsource_key ) AND t1.entry_id =10006; And the output is: +----+-------------+-------+-------+-----------------------+---------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+-----------------------+---------+---------+-------+------+-------------+ | 1 | SIMPLE | t1 | const | PRIMARY,outsource_key | PRIMARY | 4 | const | 1 | | | 1 | SIMPLE | t2 | ALL | PRIMARY | NULL | NULL | NULL | 3 | Using where | +----+-------------+-------+-------+-----------------------+---------+---------+-------+------+-------------+ +----+-------------+-------+-------+-----------------------+---------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+-----------------------+---------+---------+-------+------+-------------+ | 1 | SIMPLE | t1 | const | PRIMARY,outsource_key | PRIMARY | 4 | const | 1 | | | 1 | SIMPLE | t2 | range | PRIMARY | PRIMARY | 30 | NULL | 1 | Using where | +----+-------------+-------+-------+-----------------------+---------+---------+-------+------+-------------+
[19 Dec 2008 7:38]
Alexander Barkov
The order of arguments does not affect the way how the collation for the operation is determined. It is always utf8_bin in this case. I think there are two problems here: 1. This part seems to be missing in the manual: When mixing a _ci collation and a _bin collation, and coercibility of the operands are the same, the _bin collation wins. 2. Possibly it is a bug in the optimizer. The plan depends on the order of the arguments. The plan for the first query is not optimal. I think that in both cases the optimizer should choose the second plan.
[5 Jan 2009 19:44]
Paul DuBois
I have added Bar's information about mixing _bin/_ci collations to the tricky-collation section.
[13 Jan 2009 16:15]
Paul DuBois
I am unassigning myself from this report because it has been recategorized as an optimizer report.
[26 Feb 2009 22:09]
Omer Barnir
triage: setting tag to CHECKED (3144) - triaging as an optimization bug