Bug #20536 | md5() with GROUP BY and UCS2 return different results on myisam/innodb | ||
---|---|---|---|
Submitted: | 19 Jun 2006 13:36 | Modified: | 25 Sep 2006 15:00 |
Reporter: | Shane Bester (Platinum Quality Contributor) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) |
Version: | 5.0.23BK, 4.1 | OS: | Any (*) |
Assigned to: | Timothy Smith | CPU Architecture: | Any |
[19 Jun 2006 13:36]
Shane Bester
[27 Jun 2006 21:22]
Timothy Smith
This bug is not MyISAM vs. InnoDB; the test case happens to depend on that. For example, add ORDER BY name to the select statement, and the InnoDB query no longer displays the problem; remove the GROUP BY, and the problem shows up with MyISAM. I'm not sure yet why it sometimes shows up, and sometimes doesn't. The following patch fixes the problem, but there is more work required to understand what other functions may be affected by this same source problem. ===== item_strfunc.cc 1.271 vs edited ===== --- 1.271/sql/item_strfunc.cc 2006-06-26 16:50:50 -04:00 +++ edited/item_strfunc.cc 2006-06-27 16:58:35 -04:00 @@ -145,6 +145,7 @@ digest[8], digest[9], digest[10], digest[11], digest[12], digest[13], digest[14], digest[15]); str->length((uint) 32); + str->set_charset(&my_charset_bin); return str; } null_value=1; Need to investigate SHA1() and AES_ENCRYPT() as well.
[27 Jun 2006 21:33]
Timothy Smith
With the InnoDB test table, the following expose the bug: select md5(name) from t1; select md5(name) from t1 order by id; select md5(name) from t1 order by name; These two fail on the InnoDB table, but produce correct results on the MyISAM table: select md5(name) from t1 group by id order by id; select md5(name) from t1 group by id; And the following return correct results on both tables: select md5(name) from t1 group by id order by name; select md5(name) from t1 group by name; select md5(name) from t1 group by name order by id;
[14 Jul 2006 2:51]
Timothy Smith
The difference in the queries is that those which return correct results are 'Using temporary'. There is a longstanding problem, where the character set information is thrown away, for values processed via a temporary table. In this case, it threw away the bogus character set, and so the correct values (in the 'binary' character set) were returned. The correct fix for this particular bug is to force the result to be 'binary', no matter what the character set of the argument. The same bug shows up in at least SHA1(), PASSWORD(), OLD_PASSWORD, ENCRYPT(), MAKE_SET(), EXPORT_SET(), QUOTE(). I'm looking for the correct way to fix it in those. create table bug20536 (id bigint not null auto_increment primary key, name varchar(255) character set ucs2 not null); insert into `bug20536` (`id`,`name`) values (1,'test1'), (2,'test2'); select md5(name) from bug20536; select sha1(name) from bug20536; select password(name) from bug20536; select old_password(name) from bug20536; select encrypt(name) from bug20536; select make_set(3, name, name) from bug20536; select export_set(5, name, name) from bug20536; select quote(name) from bug20536; drop table bug20536; Current results are: mysql> select md5(name) from bug20536; +----------------------------------+ | md5(name) | +----------------------------------+ | f4b7ce8b45a20e3c4e84bef515d1525c | | ???????????????? | +----------------------------------+ 2 rows in set (0.00 sec) mysql> select sha1(name) from bug20536; +------------------------------------------+ | sha1(name) | +------------------------------------------+ | e0b52f38deddb9f9e8d5336b153592794cb49baf | | ???????????????????? | +------------------------------------------+ 2 rows in set (0.00 sec) mysql> select password(name) from bug20536; +----------------------+ | password(name) | +----------------------+ | ???????????????????? | | ???????????????????? | +----------------------+ 2 rows in set (0.00 sec) mysql> select old_password(name) from bug20536; +--------------------+ | old_password(name) | +--------------------+ | ???????? | | ???????? | +--------------------+ 2 rows in set (0.00 sec) mysql> select encrypt(name) from bug20536; +---------------+ | encrypt(name) | +---------------+ | ?????? | | ?????? | +---------------+ 2 rows in set (0.01 sec) mysql> select make_set(3, name, name) from bug20536; +-------------------------+ | make_set(3, name, name) | +-------------------------+ | test1????? | | test2????? | +-------------------------+ 2 rows in set (0.00 sec) mysql> select export_set(5, name, name) from bug20536; +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | export_set(5, name, name) | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1????? | | test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2????? | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 2 rows in set (0.00 sec) mysql> select quote(name) from bug20536; +-------------+ | quote(name) | +-------------+ | ???????? | | ???????? | +-------------+ 2 rows in set (0.00 sec) mysql> drop table bug20536; Query OK, 0 rows affected (0.00 sec)
[9 Aug 2006 11:56]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/10197 ChangeSet@1.2542, 2006-08-09 05:56:10-06:00, tsmith@siva.hindu.god +4 -0 Bug #20536: md5() with GROUP BY and UCS2 return different results on myisam/innodb Make the encryption functions MD5(), SHA1() and ENCRYPT() return binary results. Make MAKE_SET() and EXPORT_SET() use the correct character set for their default separator strings.
[11 Aug 2006 23:09]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/10324 ChangeSet@1.2535, 2006-08-11 17:09:19-06:00, tsmith@siva.hindu.god +4 -0 Bug #20536: md5() with GROUP BY and UCS2 return different results on myisam/innodb Make the encryption functions MD5(), SHA1() and ENCRYPT() return binary results. Make MAKE_SET() and EXPORT_SET() use the correct character set for their default separator strings.
[30 Aug 2006 15:58]
Timothy Smith
queued to -maint trees (4.1, 5.0, 5.1)
[31 Aug 2006 11:39]
Magnus BlÄudd
Pushed to 5.0.25
[12 Sep 2006 20:03]
Paul DuBois
Noted in 5.0.25 changelog. Use of MD5() with a ucs2 character column combined with GROUP BY could return different results for MyISAM than for InnoDB. Resetting report to NDI pending merge into other trees.
[13 Sep 2006 8:21]
Timothy Smith
Pushed to 5.1.12.
[25 Sep 2006 15:00]
Paul DuBois
Corrected the changeset comment: The MD5(), SHA1(), and ENCRYPT() functions should return a binary string, but the result sometimes was converted to the character set of the argument. MAKE_SET() and EXPORT_SET() now use the correct character set for their default separators, resulting in consisten tresult strings which can be coerced according to normal character set rules. Noted this also in the 4.1.22, 5.1.12 changelogs.