| Bug #20536 | md5() with GROUP BY and UCS2 return different results on myisam/innodb | ||
|---|---|---|---|
| Submitted: | 19 Jun 2006 15:36 | Modified: | 25 Sep 2006 17:00 |
| Reporter: | Shane Bester | ||
| Status: | Closed | ||
| Category: | Server: Charsets | Severity: | S3 (Non-critical) |
| Version: | 5.0.23BK, 4.1 | OS: | Any (*) |
| Assigned to: | Tim Smith | Target Version: | |
[27 Jun 2006 23:22]
Tim Smith
This bug is not MyISAM vs. InnoDB; the test case happens to depend on that. For example,
add ORDER BY name to the select statement, and the InnoDB query no longer displays the
problem; remove the GROUP BY, and the problem shows up with MyISAM.
I'm not sure yet why it sometimes shows up, and sometimes doesn't. The following patch
fixes the problem, but there is more work required to understand what other functions may
be affected by this same source problem.
===== item_strfunc.cc 1.271 vs edited =====
--- 1.271/sql/item_strfunc.cc 2006-06-26 16:50:50 -04:00
+++ edited/item_strfunc.cc 2006-06-27 16:58:35 -04:00
@@ -145,6 +145,7 @@
digest[8], digest[9], digest[10], digest[11],
digest[12], digest[13], digest[14], digest[15]);
str->length((uint) 32);
+ str->set_charset(&my_charset_bin);
return str;
}
null_value=1;
Need to investigate SHA1() and AES_ENCRYPT() as well.
[27 Jun 2006 23:33]
Tim Smith
With the InnoDB test table, the following expose the bug: select md5(name) from t1; select md5(name) from t1 order by id; select md5(name) from t1 order by name; These two fail on the InnoDB table, but produce correct results on the MyISAM table: select md5(name) from t1 group by id order by id; select md5(name) from t1 group by id; And the following return correct results on both tables: select md5(name) from t1 group by id order by name; select md5(name) from t1 group by name; select md5(name) from t1 group by name order by id;
[14 Jul 2006 4:51]
Tim Smith
The difference in the queries is that those which return correct results are 'Using
temporary'. There is a longstanding problem, where the character set information is
thrown away, for values processed via a temporary table. In this case, it threw away the
bogus character set, and so the correct values (in the 'binary' character set) were
returned.
The correct fix for this particular bug is to force the result to be 'binary', no matter
what the character set of the argument.
The same bug shows up in at least SHA1(), PASSWORD(), OLD_PASSWORD, ENCRYPT(), MAKE_SET(),
EXPORT_SET(), QUOTE().
I'm looking for the correct way to fix it in those.
create table bug20536 (id bigint not null auto_increment primary key, name varchar(255)
character set ucs2 not null);
insert into `bug20536` (`id`,`name`) values (1,'test1'), (2,'test2');
select md5(name) from bug20536;
select sha1(name) from bug20536;
select password(name) from bug20536;
select old_password(name) from bug20536;
select encrypt(name) from bug20536;
select make_set(3, name, name) from bug20536;
select export_set(5, name, name) from bug20536;
select quote(name) from bug20536;
drop table bug20536;
Current results are:
mysql> select md5(name) from bug20536;
+----------------------------------+
| md5(name) |
+----------------------------------+
| f4b7ce8b45a20e3c4e84bef515d1525c |
| ???????????????? |
+----------------------------------+
2 rows in set (0.00 sec)
mysql> select sha1(name) from bug20536;
+------------------------------------------+
| sha1(name) |
+------------------------------------------+
| e0b52f38deddb9f9e8d5336b153592794cb49baf |
| ???????????????????? |
+------------------------------------------+
2 rows in set (0.00 sec)
mysql> select password(name) from bug20536;
+----------------------+
| password(name) |
+----------------------+
| ???????????????????? |
| ???????????????????? |
+----------------------+
2 rows in set (0.00 sec)
mysql> select old_password(name) from bug20536;
+--------------------+
| old_password(name) |
+--------------------+
| ???????? |
| ???????? |
+--------------------+
2 rows in set (0.00 sec)
mysql> select encrypt(name) from bug20536;
+---------------+
| encrypt(name) |
+---------------+
| ?????? |
| ?????? |
+---------------+
2 rows in set (0.01 sec)
mysql> select make_set(3, name, name) from bug20536;
+-------------------------+
| make_set(3, name, name) |
+-------------------------+
| test1????? |
| test2????? |
+-------------------------+
2 rows in set (0.00 sec)
mysql> select export_set(5, name, name) from bug20536;
+-----------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------+
| export_set(5, name, name)
|
+-----------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------+
|
test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????te
st1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test
1??????test1??????test1??????test1??????test1??????test1??????test1??????test1??????test1?
?????test1??????test1??????test1??????test1??????test1??????test1??????test1????? |
|
test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????te
st2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test
2??????test2??????test2??????test2??????test2??????test2??????test2??????test2??????test2?
?????test2??????test2??????test2??????test2??????test2??????test2??????test2????? |
+-----------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
mysql> select quote(name) from bug20536;
+-------------+
| quote(name) |
+-------------+
| ???????? |
| ???????? |
+-------------+
2 rows in set (0.00 sec)
mysql> drop table bug20536;
Query OK, 0 rows affected (0.00 sec)
[9 Aug 2006 13:56]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/10197 ChangeSet@1.2542, 2006-08-09 05:56:10-06:00, tsmith@siva.hindu.god +4 -0 Bug #20536: md5() with GROUP BY and UCS2 return different results on myisam/innodb Make the encryption functions MD5(), SHA1() and ENCRYPT() return binary results. Make MAKE_SET() and EXPORT_SET() use the correct character set for their default separator strings.
[12 Aug 2006 1:09]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/10324 ChangeSet@1.2535, 2006-08-11 17:09:19-06:00, tsmith@siva.hindu.god +4 -0 Bug #20536: md5() with GROUP BY and UCS2 return different results on myisam/innodb Make the encryption functions MD5(), SHA1() and ENCRYPT() return binary results. Make MAKE_SET() and EXPORT_SET() use the correct character set for their default separator strings.
[30 Aug 2006 17:58]
Tim Smith
queued to -maint trees (4.1, 5.0, 5.1)
[31 Aug 2006 13:39]
Magnus Svensson
Pushed to 5.0.25
[12 Sep 2006 22:03]
Paul DuBois
Noted in 5.0.25 changelog. Use of MD5() with a ucs2 character column combined with GROUP BY could return different results for MyISAM than for InnoDB. Resetting report to NDI pending merge into other trees.
[13 Sep 2006 10:21]
Tim Smith
Pushed to 5.1.12.
[25 Sep 2006 17:00]
Paul DuBois
Corrected the changeset comment: The MD5(), SHA1(), and ENCRYPT() functions should return a binary string, but the result sometimes was converted to the character set of the argument. MAKE_SET() and EXPORT_SET() now use the correct character set for their default separators, resulting in consisten tresult strings which can be coerced according to normal character set rules. Noted this also in the 4.1.22, 5.1.12 changelogs.

Description: md5() with GROUP BY andUCS2 return different results on myisam/innodb. The results should be the same no matter which engine is used. Surrounding md5 call with hex() makes the results identical. unhex/hex can be used as a workaround. See result from 5.0BK mysql> create table t1(id bigint not null auto_increment pr Query OK, 0 rows affected (0.03 sec) mysql> INSERT INTO `t1` (`id`,`name`) VALUES -> (1,'test1'), -> (2,'test2'), -> (3,'test3'), -> (4,'test4'), -> (5,'test5'); Query OK, 5 rows affected (0.00 sec) Records: 5 Duplicates: 0 Warnings: 0 mysql> mysql> select md5(name) as hash from t1 group by id; +----------------------------------+ | hash | +----------------------------------+ | f4b7ce8b45a20e3c4e84bef515d1525c | | ???????????????? | | ???????????????? | | ???????????????? | | ???????????????? | +----------------------------------+ 5 rows in set (0.03 sec) mysql> alter table t1 engine=myisam; Query OK, 5 rows affected (0.08 sec) Records: 5 Duplicates: 0 Warnings: 0 mysql> select md5(name) as hash from t1 group by id; +----------------------------------+ | hash | +----------------------------------+ | f4b7ce8b45a20e3c4e84bef515d1525c | | 7823e21140de44563e17ad930d5a4a31 | | 189c2715677870a83a06c32f85c8398f | | cc14ebacc96404db9d3cfe87a702613a | | a3b6e94c168c9525a92cad9b42d1dc56 | +----------------------------------+ 5 rows in set (0.02 sec) mysql> How to repeat: drop table if exists t1; create table t1(id bigint not null auto_increment primary key, name varchar(255) character set ucs2 NOT NULL)engine=innodb; INSERT INTO `t1` (`id`,`name`) VALUES (1,'test1'), (2,'test2'), (3,'test3'), (4,'test4'), (5,'test5'); select md5(name) as hash from t1 group by id; alter table t1 engine=myisam; select md5(name) as hash from t1 group by id; Suggested fix: not sure..