Bug #70815 | SELECT DISTINCT hashes Unicode space characters incorrectly | ||
---|---|---|---|
Submitted: | 5 Nov 2013 3:43 | Modified: | 21 May 2014 19:00 |
Reporter: | Arthur O'Dwyer | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S2 (Serious) |
Version: | 5.5.31, 5.5.34 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[5 Nov 2013 3:43]
Arthur O'Dwyer
[5 Nov 2013 5:52]
MySQL Verification Team
related: http://bugs.mysql.com/bug.php?id=69869
[5 Nov 2013 12:13]
Sveta Smirnova
Thank you for the report. Verified as described. Workaround: use CHAR: mysql> create table t (v char(100) collate utf8_unicode_ci); Query OK, 0 rows affected (0.33 sec) mysql> insert into t values (_binary 0x41C2A0), (_binary 0x41); Query OK, 2 rows affected (0.19 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> select concat(v, 'xxx'), hex(v) from t where v=_binary 0x41; +------------------+--------+ | concat(v, 'xxx') | hex(v) | +------------------+--------+ | A xxx | 41C2A0 | | Axxx | 41 | +------------------+--------+ 2 rows in set (0.00 sec) mysql> select v from t where v=_binary 0x41; +------+ | v | +------+ | A | | A | +------+ 2 rows in set (0.00 sec) mysql> select distinct v from t; +------+ | v | +------+ | A | +------+ 1 row in set (0.08 sec) mysql> alter table t add unique index k1(v(10)); ERROR 1062 (23000): Duplicate entry 'A' for key 'k1'
[5 Nov 2013 16:03]
Sveta Smirnova
Actually correct SELECT ... WHERE query should be: mysql> select v from t where v=convert(_binary 0x41 using UTF8) collate utf8_unicode_ci; +------+ | v | +------+ | A | | A | +------+ 2 rows in set (0.00 sec) But result is still same.
[21 May 2014 19:00]
Arthur O'Dwyer
Queries with a "GROUP BY" are also affected by this bug; we can get two distinct groups in the output corresponding to equivalent inputs.