| Bug #37244 | Character sets: short utf8_bin weight_string value | ||
|---|---|---|---|
| Submitted: | 6 Jun 2008 13:14 | Modified: | 1 Sep 2010 7:08 |
| Reporter: | Peter Gulutzan | Email Updates: | |
| Status: | Duplicate | Impact on me: | |
| Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) |
| Version: | 5.5, 5.6, 6.0.6-alpha-debug | OS: | Linux (SUSE 10 | 32-bit) |
| Assigned to: | Assigned Account | CPU Architecture: | Any |
[6 Jun 2008 16:03]
Sveta Smirnova
Thank you for the report. Verified as described.
[30 Aug 2010 12:48]
Alexander Barkov
The bug is reproducible with 5.6:
--------------
DROP TABLE IF EXISTS t1
--------------
Query OK, 0 rows affected (0.01 sec)
--------------
CREATE TABLE t1
(utf32 CHAR(1) CHARACTER SET utf32 COLLATE utf32_bin,
utf16 CHAR(1) CHARACTER SET utf16 COLLATE utf16_bin,
utf8mb4 CHAR(1) COLLATE utf8mb4_bin)
--------------
Query OK, 0 rows affected (0.03 sec)
--------------
INSERT INTO t1 VALUES (0x0000ff9d,0xff9d,0xefbe9d)
--------------
Query OK, 1 row affected (0.00 sec)
--------------
INSERT INTO t1 VALUES (0x00010384,0xd800df84,0xf0908e84)
--------------
Query OK, 1 row affected (0.02 sec)
--------------
INSERT INTO t1 VALUES (_utf32 0x385, _utf32 0x385, _utf32 0x385)
--------------
Query OK, 1 row affected (0.00 sec)
--------------
INSERT INTO t1 VALUES (_utf32 0x100000, _utf32 0x100000, _utf32 0x100000)
--------------
Query OK, 1 row affected (0.00 sec)
--------------
SELECT hex(utf16),hex(weight_string(utf16)) FROM t1 ORDER BY utf16
--------------
+------------+---------------------------+
| hex(utf16) | hex(weight_string(utf16)) |
+------------+---------------------------+
| DBC0DC00 | 0000 |
| D800DF84 | 0384 |
| 0385 | 0385 |
| FF9D | FF9D |
+------------+---------------------------+
4 rows in set (0.00 sec)
--------------
SELECT hex(utf32),hex(weight_string(utf32)) FROM t1 ORDER BY utf32
--------------
+------------+---------------------------+
| hex(utf32) | hex(weight_string(utf32)) |
+------------+---------------------------+
| 00100000 | 0000 |
| 00010384 | 0384 |
| 00000385 | 0385 |
| 0000FF9D | FF9D |
+------------+---------------------------+
4 rows in set (0.00 sec)
--------------
SELECT hex(utf8mb4), hex(weight_string(utf8mb4)) FROM t1 ORDER BY utf8mb4
--------------
+--------------+-----------------------------+
| hex(utf8mb4) | hex(weight_string(utf8mb4)) |
+--------------+-----------------------------+
| F4808080 | 0000 |
| F0908E84 | 0384 |
| CE85 | 0385 |
| EFBE9D | FF9D |
+--------------+-----------------------------+
4 rows in set (0.00 sec)
[30 Aug 2010 12:51]
Alexander Barkov
The bug is reproducible with 5.5:
Note: although the weight_string() function does not exist 5.5 yet,
this script demonstrates that ORDER BY returns results in a wrong order:
--------------
DROP TABLE IF EXISTS t1
--------------
Query OK, 0 rows affected (0.01 sec)
--------------
CREATE TABLE t1
(utf32 CHAR(1) CHARACTER SET utf32 COLLATE utf32_bin,
utf16 CHAR(1) CHARACTER SET utf16 COLLATE utf16_bin,
utf8mb4 CHAR(1) COLLATE utf8mb4_bin)
--------------
Query OK, 0 rows affected (0.01 sec)
--------------
INSERT INTO t1 VALUES (0x0000ff9d,0xff9d,0xefbe9d)
--------------
Query OK, 1 row affected (0.01 sec)
--------------
INSERT INTO t1 VALUES (0x00010384,0xd800df84,0xf0908e84)
--------------
Query OK, 1 row affected (0.00 sec)
--------------
INSERT INTO t1 VALUES (_utf32 0x385, _utf32 0x385, _utf32 0x385)
--------------
Query OK, 1 row affected (0.00 sec)
--------------
INSERT INTO t1 VALUES (_utf32 0x100000, _utf32 0x100000, _utf32 0x100000)
--------------
Query OK, 1 row affected (0.00 sec)
--------------
SELECT hex(utf16) FROM t1 ORDER BY utf16
--------------
+------------+
| hex(utf16) |
+------------+
| DBC0DC00 |
| D800DF84 |
| 0385 |
| FF9D |
+------------+
4 rows in set (0.01 sec)
--------------
SELECT hex(utf32) FROM t1 ORDER BY utf32
--------------
+------------+
| hex(utf32) |
+------------+
| 00100000 |
| 00010384 |
| 00000385 |
| 0000FF9D |
+------------+
4 rows in set (0.00 sec)
--------------
SELECT hex(utf8mb4) FROM t1 ORDER BY utf8mb4
--------------
+--------------+
| hex(utf8mb4) |
+--------------+
| F4808080 |
| F0908E84 |
| CE85 |
| EFBE9D |
+--------------+
4 rows in set (0.00 sec)
[1 Sep 2010 7:07]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/117292 3251 Alexander Barkov 2010-09-01 This changeset is only adding tests to cover: Bug#37244 Character sets: short utf8_bin weight_string value The bug was fixed earlier, by the patch for Bug#55980 Character sets: supplementary character _bin ordering is wrong
[1 Sep 2010 7:08]
Alexander Barkov
Duplicate for: Bug#55980 Character sets: supplementary character _bin ordering is wrong
[2 Oct 2010 18:13]
Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alexander.nozdrin@oracle.com-20101002180948-852x1cuv7c6i85ea) (version source revid:alexander.nozdrin@oracle.com-20101002180857-an32jpuwzemsp4f2) (merge vers: 5.6.1-m4) (pib:21)
[2 Oct 2010 18:15]
Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101002181053-6iotvl26uurcoryp) (version source revid:alexander.nozdrin@oracle.com-20101002180917-h0n62akupm3z20nt) (pib:21)

Description: I create a table with three columns: utf32 collate utf32_bin, utf16 collate utf16_bin, utf8 collate utf8_bin. I insert a supplementary character: 10384 UGARITIC LETTER DELTA. I select hex(weight_string() for each column. The utf8 and utf32 values are both '0384'. How to repeat: create table t (utf32 char(1) collate utf32_bin, utf16 char(1) collate utf16_bin, utf8 char(1) collate utf8_bin); insert into t values (0x0000ff9d,0xff9d,0xefbe9d); insert into t values (0x00010384,0xd800df84,0xf0908e84); select hex(utf16),hex(weight_string(utf16)) from t; select hex(utf32),hex(weight_string(utf32)) from t; select hex(utf8), hex(weight_string(utf8)) from t; Example run: mysql> create table t -> (utf32 char(1) collate utf32_bin, -> utf16 char(1) collate utf16_bin, -> utf8 char(1) collate utf8_bin); Query OK, 0 rows affected (0.09 sec) mysql> insert into t values (0x0000ff9d,0xff9d,0xefbe9d); Query OK, 1 row affected (0.00 sec) mysql> insert into t values (0x00010384,0xd800df84,0xf0908e84); Query OK, 1 row affected (0.00 sec) mysql> select hex(utf16),hex(weight_string(utf16)) from t; +------------+---------------------------+ | hex(utf16) | hex(weight_string(utf16)) | +------------+---------------------------+ | FF9D | FF9D | | D800DF84 | D800DF84 | +------------+---------------------------+ 2 rows in set (0.00 sec) mysql> select hex(utf32),hex(weight_string(utf32)) from t; +------------+---------------------------+ | hex(utf32) | hex(weight_string(utf32)) | +------------+---------------------------+ | 0000FF9D | FF9D | | 00010384 | 0384 | +------------+---------------------------+ 2 rows in set (0.00 sec) mysql> select hex(utf8), hex(weight_string(utf8)) from t; +-----------+--------------------------+ | hex(utf8) | hex(weight_string(utf8)) | +-----------+--------------------------+ | EFBE9D | FF9D | | F0908E84 | 0384 | +-----------+--------------------------+ 2 rows in set (0.00 sec)