| Bug #39287 | Binary conversion of a multi-byte encoded string leads to wrong sort order | ||
|---|---|---|---|
| Submitted: | 6 Sep 2008 12:56 | Modified: | 18 Jan 2018 13:12 |
| Reporter: | Andreas Peer | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) |
| Version: | 5.0.51a, 5.0.67, 5.1.26 | OS: | Linux (Ubuntu Hardy 8.04 AMD64) |
| Assigned to: | Assigned Account | CPU Architecture: | Any |
| Tags: | BINARY, multi-byte encoding, order by, sorting, ucs2 | ||
[7 Sep 2008 18:48]
Valeriy Kravchuk
Thank you for a problem report. Verified just as described with 5.0.67 and 5.1.26:
C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot -P3310 test
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 13
Server version: 5.1.26-rc-community-log MySQL Community Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> CREATE TABLE t (s VARCHAR(2) CHARSET ucs2);
Query OK, 0 rows affected (0.44 sec)
mysql> INSERT INTO t VALUES('a'), ('ab');
Query OK, 2 rows affected (0.14 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM t ORDER BY BINARY(s) ASC;
+------+
| s |
+------+
| a |
| ab |
+------+
2 rows in set (0.02 sec)
mysql> SELECT * FROM t ORDER BY BINARY(s) DESC;
+------+
| s |
+------+
| a |
| ab |
+------+
2 rows in set (0.00 sec)
mysql> SELECT hex( BINARY(s)) FROM t ORDER BY BINARY(s) DESC;
+-----------------+
| hex( BINARY(s)) |
+-----------------+
| 0061 |
| 00610062 |
+-----------------+
2 rows in set (0.02 sec)
mysql> SELECT hex( BINARY(s)) FROM t ORDER BY BINARY(s) ASC;
+-----------------+
| hex( BINARY(s)) |
+-----------------+
| 0061 |
| 00610062 |
+-----------------+
2 rows in set (0.00 sec)
[18 Jan 2018 13:12]
Erlend Dahl
This was fixed in 5.7.8.

Description: When using a multi-byte encoding such as ucs2 for storing a varchar(x) column and converting that column to binary, only the first x bytes instead of the whole string are considered for sorting in a "ORDER BY" clause. How to repeat: CREATE TABLE t (s VARCHAR(2) CHARSET ucs2); INSERT INTO t VALUES('a'), ('ab'); SELECT * FROM t ORDER BY BINARY(s) ASC; SELECT * FROM t ORDER BY BINARY(s) DESC; -> both selects return the same order of the rows, whereas SELECT * FROM t ORDER BY s ASC SELECT * FROM t ORDER BY s DESC gives the (correct) different order. Since ucs2 uses two bytes to store a character, only the first half of the characters influences the sorting when converting to binary. Suggested fix: Consider ALL the bytes of a multi-byte encoding for binary sorting.