Bug #39287 Binary conversion of a multi-byte encoded string leads to wrong sort order
Submitted: 6 Sep 2008 12:56 Modified: 18 Jan 2018 13:12
Reporter: Andreas Peer Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:5.0.51a, 5.0.67, 5.1.26 OS:Linux (Ubuntu Hardy 8.04 AMD64)
Assigned to: Assigned Account CPU Architecture:Any
Tags: BINARY, multi-byte encoding, order by, sorting, ucs2

[6 Sep 2008 12:56] Andreas Peer
Description:
When using a multi-byte encoding such as ucs2 for storing a varchar(x) column and converting that column to binary, only the first x bytes instead of the whole string are considered for sorting in a "ORDER BY" clause.

How to repeat:
CREATE TABLE t (s VARCHAR(2) CHARSET ucs2);
INSERT INTO t VALUES('a'), ('ab');

SELECT * FROM t ORDER BY BINARY(s) ASC;
SELECT * FROM t ORDER BY BINARY(s) DESC;
-> both selects return the same order of the rows, whereas

SELECT * FROM t ORDER BY s ASC
SELECT * FROM t ORDER BY s DESC
gives the (correct) different order.

Since ucs2 uses two bytes to store a character, only the first half of the characters influences the sorting when converting to binary.

Suggested fix:
Consider ALL the bytes of a multi-byte encoding for binary sorting.
[7 Sep 2008 18:48] Valeriy Kravchuk
Thank you for a problem report. Verified just as described with 5.0.67 and 5.1.26:

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot -P3310 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 13
Server version: 5.1.26-rc-community-log MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> CREATE TABLE t (s VARCHAR(2) CHARSET ucs2);
Query OK, 0 rows affected (0.44 sec)

mysql> INSERT INTO t VALUES('a'), ('ab');
Query OK, 2 rows affected (0.14 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM t ORDER BY BINARY(s) ASC;
+------+
| s    |
+------+
| a    |
| ab   |
+------+
2 rows in set (0.02 sec)

mysql> SELECT * FROM t ORDER BY BINARY(s) DESC;
+------+
| s    |
+------+
| a    |
| ab   |
+------+
2 rows in set (0.00 sec)

mysql> SELECT hex( BINARY(s)) FROM t ORDER BY BINARY(s) DESC;
+-----------------+
| hex( BINARY(s)) |
+-----------------+
| 0061            |
| 00610062        |
+-----------------+
2 rows in set (0.02 sec)

mysql> SELECT hex( BINARY(s)) FROM t ORDER BY BINARY(s) ASC;
+-----------------+
| hex( BINARY(s)) |
+-----------------+
| 0061            |
| 00610062        |
+-----------------+
2 rows in set (0.00 sec)
[18 Jan 2018 13:12] Erlend Dahl
This was fixed in 5.7.8.