Bug #74295 Unicode "utf8_5624_1" collation order by bug
Submitted: 9 Oct 2014 11:08 Modified: 11 Apr 2018 10:46
Reporter: shinwell lee Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Charsets Severity:S2 (Serious)
Version:5.6, 5.7.5 OS:Windows (Window 7)
Assigned to: CPU Architecture:Any
Tags: collation, order by, utf8_5624_1

[9 Oct 2014 11:08] shinwell lee
Description:
I copy the "utf8_5624_1" collation from the latest 5.6.21 version(mysql-test\std_data\Index.xml) to my index.xml file(share\charsets\Index.xml).

Then, i create a test table, the scripts is:
-----------------------------------------
CREATE TABLE `ta` (
`autoid` INT(11) NOT NULL AUTO_INCREMENT,
`tf01` VARCHAR(50) NULL DEFAULT NULL COLLATE 'utf8_5624_1',
PRIMARY KEY (`autoid`)
)
COLLATE='utf8_5624_1'
ENGINE=InnoDB;

I insert some test data into the ta table:
-----------------------------------------
INSERT INTO `ta` (`tf01`) VALUES ('00002');
INSERT INTO `ta` (`tf01`) VALUES ('00003');
INSERT INTO `ta` (`tf01`) VALUES ('00006');
INSERT INTO `ta` (`tf01`) VALUES ('02');
INSERT INTO `ta` (`tf01`) VALUES ('03');
INSERT INTO `ta` (`tf01`) VALUES ('06');
INSERT INTO `ta` (`tf01`) VALUES ('00001');
INSERT INTO `ta` (`tf01`) VALUES ('01');

Finally, i run the query script:
-----------------------------------------
select tf01 from ta order by tf01

The result of query is:
-----------------------------------------
00002
00003
00006
02
03
06
00001
01

I expect the result of query is:
-----------------------------------------
00001
00002
00003
00006
01
02
03
06

How to repeat:
As above.
[9 Oct 2014 15:11] Peter Laursen
I do not find this collation listed in documentation (http://dev.mysql.com/doc/refman/5.6/en/charset-unicode-sets.html) at all. 

Googling it finds only a few hits being mostly ~3 years old, including
http://lists.mysql.com/commits/140759 (commit by Mark Alff)
http://forums.mysql.com/read.php?103,411031,416481 (an announcement by Alex Barkov)

This is mysterious! :-)

-- Peter
-- not a MySQL/Oracle persn
[9 Oct 2014 16:35] shinwell lee
You can download the latest source code of mysql(mysql-5.6.21.zip).
The "utf8_5624_1" collation is included in the mysql-test\std_data\Index.xml.
[10 Oct 2014 1:29] shinwell lee
I delete this section, it works fine now!
---------------------------------------------------------------
<!-- long contractions and expansions -->
<reset>12345</reset><q>012345</q><q>12345</q>
<reset>1234</reset><q>001234</q><q>01234</q>
<reset>123</reset><q>000123</q><q>00123</q><q>0123</q>
<reset>12</reset><q>000012</q><q>00012</q><q>0012</q><q>012</q>
<reset>1</reset><q>000001</q><q>00001</q><q>0001</q><q>001</q><q>01</q>
[10 Oct 2014 8:03] Peter Laursen
If this collation is still supposed to be supported, shouldn't it have a more reasonable/human readable name?
[27 Nov 2014 8:12] MySQL Verification Team
Thank you for the report.

// Able to repeat the reported issue after overwriting ./sql/share/charsets/Index.xml with ./mysql-test/std_data/Index.xml 

// source build
[ushastry@cluster-repo setups]$ md5sum mysql-5.7.5-m15.tar.gz
95aa1873fd414482a1a88752cbde46bd  mysql-5.7.5-m15.tar.gz
[ushastry@cluster-repo setups]$ cd mysql-5.7.5-m15
[ushastry@cluster-repo mysql-5.7.5-m15]$ cp  ./mysql-test/std_data/Index.xml ./sql/share/charsets/Index.xml
[ushastry@cluster-repo mysql-5.7.5-m15]$ cmake . -DCMAKE_INSTALL_PREFIX=/data/ushastry/setups/mysql-5.7.5-m15 -DDOWNLOAD_BOOST=1 -DWITH_BOOST=/tmp/
[ushastry@cluster-repo mysql-5.7.5-m15]$ make
[ushastry@cluster-repo mysql-5.7.5-m15]$ make install

// start up
bin/mysql_install_db --basedir=/data/ushastry/setups/mysql-5.7.5-m15 --datadir=/tmp/bug --user=ushastry -v
bin/mysqld --basedir=/data/ushastry/setups/mysql-5.7.5-m15 --datadir=/tmp/bug --core-file --socket=/tmp/mysql.sock --port=3306 --log-error=/tmp/bug/log.err 2>&1 &

mysql> show variables like '%version%';
+-------------------------+---------------------+
| Variable_name           | Value               |
+-------------------------+---------------------+
| innodb_version          | 5.7.5               |
| protocol_version        | 10                  |
| slave_type_conversions  |                     |
| version                 | 5.7.5-m15           |
| version_comment         | Source distribution |
| version_compile_machine | x86_64              |
| version_compile_os      | Linux               |
+-------------------------+---------------------+
7 rows in set (0.00 sec)

mysql> CREATE TABLE `ta` (
    -> `autoid` INT(11) NOT NULL AUTO_INCREMENT,
    -> `tf01` VARCHAR(50) NULL DEFAULT NULL COLLATE 'utf8_5624_1',
    -> PRIMARY KEY (`autoid`)
    -> )
    -> COLLATE='utf8_5624_1'
    -> ENGINE=InnoDB;
Query OK, 0 rows affected (0.30 sec)

mysql> INSERT INTO `ta` (`tf01`) VALUES ('00002');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `ta` (`tf01`) VALUES ('00003');
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO `ta` (`tf01`) VALUES ('00006');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `ta` (`tf01`) VALUES ('02');
INSERT INTO `ta` (`tf01`) VALUES ('03');
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO `ta` (`tf01`) VALUES ('03');
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO `ta` (`tf01`) VALUES ('06');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `ta` (`tf01`) VALUES ('00001');
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO `ta` (`tf01`) VALUES ('01');
Query OK, 1 row affected (0.00 sec)

mysql> select tf01 from ta order by tf01;
+-------+
| tf01  |
+-------+
| 00002 |
| 00003 |
| 00006 |
| 02    |
| 03    |
| 06    |
| 00001 |
| 01    |
+-------+
8 rows in set (0.00 sec)
[27 Nov 2014 8:13] MySQL Verification Team
with default charset/collation:

mysql> CREATE TABLE `ta1` (
    -> `autoid` INT(11) NOT NULL AUTO_INCREMENT,
    -> `tf01` VARCHAR(50) NULL DEFAULT NULL,
    -> PRIMARY KEY (`autoid`)
    -> )ENGINE=InnoDB;
Query OK, 0 rows affected (0.04 sec)

mysql> INSERT INTO `ta1` (`tf01`) VALUES ('00002');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `ta1` (`tf01`) VALUES ('00003');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `ta1` (`tf01`) VALUES ('00006');
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO `ta1` (`tf01`) VALUES ('02');
INSERT INTO `ta1` (`tf01`) VALUES ('03');
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO `ta1` (`tf01`) VALUES ('03');
INSERT INTO `ta1` (`tf01`) VALUES ('06');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `ta1` (`tf01`) VALUES ('06');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `ta1` (`tf01`) VALUES ('00001');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `ta` (`tf01`) VALUES ('01');
Query OK, 1 row affected (0.00 sec)

mysql>
mysql> select tf01 from ta1 order by tf01;
+-------+
| tf01  |
+-------+
| 00001 |
| 00002 |
| 00003 |
| 00006 |
| 02    |
| 03    |
| 06    |
+-------+
7 rows in set (0.00 sec)
[11 Apr 2018 10:46] Erlend Dahl
Posted by developer:

[22 Jan 2018 0:53] Xing Z Zhang 

utf8_5624_1 is just for test use. User should not use it.