Bug #102086 the result of order by cast(field1 as char) is unexpect.
Submitted: 29 Dec 2020 12:06 Modified: 31 Dec 2020 1:15
Reporter: GONGTUI FU Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:8.0.18 OS:Red Hat (7.4)
Assigned to: CPU Architecture:Any
Tags: order by concat

[29 Dec 2020 12:06] GONGTUI FU
Description:
Hi, guys:
I might find out a bug about CAST function.
Assume a varchar field named 'a' includes '+', '-', when usinig ORDER BY CAST(a AS char) clause, the result is unexpect.

How to repeat:
create table t(a varchar(4));
insert into t value('+123'), ('-123');

mysql> select ascii('+'), ascii('-') from dual;
+------------+------------+
| ascii('+') | ascii('-') |
+------------+------------+
|         43 |         45 |
+------------+------------+

mysql> select * from t order by a;
+------+
| a    |
+------+
| +123 |
| -123 |
+------+

mysql> select * from t order by cast(a as char);
+------+
| a    |
+------+
| -123 |
| +123 |
+------+

Since '+' is less then '-', why CAST function takes '-' first by asc?

In oracle database, the results are in line with the expectation.

SQL> select * from t order by a;

A
----
+123
-123

SQL> select * from t order by cast(a as char);

A
----
+123
-123

SQL>  select ascii('+'), ascii('-') from dual;

ASCII('+') ASCII('-')
---------- ----------
        43         45
[29 Dec 2020 12:25] MySQL Verification Team
Thank you for the bug report. Not repeatable with current released version 8.0.22:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.22 MySQL Community Server - GPL

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE DATABASE yyy;
Query OK, 1 row affected (0.06 sec)

mysql> USE yyy
Database changed
mysql> create table t(a varchar(4));
Query OK, 0 rows affected (0.23 sec)

mysql> insert into t value('+123'), ('-123');
Query OK, 2 rows affected (0.02 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from t order by cast(a as char);
+------+
| a    |
+------+
| +123 |
| -123 |
+------+
2 rows in set (0.01 sec)

mysql>
[29 Dec 2020 12:33] MySQL Verification Team
Thank you for the bug report. Checking on Linux, now:

Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.22 MySQL Community Server - GPL

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE DATABASE yyy;
Query OK, 1 row affected (0.01 sec)

mysql> USE yyy
Database changed
mysql> create table t(a varchar(4));
Query OK, 0 rows affected (0.02 sec)

mysql> insert into t value('+123'), ('-123');
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql>  select * from t order by cast(a as char);
+------+
| a    |
+------+
| -123 |
| +123 |
+------+
2 rows in set (0.00 sec)

mysql>
[31 Dec 2020 1:15] GONGTUI FU
After debugging, it is found that the code that produces abnormal results is default_collation_for_utf8mb4: utf8mb4_0900_ai_ci, utf8mb4_0900_ai_ci has a weight uca900_p000 which is a short array defining in file 'strings/uca900_data.h'.
uca900_p000 is a mapping of ASCII,The definition is as follows:
  0x0616,   /* U+002B */  ascii '+'
  0x0222,   /* U+002C */
  0x020D,   /* U+002D */  ascii '-'
  0x0277,   /* U+002E */
  0x0394,   /* U+002F */
  0x1C3D,   /* U+0030 */
Can you explain why '+' mapping 0x0616 and '-' mapping 0x020d? thanks.