Description:
Please see this test case.
```sql
CREATE TABLE gbk (id int, info varchar(10)) DEFAULT CHARSET = gbk;
CREATE TABLE utf (id int, info varchar(10)) DEFAULT CHARSET = utf8mb4;
INSERT INTO gbk VALUES (1, '?'), (2, 0xD7FA);
INSERT INTO utf VALUES (3, '?');
SELECT temp.id, temp.info
FROM (
SELECT g.id, u.info FROM gbk AS g JOIN utf AS u
ON g.info = u.info ) AS temp
WHERE temp.id < 2;
```
Note that `0xD7FA` is invalid in GBK.
The SQL above outputs:
```
+------+------+
| id | info |
+------+------+
| 1 | ? |
+------+------+
```
If we disable the following two optimization options:
```sql
SET optimizer_switch='derived_condition_pushdown=off,derived_merge=off';
```
Executing the same SELECT would return this error:
```
ERROR 3854 (HY000): Cannot convert string '\xD7\xFA' from gbk to utf8mb4
```
Query optimizations are expected to affect the performance only, not the result. When invalid bytes for legacy encodings are involved, we get inconsistent results for the same query under different optimizations.
We also noticed that `SELECT CONVERT(0xD7FA USING utf8mb4)` would return NULL with an encoding-related warning.
How to repeat:
$ sudo docker run -it -p 13306:3306 -e MYSQL_ROOT_PASSWORD=123456 mysql:8.4.9
$ mysql --host 127.0.0.1 -u root --password=123456 --port 13306
Then create a database and type the above test case.
Description: Please see this test case. ```sql CREATE TABLE gbk (id int, info varchar(10)) DEFAULT CHARSET = gbk; CREATE TABLE utf (id int, info varchar(10)) DEFAULT CHARSET = utf8mb4; INSERT INTO gbk VALUES (1, '?'), (2, 0xD7FA); INSERT INTO utf VALUES (3, '?'); SELECT temp.id, temp.info FROM ( SELECT g.id, u.info FROM gbk AS g JOIN utf AS u ON g.info = u.info ) AS temp WHERE temp.id < 2; ``` Note that `0xD7FA` is invalid in GBK. The SQL above outputs: ``` +------+------+ | id | info | +------+------+ | 1 | ? | +------+------+ ``` If we disable the following two optimization options: ```sql SET optimizer_switch='derived_condition_pushdown=off,derived_merge=off'; ``` Executing the same SELECT would return this error: ``` ERROR 3854 (HY000): Cannot convert string '\xD7\xFA' from gbk to utf8mb4 ``` Query optimizations are expected to affect the performance only, not the result. When invalid bytes for legacy encodings are involved, we get inconsistent results for the same query under different optimizations. We also noticed that `SELECT CONVERT(0xD7FA USING utf8mb4)` would return NULL with an encoding-related warning. How to repeat: $ sudo docker run -it -p 13306:3306 -e MYSQL_ROOT_PASSWORD=123456 mysql:8.4.9 $ mysql --host 127.0.0.1 -u root --password=123456 --port 13306 Then create a database and type the above test case.