Bug #111301 The query results in the CASE WHEN statement are inconsistent.
Submitted: 6 Jun 2023 14:44 Modified: 7 Jun 2023 12:55
Reporter: Chi Zhang Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:8.0.22 OS:Any
Assigned to: CPU Architecture:Any
Tags: innodb buffer pool

[6 Jun 2023 14:44] Chi Zhang
Description:
After mysqld is restarted, the query results of the CASE WHEN statement are inconsistent.

How to repeat:
1.Import the data. The required files are attached.

mysql> source tbl_1_all_type2index_desc.sql
mysql> source tbl_3_all_type2index_desc.sql

Attention, to reproduce the problem stably, the mysql_random_data_load tool is used to insert a large amount of random data into the preceding two SQL files.
The tool can be downloaded from the following website: https://github.com/Percona-Lab/mysql_random_data_load .Or download it in the attachment. 
This tool needs to be stored in the directory pointed to by the basedir system variable.
On line 84 of the tbl_1_all_type2index_desc.sql file and line 88 of the tbl_3_all_type2index_desc.sql file, the tool is used, just like this:

\! ./mysql_random_data_load test tbl_1_all_type2index_desc 4000 --host=127.0.0.1 --user=root --port=3306 --password=xxx

Modify the two lines according to the actual situation.

2.Then, run the following statement:
mysql> WITH A AS (SELECT int_col, char_col FROM tbl_3_all_type2index_desc A WHERE 0=0 AND A.char_col > 'aaaa') SELECT COMPRESS(A.int_col), A.char_col, CASE WHEN STRCMP(COMPRESS(A.int_col), A.char_col) THEN COMPRESS(A.int_col) ELSE A.char_col END AS char_col FROM A JOIN tbl_1_all_type2index_desc B ON A.int_col=B.int_col AND A.char_col=B.char_col WHERE 0=0 OR B.int_col IS NOT UNKNOWN ORDER BY 1,2 LIMIT 10;
+------------------------------------------+----------+---------------+
| COMPRESS(A.int_col)                      | char_col | char_col      |
+------------------------------------------+----------+---------------+
| 0x01000000789C33000000310031             | dol      |    x�3   1 1  |
| 0x01000000789C33000000310031             | dol      |    x�3   1 1  |
| 0x01000000789C33000000310031             | dol      |    x�3   1 1  |
| 0x01000000789C33000000310031             | dol      |    x�3   1 1  |
| 0x01000000789C33000000310031             | dol      |    x�3   1 1  |
| 0x01000000789C33000000310031             | dol      |    x�3   1 1  |
| 0x01000000789C33000000310031             | dol      |    x�3   1 1  |
| 0x01000000789C33000000310031             | dol      |    x�3   1 1  |
| 0x01000000789C33000000310031             | eos      |    x�3   1 1  |
| 0x01000000789C33000000310031             | fug      |    x�3   1 1  |
+------------------------------------------+----------+---------------+
10 rows in set (0.30 sec)

Restart mysqld to clear the innodb buffer pool cache. Run the statement again:
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed

mysql> WITH A AS (SELECT int_col, char_col FROM tbl_3_all_type2index_desc A WHERE 0=0 AND A.char_col > 'aaaa') SELECT COMPRESS(A.int_col), A.char_col, CASE WHEN STRCMP(COMPRESS(A.int_col), A.char_col) THEN COMPRESS(A.int_col) ELSE A.char_col END AS char_col FROM A JOIN tbl_1_all_type2index_desc B ON A.int_col=B.int_col AND A.char_col=B.char_col WHERE 0=0 OR B.int_col IS NOT UNKNOWN ORDER BY 1,2 LIMIT 10;
+------------------------------------------+----------+----------+
| COMPRESS(A.int_col)                      | char_col | char_col |
+------------------------------------------+----------+----------+
| 0x01000000789C33000000310031             | dol      |    x    |
| 0x01000000789C33000000310031             | dol      |    x    |
| 0x01000000789C33000000310031             | dol      |    x    |
| 0x01000000789C33000000310031             | dol      |    x    |
| 0x01000000789C33000000310031             | dol      |    x    |
| 0x01000000789C33000000310031             | dol      |    x    |
| 0x01000000789C33000000310031             | dol      |    x    |
| 0x01000000789C33000000310031             | dol      |    x    |
| 0x01000000789C33000000310031             | eos      |    x    |
| 0x01000000789C33000000310031             | fug      |    x    |
+------------------------------------------+----------+----------+
10 rows in set, 65535 warnings (1 min 23.78 sec)

You can see that the results in the third column of the two executions are inconsistent.Although your results may not be the same as mine due to the influence of the inserted random data, the inconsistency can be seen.

3.I also tested the problem in version 8.0.33, and unfortunately the statement did not execute successfully.
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed

mysql> WITH A AS (SELECT int_col, char_col FROM tbl_3_all_type2index_desc A WHERE 0=0 AND A.char_col > 'aaaa') SELECT COMPRESS(A.int_col), A.char_col, CASE WHEN STRCMP(COMPRESS(A.int_col), A.char_col) THEN COMPRESS(A.int_col) ELSE A.char_col END AS char_col FROM A JOIN tbl_1_all_type2index_desc B ON A.int_col=B.int_col AND A.char_col=B.char_col WHERE 0=0 OR B.int_col IS NOT UNKNOWN ORDER BY 1,2 LIMIT 10;
ERROR 3854 (HY000): Cannot convert string '\x01\x00\x00\x00x\x9C...' from binary to utf8mb4
[6 Jun 2023 14:59] Chi Zhang
SQL file used to reproduce the problem

Attachment: sql_files.7z (application/octet-stream, text), 14.82 KiB.

[6 Jun 2023 15:04] Chi Zhang
The mysql_random_data_load tool cannot be uploaded. Download it here: https://github.com/Percona-Lab/mysql_random_data_load/releases/tag/v0.1.12
[7 Jun 2023 12:55] MySQL Verification Team
Hi Mr. Zhang,

Thank you for your bug report.

However, we have to inform you about our policy of bug testing.

We only use the latest release of the version that is supported. That means that for current version, version 8.0, we only test reports on 8.0.33.

The result sets returned by previous releases are, seemingly, wrong, but that is irrelevant, since only the results from 8.0.33 are counted.

The error that you get from 8.0.33 is a valid one. You can not convert between the two character sets, as reported. Hence, you have to go back and change definition for the one of the two columns involved, by changing  the character-set appropriately.

You either change from it from binary or to binary character sets. You could try using introducers (as explained in our Manual), but we do not think it can work, since those two character sets are incompatible !!!!

This is all explained in our Reference Manual.

Not a bug.