Bug #111268 | Different character_set_connection returns inconsistent results | ||
---|---|---|---|
Submitted: | 4 Jun 2023 17:21 | Modified: | 5 Jun 2023 8:47 |
Reporter: | Xuefeng Zhang | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) |
Version: | 8.0.25, 8.0.33 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
Tags: | Inconsistent Results |
[4 Jun 2023 17:21]
Xuefeng Zhang
[5 Jun 2023 8:47]
MySQL Verification Team
Hello Xuefeng Zhang,

Thank you for the report and test case.

Regards,
Umesh
[6 Jun 2023 11:00]
Bernt Marius Johnsen
I am only able to reproduce the erroneous behavior when the column collates as utf8mb4_bin:

    create table t1 (i integer, v varchar(10) collate utf8mb4_bin);
    insert into t1 values (1, 'abc'), (2, 'abc ');

    set character_set_connection = utf8mb3;
    select * from t1 where v = 'abc' and v = 'abc ';

    set character_set_connection = utf8mb4;
    select * from t1 where v = 'abc' and v = 'abc ';

The other collations I have tried give consistent behavior that seems to be independent of collation_connection: 2 rows for PAD SPACE collations and zero rows for NO PAD collations.

=========================================================================

Note that the following is correct behavior. There is no context to determine the collation of the literals, so MySQL uses collation_connection when the literals are compared. utf8mb3 implies utf8mb3_general_ci, which is PAD SPACE, while utf8mb4 implies utf8mb4_0900_ai_ci, which is NO PAD.

    mysql> set character_set_connection = utf8mb3;
    Query OK, 0 rows affected, 1 warning (0.00 sec)

    mysql> select 'abc' = 'abc ';
    +----------------+
    | 'abc' = 'abc ' |
    +----------------+
    |              1 |
    +----------------+
    1 row in set (0.00 sec)

    mysql> set character_set_connection = utf8mb4;
    Query OK, 0 rows affected (0.00 sec)

    mysql> select 'abc' = 'abc ';
    +----------------+
    | 'abc' = 'abc ' |
    +----------------+
    |              0 |
    +----------------+
    1 row in set (0.00 sec)
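
The PAD SPACE / NO PAD distinction described above can be checked directly. The following is only a sketch and not part of the original comment. It assumes a MySQL 8.0 server. The choice of utf8mb4_general_ci as the PAD SPACE example and the column aliases are illustrative. The first query reads the pad attribute from INFORMATION_SCHEMA.COLLATIONS; depending on the server version, the utf8mb3 collation may be listed as utf8_general_ci instead, so both names are included. The second query uses the _utf8mb4 introducer and an explicit COLLATE clause so the comparison should not depend on character_set_connection.

```sql
-- Pad attribute of the collations discussed in this report.
SELECT COLLATION_NAME, PAD_ATTRIBUTE
  FROM INFORMATION_SCHEMA.COLLATIONS
 WHERE COLLATION_NAME IN ('utf8mb4_bin',
                          'utf8mb4_0900_ai_ci',
                          'utf8mb4_general_ci',
                          'utf8mb3_general_ci',
                          'utf8_general_ci');

-- Literal comparison with the collation stated explicitly, so the
-- connection collation plays no part in the result.
SELECT _utf8mb4'abc' COLLATE utf8mb4_general_ci = _utf8mb4'abc ' AS pad_space_cmp,
       _utf8mb4'abc' COLLATE utf8mb4_0900_ai_ci = _utf8mb4'abc ' AS no_pad_cmp;
```

Under the semantics described in the comment, pad_space_cmp would be expected to return 1 and no_pad_cmp to return 0, regardless of the current character_set_connection setting.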