Bug #99413 | Constant propagation get the wrong result when mix with the different collations | ||
---|---|---|---|
Submitted: | 30 Apr 2020 17:13 | Modified: | 6 May 2020 13:30 |
Reporter: | wj huang | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 8.0.19 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[30 Apr 2020 17:13]
wj huang
[5 May 2020 13:19]
MySQL Verification Team
Hello Mr. huang, Thank you for your bug report. However, I am in doubt that this is truly a bug. You are mixing here collations with different weights and in expressions of different coercibilities. You simply can not expect that different collations return the same result, in different expressions. When you make small changes in your SELECT queries, then these problem disappear. Like with these queries : select * from t t1, t t2 where t1.a= t2.b collate utf8mb4_bin and t2.b='a' collate utf8mb4_0900_ai_ci ; select * from t t1, t t2 where t1.a= t2.b collate utf8mb4_bin; Let us know whether you understood the point.
[5 May 2020 14:11]
wj huang
Hi, Sinisa Milivojevic. I do not quite understand the point here, would you mean that the behaviour is undefined? I have to admit that the behaviour is quite confusing, adding a condition into the query will return extra row.
[5 May 2020 15:07]
MySQL Verification Team
Hi Mr. huang, No, the behaviour is quite defined. But, behaviour is not defined by constant propagation itself, but more by different weights and coercibilities of the collations, as used here in different expressions. This is why my queries give consistent results in both cases.
[5 May 2020 15:40]
wj huang
Hi, Sinisa Milivojevic. Could you explain how the result is defined by different weights and coercibilities of the collations? I tried to change the utf8mb4_0900_ai_ci to utf8mb4_general_ci: create table t (a char(10) collate utf8mb4_bin, b char(10) collate utf8mb4_general_ci); insert into t values ('a', 'A'); set names utf8mb4 collate utf8mb4_general_ci; mysql> select * from t t1, t t2 where t1.a=t2.b and t2.b='a'; Empty set (0.00 sec) I also tried utf8mb4_da_0900_ai_ci, and got empty result too. Why the utf8mb4_0900_ai_ci is different?
[6 May 2020 12:27]
MySQL Verification Team
Hello Mr. huang, We had a short discussion with Development on your report and it is our conclusion that this is not a bug. This is the justification: t2.b='a' is always true, so the query may be simplified to: select * from t t1, t t2 where t1.a=t2.b; Empty set (0,00 sec) And since t1.a = t1.b is done in utf8mb4_bin, the result is correct: select a,b,a=b from t; +------+------+------+ | a | b | a=b | +------+------+------+ | a | A | 0 | +------+------+------+ 1 row in set (0,00 sec) If you add a collate statement, you get what is expected: select * from t t1, t t2 where t1.a collate utf8mb4_0900_ai_ci=t2.b and t2.b='a'; +------+------+------+------+ | a | b | a | b | +------+------+------+------+ | a | A | a | A | +------+------+------+------+ 1 row in set (0,00 sec) Regarding your other questions regarding Unicode character sets and different between standard collations, please read our Reference Manual and some textbook on Unicode encodings, character sets, collations, weights and coercibilities. Not a bug.
[6 May 2020 12:47]
wj huang
Hi, Sinisa Milivojevic. Thanks for your justification, and I really agree with your justification. Since t2.b='a' is always true, `select * from t t1, t t2 where t1.a=t2.b and t2.b='a';` can be simplified to `select * from t t1, t t2 where t1.a=t2.b;` which returns empty. However, What I got is one row, not empty, I think this is not consistent with the justification. What I expect is empty, not one row.
[6 May 2020 12:54]
MySQL Verification Team
Hi Mr. Huang, The difference is that we tested the results with current, internal release of our server, which contains patches for 8.0.21 and 8.0.22. Hence, the confusion. In short, you will see different results with one of those two future releases. We do not know when will either be out, since 8.0.20 has just been released.
[6 May 2020 13:30]
wj huang
Hi, Sinisa Milivojevic. That makes sense, I tested it in 8.0.19 and 8.0.20, both of these versions returned one row. I hope I will see the correct result in the next release. Thanks for your reply.
[6 May 2020 13:59]
MySQL Verification Team
You are truly welcome.