Bug #97605 | Trailing space in joins is handled differently in recent versions of MySQL 8 | ||
---|---|---|---|
Submitted: | 13 Nov 2019 6:07 | Modified: | 21 Nov 2019 7:48 |
Reporter: | Yoseph Phillips | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 8.0.18, 8.0.11 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[13 Nov 2019 6:07]
Yoseph Phillips
[13 Nov 2019 6:27]
MySQL Verification Team
Hello Yoseph Phillips, Thank you for the report and feedback. regards, Umesh
[13 Nov 2019 18:35]
Steinar Gunderson
Hi, This is documented here: https://dev.mysql.com/doc/refman/8.0/en/charset-binary-collations.html#charset-binary-coll...
[14 Nov 2019 0:01]
Yoseph Phillips
That documentation does not talk about this. The issue here is caused by the new hash join optimisation. If one does: SELECT * FROM test t1 INNER JOIN test t2 ON t2.testValue = t1.testValue; while using the hash join optimisation it gives different results than documented. If one calls SET optimizer_switch = "hash_join=off"; first than it gives the results as documented.
[20 Nov 2019 23:04]
Jon Stephens
Documented fix as follows in the MySQL 8.0.19 changelog: When creating hash values used for lookups during a hash join, the server did not respect the PAD SPACE attribute, meaning that 'foo' and 'foo ' would not match in a PAD SPACE collation. This is fixed this by padding all strings up to the same length as the longest possible string, where the longest possible string is deduced from the data type length specifier, such as N in CHAR(N) or VARCHAR(N). Closed.
[21 Nov 2019 7:48]
Yoseph Phillips
Maybe it would be better for performance and simpler to trim these values instead so that the lookup value of 'foo ' is 'foo'.