| Bug #104571 | Strings with ASCII charset and UCS2 don't match in join statement in MySQL 8 | ||
|---|---|---|---|
| Submitted: | 9 Aug 2021 9:37 | Modified: | 30 Aug 2021 17:10 |
| Reporter: | Andrey Zaytcev | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Charsets | Severity: | S2 (Serious) |
| Version: | MySQL 8, 8.0.26 | OS: | Windows |
| Assigned to: | CPU Architecture: | x86 | |
[9 Aug 2021 10:40]
Andrey Zaytcev
Change severity to S2
[9 Aug 2021 10:49]
MySQL Verification Team
Hello Andrey Zaytcev, Thank you for the report and test case. regards, Umesh
[30 Aug 2021 17:10]
Jon Stephens
Documented fix as follows in the MySQL 8.0.27 changelog:
Otherwise identical strings, using, respectively, the ASCII
(collation ascii_general_ci) and UCS2 (collation
ucs2_general_ci) character sets did not match as expected in
join conditions.
Closed.

Description: In MySQL 8 strings with ASCII charset (collate ascii_general_ci) and UCS2 (collate ucs2_general_ci) don't match in join statement. How to repeat: Here is small steps to repeat it: 1. Create two tables with varchar(64) fields and ascii (collate ascii_general_ci) and ucs2 (collate ucs2_general_ci) charsets respectively: DROP TEMPORARY TABLE IF EXISTS `__ttmp_unicode`; CREATE TEMPORARY TABLE IF NOT EXISTS `__ttmp_unicode` ( `wstrName` VARCHAR(256) CHARACTER SET ucs2 COLLATE ucs2_general_ci DEFAULT NULL) -- `wstrName` VARCHAR(256) CHARACTER SET ucs2 COLLATE ucs2_unicode_ci DEFAULT NULL) ENGINE=InnoDB; DROP TEMPORARY TABLE IF EXISTS `__ttmp_ascii`; CREATE TEMPORARY TABLE IF NOT EXISTS `__ttmp_ascii` ( `strName` VARCHAR(64) ) ENGINE=InnoDB DEFAULT CHARACTER SET `ascii` COLLATE `ascii_general_ci`; 2. Insert same values in tables: INSERT INTO `__ttmp_unicode`(wstrName) VALUES('9337a53f-cfd6-4a68-a1fc-e2d84eba8079'); INSERT INTO `__ttmp_ascii`(strName) VALUES('9337a53f-cfd6-4a68-a1fc-e2d84eba8079'); 3. Check inserted values: SELECT * FROM `__ttmp_unicode`; SELECT * FROM `__ttmp_ascii`; 4. Try to join them: SELECT `wstrName`, `strName` FROM `__ttmp_unicode` INNER JOIN `__ttmp_ascii` ON `__ttmp_unicode`.`wstrName` = `__ttmp_ascii`.`strName`; With MySQL 8 the result is empty table. But with MySQL 5.7 join statement works well and the result is non-empty table: # wstrName, strName '9337a53f-cfd6-4a68-a1fc-e2d84eba8079', '9337a53f-cfd6-4a68-a1fc-e2d84eba8079' I also find out that replacing "ucs2_general_ci" collate with "ucs2_unicode_ci" makes join works well like MySQL 5.7. You can see this by uncommenting the line "`wstrName` VARCHAR(256) CHARACTER SET ucs2 COLLATE ucs2_unicode_ci DEFAULT NULL)" in table `__ttmp_unicode` definition