Bug #83999 LIKE function gives wrong result for accented characters
Submitted: 29 Nov 2016 8:08 Modified: 12 Dec 2016 18:10
Reporter: Xing Zhang
Status: Closed
Category: MySQL Server: Charsets Severity: S3 (Non-critical)
Version: 8.0.1 OS: Any
Assigned to: CPU Architecture: Any

[29 Nov 2016 8:08] Xing Zhang
Description:
When LIKE is used with a UCA 9.0.0 collation on accented characters, it returns the wrong result: two values that compare equal with = do not match with LIKE.

For example,

mysql> create table t1(a char(10), b char(10)) collate utf8mb4_0900_ai_ci;
Query OK, 0 rows affected (0.36 sec)

mysql> insert into t1 values(_utf16 0x59, _utf16 0xdd);
Query OK, 1 row affected (0.05 sec)

mysql> select * from t1;
+------+------+
| a    | b    |
+------+------+
| Y    | Ý    |
+------+------+
1 row in set (0.00 sec)

mysql> select a=b from t1;
+------+
| a=b  |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

mysql> select a like b from t1;
+----------+
| a like b |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

How to repeat:
Run the statements shown in the description.

Suggested fix:
N/A

[12 Dec 2016 18:10] Paul DuBois
Posted by developer:
 
Noted in 8.0.1 changelog.

For ai_ci collations based on Unicode Collation Algorithm 9.0.0,
accented characters that compare equal were treated as different by
LIKE comparisons.
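
The changelog entry implies that, for these collations, a LIKE pattern with no wildcards should agree with the = comparison. A minimal sketch of a check along those lines follows. The scratch table name t_like_check and the extra e/é row are assumptions added for illustration and are not part of the original report.

```sql
-- Hypothetical scratch table; the name t_like_check is an assumption.
CREATE TABLE t_like_check (a CHAR(10), b CHAR(10))
  CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- First row is the pair from the report; the second pair is added
-- for illustration only.
INSERT INTO t_like_check VALUES
  (_utf16 0x0059, _utf16 0x00DD),  -- 'Y', 'Ý'
  (_utf16 0x0065, _utf16 0x00E9);  -- 'e', 'é'

-- With no wildcards in b, the eq and lk columns would be expected
-- to hold the same value in every row on a server with the fix.
SELECT a, b, a = b AS eq, a LIKE b AS lk FROM t_like_check;

DROP TABLE t_like_check;
```

On an affected server, the first row should reproduce the mismatch shown in the description (eq is 1 while lk is 0).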