Bug #118510 | InnoDB Match Against does not work properly with utf8mb4_0900_as_ci | ||
---|---|---|---|
Submitted: | 23 Jun 20:09 | Modified: | 1 Jul 12:42 |
Reporter: | Rudolf Run-Oracle | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: FULLTEXT search | Severity: | S2 (Serious) |
Version: | 8.0.39 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | fulltext, innodb, utf8mb4, utf8mb4_0900_as_ci |
[23 Jun 20:09]
Rudolf Run-Oracle
[1 Jul 11:51]
MySQL Verification Team
mysql [localhost:8042] {msandbox} (test) > CREATE TABLE `FT_TestTable` ( -> `ID` int NOT NULL AUTO_INCREMENT, -> `LastName` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci DEFAULT NULL, -> PRIMARY KEY (`ID`), -> FULLTEXT KEY `FT_TestTable_LastName_IDX` (`LastName`) -> ) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_as_ci; Query OK, 0 rows affected (0.45 sec) mysql [localhost:8042] {msandbox} (test) > INSERT INTO FT_TestTable (LastName) VALUES ('König'), ('Kónig'), ('Königs'), ('Königsberger'), ('John'), ('Jóhn'), ('Johnson'), ('Jóhnson') -> ; Query OK, 8 rows affected (0.01 sec) Records: 8 Duplicates: 0 Warnings: 0 mysql [localhost:8042] {msandbox} (test) > SELECT -> IF(MATCH(u.lastname) AGAINST ('+John*' IN BOOLEAN MODE) > 0, 1, 0) as 'MATCH_LastName', -> IF(u.lastname LIKE 'John%' > 0, 1, 0) as 'LIKE_lastname', -> u.lastname -> FROM FT_TestTable u WHERE -> MATCH(u.lastname) AGAINST ('+John*' IN BOOLEAN MODE) -> OR -> u.lastname LIKE 'John%'; +----------------+---------------+----------+ | MATCH_LastName | LIKE_lastname | lastname | +----------------+---------------+----------+ | 1 | 1 | John | | 0 | 1 | Johnson | +----------------+---------------+----------+ 2 rows in set (0.01 sec) mysql [localhost:8042] {msandbox} (test) > select @@version -> ; +-----------+ | @@version | +-----------+ | 8.0.42 | +-----------+ 1 row in set (0.00 sec) mysql [localhost:8042] {msandbox} (test) > drop table FT_TestTable; Query OK, 0 rows affected (0.11 sec) mysql [localhost:8042] {msandbox} (test) > CREATE TABLE `FT_TestTable` ( -> `ID` int NOT NULL AUTO_INCREMENT, -> `LastName` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL, -> PRIMARY KEY (`ID`), -> FULLTEXT KEY `FT_TestTable_LastName_IDX` (`LastName`) -> ) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci; Query OK, 0 rows affected (0.46 sec) mysql [localhost:8042] {msandbox} (test) > INSERT INTO FT_TestTable (LastName) VALUES ('König'), ('Kónig'), ('Königs'), ('Königsberger'), ('John'), ('Jóhn'), ('Johnson'), ('Jóhnson') -> ; Query OK, 8 rows affected (0.01 sec) Records: 8 Duplicates: 0 Warnings: 0 mysql [localhost:8042] {msandbox} (test) > SELECT u.lastname FROM FT_TestTable u WHERE -> IF(MATCH(u.lastname) AGAINST ('+John*' IN BOOLEAN MODE) > 0, 1, 0) as 'MATCH_LastName', -> IF(u.lastname LIKE 'John%' > 0, 1, 0) as 'LIKE_lastname', -> u.lastname -> FROM FT_TestTable u WHERE -> MATCH(u.lastname) AGAINST ('+John*' IN BOOLEAN MODE) -> OR -> u.lastname LIKE 'John%'; +----------------+---------------+----------+ | MATCH_LastName | LIKE_lastname | lastname | +----------------+---------------+----------+ | 1 | 1 | John | | 1 | 1 | Jóhn | | 1 | 1 | Johnson | | 1 | 1 | Jóhnson | +----------------+---------------+----------+ 4 rows in set (0.00 sec) mysql [localhost:8042] {msandbox} (test) >
[1 Jul 11:52]
MySQL Verification Team
Please upgrade. Thank you for using MySQL Server.
[1 Jul 12:20]
MySQL Verification Team
Server version: 9.3.0 MySQL Community Server - GPL Copyright (c) 2000, 2025, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql [localhost:9300] {msandbox} ((none)) > use test Database changed mysql [localhost:9300] {msandbox} (test) > CREATE TABLE `FT_TestTable` ( -> `ID` int NOT NULL AUTO_INCREMENT, -> `LastName` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL, -> PRIMARY KEY (`ID`), -> FULLTEXT KEY `FT_TestTable_LastName_IDX` (`LastName`) -> ) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci; Query OK, 0 rows affected (0.853 sec) mysql [localhost:9300] {msandbox} (test) > INSERT INTO FT_TestTable (LastName) VALUES ('König'), ('Kónig'), ('Königs'), ('Königsberger'), ('John'), ('Jóhn'), ('Johnson'), ('Jóhnson') -> ; Query OK, 8 rows affected (0.007 sec) Records: 8 Duplicates: 0 Warnings: 0 mysql [localhost:9300] {msandbox} (test) > SELECT FROM FT_TestTable u WHERE -> IF(MATCH(u.lastname) AGAINST ('+John*' IN BOOLEAN MODE) > 0, 1, 0) as 'MATCH_LastName', -> IF(u.lastname LIKE 'John%' > 0, 1, 0) as 'LIKE_lastname', -> u.lastname -> FROM FT_TestTable u WHERE -> MATCH(u.lastname) AGAINST ('+John*' IN BOOLEAN MODE) -> OR -> u.lastname LIKE 'John%'; +----------------+---------------+----------+ | MATCH_LastName | LIKE_lastname | lastname | +----------------+---------------+----------+ | 1 | 1 | John | | 1 | 1 | Jóhn | | 1 | 1 | Johnson | | 1 | 1 | Jóhnson | +----------------+---------------+----------+ 4 rows in set (0.001 sec) mysql [localhost:9300] {msandbox} (test) >
[1 Jul 12:42]
MySQL Verification Team
There is def. something not ideal when accent sensitive collation is used. With accent insensitive collation everything works ok but with accent sensitive I have some weird results. I will verify the bug. I suggest you upgrade non the less and see if accent insensitive collation works for you as a workaround. Thank you for the report
[1 Jul 12:43]
MySQL Verification Team
mysql [localhost:8042] {msandbox} (test) > CREATE TABLE `FT_TestTable` ( -> `ID` int NOT NULL AUTO_INCREMENT, -> `LastName` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci DEFAULT NULL, -> PRIMARY KEY (`ID`), -> FULLTEXT KEY `FT_TestTable_LastName_IDX` (`LastName`) -> ) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_as_ci; Query OK, 0 rows affected (0.44 sec) mysql [localhost:8042] {msandbox} (test) > INSERT INTO FT_TestTable (LastName) VALUES ('König'), ('Kónig'), ('Königs'), ('Königsberger'), ('John'), ('Jóhn'), ('Johnson'), ('Jóhnson'); Query OK, 8 rows affected (0.02 sec) Records: 8 Duplicates: 0 Warnings: 0 mysql [localhost:8042] {msandbox} (test) > select MATCH(LastName) AGAINST('Joh' IN BOOLEAN MODE) FROM FT_TestTable; +------------------------------------------------+ | MATCH(LastName) AGAINST('Joh' IN BOOLEAN MODE) | +------------------------------------------------+ | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | +------------------------------------------------+ 8 rows in set (0.01 sec) mysql [localhost:8042] {msandbox} (test) > select MATCH(LastName) AGAINST('Joh*' IN BOOLEAN MODE) FROM FT_TestTable; +-------------------------------------------------+ | MATCH(LastName) AGAINST('Joh*' IN BOOLEAN MODE) | +-------------------------------------------------+ | 0 | | 0 | | 0 | | 0 | | 0.8155715465545654 | | 0 | | 0 | | 0 | +-------------------------------------------------+ 8 rows in set (0.00 sec) mysql [localhost:8042] {msandbox} (test) > select LastName, MATCH(LastName) AGAINST('Joh*' IN BOOLEAN MODE) FROM FT_TestTable; +---------------+-------------------------------------------------+ | LastName | MATCH(LastName) AGAINST('Joh*' IN BOOLEAN MODE) | +---------------+-------------------------------------------------+ | König | 0 | | Kónig | 0 | | Königs | 0 | | Königsberger | 0 | | John | 0.8155715465545654 | | Jóhn | 0 | | Johnson | 0 | | Jóhnson | 0 | +---------------+-------------------------------------------------+ 8 rows in set (0.00 sec) mysql [localhost:8042] {msandbox} (test) > select LastName, MATCH(LastName) AGAINST('+Joh*' IN BOOLEAN MODE) FROM FT_TestTable; +---------------+--------------------------------------------------+ | LastName | MATCH(LastName) AGAINST('+Joh*' IN BOOLEAN MODE) | +---------------+--------------------------------------------------+ | König | 0 | | Kónig | 0 | | Königs | 0 | | Königsberger | 0 | | John | 0.8155715465545654 | | Jóhn | 0 | | Johnson | 0 | | Jóhnson | 0 | +---------------+--------------------------------------------------+ 8 rows in set (0.00 sec)
[1 Jul 12:43]
MySQL Verification Team
mysql [localhost:8042] {msandbox} (test) > INSERT INTO FT_TestTable (LastName) VALUES ('Johxxx'), ('johan'), ('johhhhhhh'); Query OK, 3 rows affected (0.03 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql [localhost:8042] {msandbox} (test) > select LastName, MATCH(LastName) AGAINST('+Joh*' IN BOOLEAN MODE) FROM FT_TestTable; +---------------+--------------------------------------------------+ | LastName | MATCH(LastName) AGAINST('+Joh*' IN BOOLEAN MODE) | +---------------+--------------------------------------------------+ | König | 0 | | Kónig | 0 | | Königs | 0 | | Königsberger | 0 | | John | 0.3184022605419159 | | Jóhn | 0 | | Johnson | 0 | | Jóhnson | 0 | | Johxxx | 0 | | johan | 0.3184022605419159 | | johhhhhhh | 0.3184022605419159 | +---------------+--------------------------------------------------+ 11 rows in set (0.00 sec) mysql [localhost:8042] {msandbox} (test) > INSERT INTO FT_TestTable (LastName) VALUES ('Johxxx'), ('Johan'), ('Johhhhhhh'); Query OK, 3 rows affected (0.01 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql [localhost:8042] {msandbox} (test) > select LastName, MATCH(LastName) AGAINST('+Joh*' IN BOOLEAN MODE) FROM FT_TestTable; +---------------+--------------------------------------------------+ | LastName | MATCH(LastName) AGAINST('+Joh*' IN BOOLEAN MODE) | +---------------+--------------------------------------------------+ | König | 0 | | Kónig | 0 | | Königs | 0 | | Königsberger | 0 | | John | 0.19995030760765076 | | Jóhn | 0 | | Johnson | 0 | | Jóhnson | 0 | | Johxxx | 0 | | johan | 0.19995030760765076 | | johhhhhhh | 0.19995030760765076 | | Johxxx | 0 | | Johan | 0.19995030760765076 | | Johhhhhhh | 0.19995030760765076 | +---------------+--------------------------------------------------+ 14 rows in set (0.00 sec) mysql [localhost:8042] {msandbox} (test) > INSERT INTO FT_TestTable (LastName) VALUES ('Johnson'), ('Johnson'); Query OK, 2 rows affected (0.14 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql [localhost:8042] {msandbox} (test) > select LastName, MATCH(LastName) AGAINST('+Joh*' IN BOOLEAN MODE) FROM FT_TestTable; +---------------+--------------------------------------------------+ | LastName | MATCH(LastName) AGAINST('+Joh*' IN BOOLEAN MODE) | +---------------+--------------------------------------------------+ | König | 0 | | Kónig | 0 | | Königs | 0 | | Königsberger | 0 | | John | 0.25517651438713074 | | Jóhn | 0 | | Johnson | 0 | | Jóhnson | 0 | | Johxxx | 0 | | johan | 0.25517651438713074 | | johhhhhhh | 0.25517651438713074 | | Johxxx | 0 | | Johan | 0.25517651438713074 | | Johhhhhhh | 0.25517651438713074 | | Johnson | 0 | | Johnson | 0 | +---------------+--------------------------------------------------+ 16 rows in set (0.05 sec) mysql [localhost:8042] {msandbox} (test) > select LastName, MATCH(LastName) AGAINST('+Joh*' IN BOOLEAN MODE) FROM FT_TestTable; +---------------+--------------------------------------------------+ | LastName | MATCH(LastName) AGAINST('+Joh*' IN BOOLEAN MODE) | +---------------+--------------------------------------------------+ | König | 0 | | Kónig | 0 | | Königs | 0 | | Königsberger | 0 | | John | 0.25517651438713074 | | Jóhn | 0 | | Johnson | 0 | | Jóhnson | 0 | | Johxxx | 0 | | johan | 0.25517651438713074 | | johhhhhhh | 0.25517651438713074 | | Johxxx | 0 | | Johan | 0.25517651438713074 | | Johhhhhhh | 0.25517651438713074 | | Johnson | 0 | | Johnson | 0 | +---------------+--------------------------------------------------+ 16 rows in set (0.18 sec) mysql [localhost:8042] {msandbox} (test) > select LastName, MATCH(LastName) AGAINST('+joh*' IN BOOLEAN MODE) FROM FT_TestTable; +---------------+--------------------------------------------------+ | LastName | MATCH(LastName) AGAINST('+joh*' IN BOOLEAN MODE) | +---------------+--------------------------------------------------+ | König | 0 | | Kónig | 0 | | Königs | 0 | | Königsberger | 0 | | John | 0.25517651438713074 | | Jóhn | 0 | | Johnson | 0 | | Jóhnson | 0 | | Johxxx | 0 | | johan | 0.25517651438713074 | | johhhhhhh | 0.25517651438713074 | | Johxxx | 0 | | Johan | 0.25517651438713074 | | Johhhhhhh | 0.25517651438713074 | | Johnson | 0 | | Johnson | 0 | +---------------+--------------------------------------------------+ 16 rows in set (0.01 sec) mysql [localhost:8042] {msandbox} (test) >