Bug #94640 InnoDB ngram parser fulltext search yield wrong result
Submitted: 13 Mar 2019 10:13 Modified: 15 Mar 2019 7:02
Reporter: Seunguck Lee Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: FULLTEXT search Severity:S3 (Non-critical)
Version:8.0.15, 5.7.25 OS:CentOS (CentOS Linux release 7.0.1406 (Core))
Assigned to: CPU Architecture:x86 (Linux mysql_test 3.10.0-229.14.1.el7.centos.plus.x86_64)

[13 Mar 2019 10:13] Seunguck Lee
Description:
MySQL ngram fulltext search from innodb yields wrong result.

CREATE TABLE fts_test(
  code varchar(20) NOT NULL,
  name varchar(80) NOT NULL,
  PRIMARY KEY (code),
  FULLTEXT KEY fx_name(name) WITH PARSER ngram
) ENGINE=InnoDB;

These two query yield correct result.
  sql> select * from fts_test where match(name) against ('YEL' in boolean mode);
  sql> select * from fts_test where match(name) against ('WEL' in boolean mode);

But this query yield wrong result. (Only the difference is first character of search keyword, 'Y' 'W' ==> 'D')
  sql> select * from fts_test where match(name) against ('WEL' in boolean mode);

How to repeat:
CREATE TABLE fts_test(
  code varchar(20) NOT NULL,
  name varchar(80) NOT NULL,
  PRIMARY KEY (code),
  FULLTEXT KEY fx_name(name) WITH PARSER ngram
) ENGINE=InnoDB;

insert into fts_test values ('WEL','WELLINGTON');
insert into fts_test values ('WFD','WHITEFIELD');
insert into fts_test values ('YGL','YELGUR');
insert into fts_test values ('YNK','YELHANKA JN');
insert into fts_test values ('DLI','DELHI');
insert into fts_test values ('DAZ','DELHI AZADPUR');

select * from fts_test where match(name) against ('DEL' in boolean mode);
+------+---------------+
| code | name          |
+------+---------------+
| WEL  | WELLINGTON    |
| WFD  | WHITEFIELD    |
| YGL  | YELGUR        |
| YNK  | YELHANKA JN   |
| DLI  | DELHI         |
| DAZ  | DELHI AZADPUR |
+------+---------------+
6 rows in set (0.00 sec)

select * from fts_test where match(name) against ('YEL' in boolean mode);
+------+-------------+
| code | name        |
+------+-------------+
| YGL  | YELGUR      |
| YNK  | YELHANKA JN |
+------+-------------+
2 rows in set (0.01 sec)

select * from fts_test where match(name) against ('WEL' in boolean mode);
+------+------------+
| code | name       |
+------+------------+
| WEL  | WELLINGTON |
+------+------------+
1 row in set (0.00 sec)
[13 Mar 2019 12:57] MySQL Verification Team
Hello Seunguck,

Thank you for the report and test case.
Observed this with 5.7.25 build.

regards,
Umesh
[13 Mar 2019 12:58] MySQL Verification Team
5.7.25 - test results

Attachment: 94640.results (application/octet-stream, text), 11.87 KiB.

[15 Mar 2019 6:41] Seunguck Lee
Sorry Umesh.

It was because of DEFAULT stopword table.
Default stopword table has "de". And this stop word remove "de" token of bi-gram parsed tokens("de" and "el") from "del"

Just ignore this bug request.

mysql> select * from information_schema.INNODB_FT_DEFAULT_STOPWORD;
+-------+
| value |
+-------+
| a     |
| about |
| an    |
| are   |
| as    |
| at    |
| be    |
| by    |
| com   |
| de    |
| en    |
| for   |
| from  |
| how   |
| i     |
| in    |
| is    |
| it    |
| la    |
| of    |
| on    |
| or    |
| that  |
| the   |
| this  |
| to    |
| was   |
| what  |
| when  |
| where |
| who   |
| will  |
| with  |
| und   |
| the   |
| www   |
+-------+
36 rows in set (0.00 sec)
[15 Mar 2019 7:02] MySQL Verification Team
Thank you for confirming, I'll mark this as duplicate of Bug#84420.