Description:
The ngram and MeCab full-text parsers, which were introduced in 5.7.6-m16, are described as "InnoDB Full-Text Parser" in the documentation and release notes.
MySQL :: MySQL 5.7 Reference Manual :: 12.9.9 InnoDB MeCab Full-Text Parser Plugin https://dev.mysql.com/doc/refman/5.7/en/fulltext-search-mecab.html
MySQL :: MySQL 5.7 Reference Manual :: 12.9.8 InnoDB ngram Full-Text Parser https://dev.mysql.com/doc/refman/5.7/en/fulltext-search-ngram.html
MySQL :: MySQL 5.7 Release Notes :: Changes in MySQL 5.7.6 (2015-03-09, Milestone 16) http://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-6.html
However, both parsers can also be specified for a MyISAM table.
Is this a bug, or is the documentation incomplete?
How to repeat:
* ngram
mysql57> create table t1 (num int, val varchar(32)) Engine = MyISAM;
Query OK, 0 rows affected (0.02 sec)
mysql57> INSERT INTO t1 VALUES (1, 'one two three, いちにいさん');
Query OK, 1 row affected (0.01 sec)
mysql57> SELECT @@ngram_token_size;
+--------------------+
| @@ngram_token_size |
+--------------------+
|                  2 |
+--------------------+
1 row in set (0.00 sec)
mysql57> ALTER TABLE t1 ADD FULLTEXT KEY (val) WITH PARSER ngram;
Query OK, 1 row affected (0.02 sec)
Records: 1 Duplicates: 0 Warnings: 0
$ /usr/mysql/5.7.6/bin/myisam_ftdump -d t1 0
0 0.8613265 e,
0 0.8613265 ee
0 0.8613265 hr
0 0.8613265 ne
0 0.8613265 on
0 0.8613265 re
0 0.8613265 th
0 0.8613265 tw
0 0.8613265 wo
0 0.8613265 いさ
0 0.8613265 いち
0 0.8613265 さん
0 0.8613265 ちに
0 0.8613265 にい
The value is tokenized correctly by the 2-gram parser.
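To make the expected token list easy to check, here is a minimal Python sketch of 2-gram tokenization. It is not MySQL's actual parser code. It assumes that whitespace is the only delimiter (the dump contains the token "e,", so the comma does not appear to split tokens) and that sorting by code point gives the same order as the dump for this input. The function name ngram_tokens is hypothetical.

```python
def ngram_tokens(text, n=2):
    """Return the sorted, de-duplicated n-grams of each whitespace-separated chunk.

    Simplified illustration only: assumes whitespace is the sole delimiter,
    which is inferred from the myisam_ftdump output, not from MySQL source.
    """
    tokens = set()
    for chunk in text.split():
        # Slide a window of n characters over the chunk.
        for i in range(len(chunk) - n + 1):
            tokens.add(chunk[i:i + n])
    return sorted(tokens)


tokens = ngram_tokens('one two three, いちにいさん')
for token in tokens:
    print(token)
```

For the test row this yields the same 14 tokens, in the same order, as the myisam_ftdump output above.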
* MeCab
mysql57> create table t1 (num int, val varchar(32)) Engine = MyISAM;
Query OK, 0 rows affected (0.02 sec)
mysql57> INSERT INTO t1 VALUES (1, 'one two three, いちにいさん');
Query OK, 1 row affected (0.01 sec)
mysql57> ALTER TABLE t1 ADD FULLTEXT KEY(val) WITH PARSER mecab;
Query OK, 1 row affected (0.16 sec)
Records: 1 Duplicates: 0 Warnings: 0
$ /usr/mysql/5.7.6/bin/myisam_ftdump -d t1 0
0 1.4258175
0 0.8421108 ,
0 0.8421108 one
0 0.8421108 three
0 0.8421108 two
0 0.8421108 いち
0 0.8421108 にいさん
$ echo 'one two three, いちにいさん' | mecab
one 名詞,固有名詞,組織,*,*,*,*
two 名詞,一般,*,*,*,*,*
three 名詞,一般,*,*,*,*,*
, 名詞,サ変接続,*,*,*,*,*
いち 名詞,一般,*,*,*,*,いち,イチ,イチ
にいさん 名詞,一般,*,*,*,*,にいさん,ニイサン,ニーサン
EOS
MeCab parses the value correctly: the words in the myisam_ftdump output match the output of the mecab command.
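The comparison can be sketched in Python by extracting the surface form (first field) from each mecab line and the word (third field) from each myisam_ftdump line. The helper names are hypothetical, and the two outputs are pasted from this report rather than captured by running the tools. The dump entry with weight 1.4258175 has no visible word, so the sketch skips it. What that entry represents is not known here.

```python
MECAB_OUTPUT = """\
one 名詞,固有名詞,組織,*,*,*,*
two 名詞,一般,*,*,*,*,*
three 名詞,一般,*,*,*,*,*
, 名詞,サ変接続,*,*,*,*,*
いち 名詞,一般,*,*,*,*,いち,イチ,イチ
にいさん 名詞,一般,*,*,*,*,にいさん,ニイサン,ニーサン
EOS
"""

FTDUMP_OUTPUT = """\
0 1.4258175
0 0.8421108 ,
0 0.8421108 one
0 0.8421108 three
0 0.8421108 two
0 0.8421108 いち
0 0.8421108 にいさん
"""


def mecab_surfaces(output):
    """Collect the first field of each mecab line up to the EOS marker."""
    surfaces = []
    for line in output.splitlines():
        if line.strip() == "EOS":
            break
        if line.strip():
            surfaces.append(line.split(None, 1)[0])
    return sorted(surfaces)


def ftdump_words(output):
    """Collect the word column of each dump line, skipping lines with no word."""
    words = []
    for line in output.splitlines():
        fields = line.split(None, 2)
        if len(fields) == 3:
            words.append(fields[2].strip())
    return sorted(words)


same = mecab_surfaces(MECAB_OUTPUT) == ftdump_words(FTDUMP_OUTPUT)
print(same)
```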
Suggested fix:
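Determine whether this is a bug or a documentation gap. If the behavior is intended, the documentation should mention that these parsers also work with MyISAM tables.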