Bug #88707 Need tokenize function for making a query in detail
Submitted: 30 Nov 2017 6:08
Modified: 1 Dec 2017 8:17
Reporter: Meiji Kimura
Status: Verified
Category: MySQL Server: FULLTEXT search
Severity: S4 (Feature request)
Version: 5.7, 8.0
OS: Any
Assigned to:
CPU Architecture: Any
Tags: MeCab

[30 Nov 2017 6:08] Meiji Kimura
Description:
FULLTEXT boolean mode supports various operators (e.g. +, -).

https://dev.mysql.com/doc/refman/5.7/en/fulltext-boolean.html

In English, it is easy to split a sentence (and terms) into words,
so it is easy to specify a complicated condition:

'+apple -macintosh'

But in CJK (Chinese, Japanese, Korean), it is difficult to split a sentence into words, e.g. to split 'データベース管理' into 'データベース' and '管理'.
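
The difference shows up even with plain whitespace splitting. A minimal Python sketch, for illustration only (it is not how MySQL parses FULLTEXT input):

```python
# Whitespace splitting finds word boundaries in English text,
# but unsegmented Japanese text contains no spaces to split on.
english = "apple macintosh"
japanese = "データベース管理"

print(english.split())   # two separate tokens
print(japanese.split())  # a single token: the whole string
```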

So I would like a new string function, 'TOKENIZE', to help build complicated query conditions for CJK text.
https://dev.mysql.com/doc/refman/5.7/en/string-functions.html

TOKENIZE(tokenizer, str)

 Returns the string str tokenized by tokenizer, with separator ' '.

mysql> SELECT TOKENIZE('mecab','データベース管理');
        -> 'データベース 管理'

The user can use the result of this function to build complicated conditions for a boolean query, e.g. '+データベース +管理'.
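
As a rough sketch of that last step, the Python helper below turns the space-separated string that the proposed TOKENIZE would return into a boolean-mode search string. The helper name to_boolean_query is hypothetical, TOKENIZE itself does not exist in MySQL, and the input is simply the expected result shown above.

```python
def to_boolean_query(tokenized: str, operator: str = "+") -> str:
    """Prefix each space-separated token with a boolean-mode operator.

    `tokenized` is assumed to be the output of the proposed TOKENIZE(),
    i.e. tokens joined by a ' ' separator.
    """
    tokens = tokenized.split()  # split() also tolerates repeated spaces
    return " ".join(operator + token for token in tokens)


# Expected TOKENIZE('mecab', 'データベース管理') result from the description.
query = to_boolean_query("データベース 管理")
print(query)  # +データベース +管理
```

The resulting string could then be used as the search expression of MATCH ... AGAINST (... IN BOOLEAN MODE).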

How to repeat:
See 'Description'.

Suggested fix:
Add TOKENIZE or an equivalent function.

TOKENIZE(tokenizer, str)

 Returns the string str tokenized by tokenizer, with separator ' '.

mysql> SELECT TOKENIZE('mecab','データベース管理');
        -> 'データベース 管理'
[1 Dec 2017 8:17] MySQL Verification Team
Hello Meiji-San,

Thank you for the feature request!

Thanks,
Umesh