Bug #80432 | No results in fulltext search for top level domain in domain part of email | ||
---|---|---|---|
Submitted: | 18 Feb 2016 15:58 | Modified: | 16 Mar 2016 6:15 |
Reporter: | Mev-Rael | Email Updates: | |
Status: | Won't fix | Impact on me: | |
Category: | MySQL Server: FULLTEXT search | Severity: | S1 (Critical) |
Version: | 5.7.10 | OS: | Windows (InnoDB, utf8mb4_unicode_ci) |
Assigned to: | CPU Architecture: | Any | |
Tags: | domain, email, escape, fulltext, statement |
[18 Feb 2016 15:58]
Mev-Rael
[18 Feb 2016 19:04]
MySQL Verification Team
Thank you for the bug report. C:\dbs>c:\dbs\5.7\bin\mysql -uroot -p --port=3570 --prompt="mysql 5.7 > " Enter password: ****** Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 4 Server version: 5.7.12 Source distribution PULL: 2016-FEB-06 Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql 5.7 > use test Database changed mysql 5.7 > CREATE TABLE `users` ( -> `id` int(10) unsigned NOT NULL AUTO_INCREMENT, -> `first_name` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL, -> `last_name` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL, -> `email` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL, -> PRIMARY KEY (`id`), -> UNIQUE KEY `users_email_unique` (`email`), -> FULLTEXT KEY `first_last_name_email_fulltext` (`email`,`last_name`,`first_name`) -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; Query OK, 0 rows affected (2.30 sec) mysql 5.7 > mysql 5.7 > INSERT INTO `users` (`first_name`, `last_name`, `email`) VALUES ('John', 'Smith', 'foo.bar@domain.com'); Query OK, 1 row affected (0.03 sec) mysql 5.7 > select * from `users` where MATCH (email, last_name, first_name) AGAINST ('+com*' IN BOOLEAN MODE); Empty set (0.09 sec) mysql 5.7 > select * from `users` where MATCH (email, last_name, first_name) AGAINST ('+bar*' IN BOOLEAN MODE); +----+------------+-----------+--------------------+ | id | first_name | last_name | email | +----+------------+-----------+--------------------+ | 1 | John | Smith | foo.bar@domain.com | +----+------------+-----------+--------------------+ 1 row in set (0.00 sec) mysql 5.7 > select * from `users` where MATCH (email, last_name, first_name) AGAINST ('+domain*' IN BOOLEAN MODE); +----+------------+-----------+--------------------+ | id | first_name | last_name | email | +----+------------+-----------+--------------------+ | 1 | John | Smith | foo.bar@domain.com | +----+------------+-----------+--------------------+ 1 row in set (0.00 sec) mysql 5.7 > select * from `users` where MATCH (email, last_name, first_name) AGAINST ('+domain.co*' IN BOOLEAN MODE); +----+------------+-----------+--------------------+ | id | first_name | last_name | email | +----+------------+-----------+--------------------+ | 1 | John | Smith | foo.bar@domain.com | +----+------------+-----------+--------------------+ 1 row in set (0.00 sec) mysql 5.7 > select * from `users` where MATCH (email, last_name, first_name) AGAINST ('+domain.com*' IN BOOLEAN MODE); Empty set (0.00 sec) mysql 5.7 > alter table users engine MyISAM; Query OK, 1 row affected (0.25 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql 5.7 > select * from `users` where MATCH (email, last_name, first_name) AGAINST ('+domain.com*' IN BOOLEAN MODE); +----+------------+-----------+--------------------+ | id | first_name | last_name | email | +----+------------+-----------+--------------------+ | 1 | John | Smith | foo.bar@domain.com | +----+------------+-----------+--------------------+ 1 row in set (0.02 sec) mysql 5.7 > show create table users; +-------+------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------+ | users | CREATE TABLE `users` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `first_name` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL, `last_name` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL, `email` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `users_email_unique` (`email`), FULLTEXT KEY `first_last_name_email_fulltext` (`email`,`last_name`,`first_name`) ) ENGINE=MyISAM AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci | +-------+------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.03 sec) mysql 5.7 >
[4 Mar 2016 11:35]
ADITYA ANANTHAPADMANABHA
Posted by developer: Actually this is not a bug because "com" word is present in the stop word list .Please refer to http://dev.mysql.com/doc/refman/5.7/en/fulltext-stopwords.html I ran the test with removing "com" as one of the stop words. mysql> select * from `users` where MATCH (email, last_name, first_name) AGAINST ('+com*' IN BOOLEAN MODE); +----+------------+-----------+--------------------+ | id | first_name | last_name | email | +----+------------+-----------+--------------------+ | 1 | John | Smith | foo.bar@domain.com | +----+------------+-----------+--------------------+ 1 row in set (4.76 sec) mysql> select * from `users` where MATCH (email, last_name, first_name) AGAINST -> ('+bar*' IN BOOLEAN MODE); +----+------------+-----------+--------------------+ | id | first_name | last_name | email | +----+------------+-----------+--------------------+ | 1 | John | Smith | foo.bar@domain.com | +----+------------+-----------+--------------------+ 1 row in set (3.15 sec) mysql> select * from `users` where MATCH (email, last_name, first_name) AGAINST -> ('+domain*' IN BOOLEAN MODE); +----+------------+-----------+--------------------+ | id | first_name | last_name | email | +----+------------+-----------+--------------------+ | 1 | John | Smith | foo.bar@domain.com | +----+------------+-----------+--------------------+ mysql> select * from `users` where MATCH (email, last_name, first_name) AGAINST -> ('+domain.co*' IN BOOLEAN MODE); +----+------------+-----------+--------------------+ | id | first_name | last_name | email | +----+------------+-----------+--------------------+ | 1 | John | Smith | foo.bar@domain.com | +----+------------+-----------+--------------------+ mysql> select * from `users` where MATCH (email, last_name, first_name) AGAINST -> ('+domain.com*' IN BOOLEAN MODE); +----+------------+-----------+--------------------+ | id | first_name | last_name | email | +----+------------+-----------+--------------------+ | 1 | John | Smith | foo.bar@domain.com | +----+------------+-----------+--------------------+ 1 row in set (5.17 sec) After removing from the stop word list "domain" and "com" are treated like different words. Referring to your other question on w hy we cannot use "@" in searching pattern please refer http://clustra.no.oracle.com/orabugs/bug.php?id=19673945 In light of these i am closing this as "not a bug".
[8 Mar 2016 11:37]
ADITYA ANANTHAPADMANABHA
Posted by developer: the external bug number for above posted internal bug id is http://bugs.mysql.com/bug.php?id=74042
[9 Mar 2016 20:36]
Mev-Rael
I see, it's realy stopwords, however, for small data like users table I would like to remove stopwords and keep for text containing tables. Is it possible to set different stopwords for different tables or any sql query flag/option to ignore stop words only in specifig queries? Something like SELECT SQL_CACHE id, name FROM customer?
[16 Mar 2016 6:15]
Erlend Dahl
Posted by developer: [15 Mar 2016 22:27] Aditya A Though you cannot specify in the query to ignore stopwords , there is a easy workaround if you want different stop words for different tables. The stop word table is taken into consideration while creating the fts index,so in case of your example set innodb_ft_server_stopword_table = 'test/stop_word_table_for_users' and then create user table and then set innodb_ft_server_stopword_table = 'test/stop_word_table_for_text_tables' and then create text tables. once created they no longer depend upon where innodb_ft_server_stopword_table is pointing to ,they always follow the stop word table they where created with This will survive a restart too