Bug #74042 | Fulltext query including @ symbol fails in boolean mode on innodb | ||
---|---|---|---|
Submitted: | 23 Sep 2014 20:55 | Modified: | 22 Jan 2015 15:37 |
Reporter: | Ike Walker | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S3 (Non-critical) |
Version: | 5.6./5.7 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[23 Sep 2014 20:55]
Ike Walker
[23 Sep 2014 22:34]
MySQL Verification Team
C:\dbs>net start mysqld56 The MySQLD56 service is starting.. The MySQLD56 service was started successfully. C:\dbs>56 C:\dbs>c:\dbs\5.6\bin\mysql -uroot --port=3560 --debug-info --prompt="mysql 5.6 > " Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.6.22 Source distribution Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql 5.6 > USE test Database changed mysql 5.6 > drop table if exists at_symbol_test; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql 5.6 > create table if not exists at_symbol_test ( -> str varchar(255), -> fulltext index fti_index (str) -> ) engine = innodb; Query OK, 0 rows affected (3.49 sec) mysql 5.6 > insert into at_symbol_test values ('aaa bbb ccc'); Query OK, 1 row affected (0.06 sec) mysql 5.6 > insert into at_symbol_test values ('ccc ddd eee'); Query OK, 1 row affected (0.01 sec) mysql 5.6 > insert into at_symbol_test values ('fff ggg hhh'); Query OK, 1 row affected (0.06 sec) mysql 5.6 > insert into at_symbol_test values ('fff ggg@hhh iii'); Query OK, 1 row affected (0.03 sec) mysql 5.6 > insert into at_symbol_test values ('hello C@S world'); Query OK, 1 row affected (0.02 sec) mysql 5.6 > insert into at_symbol_test values ('twitter @mysql'); Query OK, 1 row affected (0.03 sec) mysql 5.6 > insert into at_symbol_test values ('twitter @oracle'); Query OK, 1 row affected (0.03 sec) mysql 5.6 > select * from at_symbol_test where match(str) against ('ggg@hhh' IN BOOLEAN MODE); ERROR 1064 (42000): syntax error, unexpected '@', expecting $end mysql 5.6 > select * from at_symbol_test where match(str) against ('+twitter +@m*' IN BOOLEAN MODE); ERROR 1064 (42000): syntax error, unexpected '@' mysql 5.6 > alter table at_symbol_test engine = myisam; Query OK, 7 rows affected (0.44 sec) Records: 7 Duplicates: 0 Warnings: 0 mysql 5.6 > SHOW CREATE TABLE at_symbol_test\G *************************** 1. row *************************** Table: at_symbol_test Create Table: CREATE TABLE `at_symbol_test` ( `str` varchar(255) DEFAULT NULL, FULLTEXT KEY `fti_index` (`str`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 1 row in set (0.02 sec) mysql 5.6 > select * from at_symbol_test where match(str) against ('ggg@hhh' IN BOOLEAN MODE); Empty set (0.00 sec) mysql 5.6 > select * from at_symbol_test where match(str) against ('+twitter +@m*' IN BOOLEAN MODE); +-----------------+ | str | +-----------------+ | twitter @mysql | | twitter @oracle | +-----------------+ 2 rows in set (0.01 sec) mysql 5.6 >
[23 Sep 2014 22:37]
MySQL Verification Team
mysql 5.7 > -- these queries fail mysql 5.7 > select * from at_symbol_test where match(str) against ('ggg@hhh' IN BOOLEAN MODE); ERROR 1064 (42000): syntax error, unexpected '@', expecting $end mysql 5.7 > select * from at_symbol_test where match(str) against ('+twitter +@m*' IN BOOLEAN MODE); ERROR 1064 (42000): syntax error, unexpected '@' mysql 5.7 > SHOW VARIABLES LIKE "%VERSION%"; +-------------------------+---------------------+ | Variable_name | Value | +-------------------------+---------------------+ | innodb_version | 5.7.6 | | protocol_version | 10 | | slave_type_conversions | | | version | 5.7.6-m16 | | version_comment | Source distribution | | version_compile_machine | x86_64 | | version_compile_os | Win64 | +-------------------------+---------------------+ 7 rows in set (0.00 sec) mysql 5.7 >
[23 Sep 2014 22:53]
MySQL Verification Team
Thank you for the bug report.
[23 Sep 2014 22:58]
MySQL Verification Team
Please see http://bugs.mysql.com/bug.php?id=72605.
[23 Sep 2014 23:07]
Ike Walker
Thanks for the link. I see that @ is used to search for words a certain distance apart in InnoDB: "@distance This operator works on InnoDB tables only. It tests whether two or more words all start within a specified distance from each other, measured in words. Specify the search words within a double-quoted string immediately before the @distance operator, for example, MATCH(col1) AGAINST('"word1 word2 word3" @8' IN BOOLEAN MODE)"
[23 Sep 2014 23:08]
Ike Walker
Is it possible to search for a literal "@" character using FULLTEXT search in InnoDB IN BOOLEAN MODE?
[23 Dec 2014 20:48]
Sveta Smirnova
Actually it is not reliable to search for "@" symbol with MyISAM too. mysql> drop table if exists at_symbol_test; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> create table if not exists at_symbol_test ( -> str varchar(255), -> fulltext index fti_index (str) -> ) engine = myisam; Query OK, 0 rows affected (0.01 sec) mysql> insert into at_symbol_test values ('twitter @mysql'); Query OK, 1 row affected (0.00 sec) mysql> insert into at_symbol_test values ('twitter @oracle'); Query OK, 1 row affected (0.00 sec) mysql> insert into at_symbol_test values ('twitter mysql'); Query OK, 1 row affected (0.00 sec) mysql> insert into at_symbol_test values ('twitter oracle'); Query OK, 1 row affected (0.00 sec) mysql> select * from at_symbol_test where match(str) against ('+twitter +@*' IN BOOLEAN MODE); +-----------------+ | str | +-----------------+ | twitter @mysql | | twitter @oracle | | twitter mysql | | twitter oracle | +-----------------+ 4 rows in set (0.00 sec) mysql> select * from at_symbol_test where match(str) against ('+twitter +@m*' IN BOOLEAN MODE); +-----------------+ | str | +-----------------+ | twitter @mysql | | twitter @oracle | | twitter mysql | | twitter oracle | +-----------------+ 4 rows in set (0.00 sec) You see: all rows returned, independently if they have "@" sign or not.
[22 Jan 2015 15:37]
Daniel Price
Posted by developer: Added the following statement to: http://dev.mysql.com/doc/refman/5.6/en/fulltext-boolean.html http://dev.mysql.com/doc/refman/5.7/en/fulltext-boolean.html "InnoDB full-text search does not support the use of the "@" symbol in boolean full-text searches. The "@" symbol is reserved for use by the "@distance" proximity search operator."
[3 Jan 2018 16:57]
teo teo
This is ridiculous! If some character has a special meaning, then a way MUST be provided to escape it. That is an obvious basic requirement in ANY language. And there's no way to escape a "@" in a match..against boolean query, or if there is, it's not documented. That IS a bug. This bug should be reopened.
[28 Nov 2018 14:00]
Ole Martin Handeland
I originally agreed completely with teo teo above, but after some further research I see that escaping the @ character does not make sense. The fulltext index operates on words, and (I'm guessing) will not index the @ character as it is not considered a word character. This is the default setting at least [1], and if the @ character isn't ever in the index it doesn't make sense to do a literal search for it either. I've replaced the @ character (and all other non-word characters) in our search input with spaces now, as there's no point searching for something that's not in the index. This will of course look strange when searching for email@example.org and getting results for something that contained those words, but not in that order (as the search string will be converted to "email example org"), but quoting the search string will produce results that look more like what the user is looking for.
[28 Nov 2018 23:37]
teo teo
> I see that escaping the @ character does not make sense. That's not true. It may not yield results, but that doesn't mean it doesn't make sense. > The fulltext index operates on words, and (I'm guessing) will not index the @ character as it is not considered a word character. This is the default setting at least [1], That default can be changed. https://dev.mysql.com/doc/refman/8.0/en/fulltext-fine-tuning.html#fulltext-modify-characte... The set of characters that are considered word characters can be changed. And if "@" can't be added, then that's another bug (or wrong design decision). You should be allowed to have "@" be considered a word character, and hence to index words containing it, and hence to search for words containing it, by escaping it. > and if the @ character isn't ever in the index it doesn't make sense to do a literal search for it either And even in that case, you should be allowed to escape it and get a valid query that doesn't get any result but doesn't produce a syntax error either.