| Bug #38676 | Full-Text search does not handle apostrophes (before/after a word) as described | ||
|---|---|---|---|
| Submitted: | 8 Aug 2008 17:38 | Modified: | 8 Aug 2008 19:16 |
| Reporter: | Thomas Chin | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server: FULLTEXT search | Severity: | S3 (Non-critical) |
| Version: | mysql-5.0.44-r1 | OS: | Linux (Gentoo) |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | apostrophe, fulltext | ||
[8 Aug 2008 17:38]
Thomas Chin
[8 Aug 2008 18:36]
Sergei Golubchik
note that foo and bar weren't found, probably, because of ft_min_word_length, decrease ft_min_word_length or try with longer words
[8 Aug 2008 18:46]
Thomas Chin
Sorry about that. Our machines are configured to have ft_min_word_length=1, so I did previous tests that was able to location foo and bar (without apostrophes) as part of the test process I did not share. From my example, it does also illustrate how the behavior mutates after a row of a certain type is inserted.
Regardless, same test with foo -> apple and bar -> orange:
---
mysql> CREATE TABLE test (txt TEXT);
Query OK, 0 rows affected (0.06 sec)
mysql> CREATE FULLTEXT INDEX txt ON test(txt);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> INSERT INTO test VALUES('\'apple');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO test VALUES('orange\'');
Query OK, 1 row affected (0.00 sec)
mysql> SELECT * FROM test;
+---------+
| txt |
+---------+
| 'apple |
| orange' |
+---------+
2 rows in set (0.00 sec)
mysql> -- expect 1 row (*** unexpected result ***)
mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('apple');
Empty set (0.00 sec)
mysql> -- expect 1 row (*** unexpected result ***)
mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('orange');
Empty set (0.00 sec)
mysql> -- expect empty
mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('\'apple');
Empty set (0.00 sec)
mysql> -- expect empty
mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('orange\'');
Empty set (0.00 sec)
mysql> INSERT INTO test VALUES('\'aaa\'bbb\'');
Query OK, 1 row affected (0.00 sec)
mysql> -- expect 1 row
mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('aaa\'bbb');
+-----------+
| txt |
+-----------+
| 'aaa'bbb' |
+-----------+
1 row in set (0.00 sec)
mysql> -- expect 1 row (*** behavior changes from previous attempt ***)
mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('apple');
+--------+
| txt |
+--------+
| 'apple |
+--------+
1 row in set (0.00 sec)
mysql> -- expect empty (*** unexpected result ***)
mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('\'apple');
+--------+
| txt |
+--------+
| 'apple |
+--------+
1 row in set (0.00 sec)
[8 Aug 2008 18:56]
Sergei Golubchik
Okay, I see now. It's not a bug, but how relevance ranking works when you search in natural language mode. Check the appropriate section of the manual for details, but in short - a word gets 0 relevance (and thus, is not returned) if it appears in 50% (or more) of the rows. First you had only two rows, and neither was returned. Then you added a third one, and all the three started to appear in search results. Possible workarounds are also explained in the manual.
[8 Aug 2008 19:16]
Thomas Chin
Thanks guys and sorry about that. We were conducting some testing of another issue and in trying to create a very simple test case, we encountered this behavior.
