Bug #38676 | Full-Text search does not handle apostrophes (before/after a word) as described | ||
---|---|---|---|
Submitted: | 8 Aug 2008 17:38 | Modified: | 8 Aug 2008 19:16 |
Reporter: | Thomas Chin | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: FULLTEXT search | Severity: | S3 (Non-critical) |
Version: | mysql-5.0.44-r1 | OS: | Linux (Gentoo) |
Assigned to: | CPU Architecture: | Any | |
Tags: | apostrophe, fulltext |
[8 Aug 2008 17:38]
Thomas Chin
[8 Aug 2008 18:36]
Sergei Golubchik
note that foo and bar weren't found, probably, because of ft_min_word_length, decrease ft_min_word_length or try with longer words
[8 Aug 2008 18:46]
Thomas Chin
Sorry about that. Our machines are configured to have ft_min_word_length=1, so I did previous tests that was able to location foo and bar (without apostrophes) as part of the test process I did not share. From my example, it does also illustrate how the behavior mutates after a row of a certain type is inserted. Regardless, same test with foo -> apple and bar -> orange: --- mysql> CREATE TABLE test (txt TEXT); Query OK, 0 rows affected (0.06 sec) mysql> CREATE FULLTEXT INDEX txt ON test(txt); Query OK, 0 rows affected (0.01 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> INSERT INTO test VALUES('\'apple'); Query OK, 1 row affected (0.00 sec) mysql> INSERT INTO test VALUES('orange\''); Query OK, 1 row affected (0.00 sec) mysql> SELECT * FROM test; +---------+ | txt | +---------+ | 'apple | | orange' | +---------+ 2 rows in set (0.00 sec) mysql> -- expect 1 row (*** unexpected result ***) mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('apple'); Empty set (0.00 sec) mysql> -- expect 1 row (*** unexpected result ***) mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('orange'); Empty set (0.00 sec) mysql> -- expect empty mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('\'apple'); Empty set (0.00 sec) mysql> -- expect empty mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('orange\''); Empty set (0.00 sec) mysql> INSERT INTO test VALUES('\'aaa\'bbb\''); Query OK, 1 row affected (0.00 sec) mysql> -- expect 1 row mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('aaa\'bbb'); +-----------+ | txt | +-----------+ | 'aaa'bbb' | +-----------+ 1 row in set (0.00 sec) mysql> -- expect 1 row (*** behavior changes from previous attempt ***) mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('apple'); +--------+ | txt | +--------+ | 'apple | +--------+ 1 row in set (0.00 sec) mysql> -- expect empty (*** unexpected result ***) mysql> SELECT txt FROM test WHERE MATCH(txt) AGAINST('\'apple'); +--------+ | txt | +--------+ | 'apple | +--------+ 1 row in set (0.00 sec)
[8 Aug 2008 18:56]
Sergei Golubchik
Okay, I see now. It's not a bug, but how relevance ranking works when you search in natural language mode. Check the appropriate section of the manual for details, but in short - a word gets 0 relevance (and thus, is not returned) if it appears in 50% (or more) of the rows. First you had only two rows, and neither was returned. Then you added a third one, and all the three started to appear in search results. Possible workarounds are also explained in the manual.
[8 Aug 2008 19:16]
Thomas Chin
Thanks guys and sorry about that. We were conducting some testing of another issue and in trying to create a very simple test case, we encountered this behavior.