Bug #14194 | Problem with fulltext boolean search and apostrophe | ||
---|---|---|---|
Submitted: | 20 Oct 2005 21:09 | Modified: | 17 Jan 2006 23:49 |
Reporter: | Jean-Jacques ANDRE | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 4.1.11 | OS: | Windows (Windows) |
Assigned to: | Sergey Vojtovich | CPU Architecture: | Any |
[20 Oct 2005 21:09]
Jean-Jacques ANDRE
[20 Oct 2005 21:50]
MySQL Verification Team
Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://www.mysql.com/documentation/ and the instructions on how to report a bug at http://bugs.mysql.com/how-to-report.php Additional info: http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html The MySQL FULLTEXT implementation regards any sequence of true word characters (letters, digits, and underscores) as a word. That sequence may also contain apostrophes ('), but not more than one in a row. This means that aaa'bbb is regarded as one word, but aaa''bbb is regarded as two words. Apostrophes at the beginning or the end of a word are stripped by the FULLTEXT parser; 'aaa'bbb' would be parsed as aaa'bbb. So l'ouragan is regarded as one word and not just ouragan.
[21 Oct 2005 6:43]
Jean-Jacques ANDRE
I definitely think that it is a BUG from a linguistic standpoint. Take the sentence : "I smashed Jerry's car" , you will not be able to find "Jerry" in the text unless you KNOW that it followed by the apostrophe. I do not see the point of looking at that as a feature; fulltext search becomes almost useless in languages making heavy use of the apostrophe (Hey I am french!). Every french word that starts with a vowel is a candidate for not being searchable. I found tons of forum messages on the web of puzzled italians, french,etc... Even in english, it is a problem, come on! Solution: If people want to search -Jerry's- in the text, they should enclose it in double-quotes as stated in the doc -> "Jerry's". Doesn't that seem logical?
[21 Oct 2005 11:55]
Sergei Golubchik
This change cannot be done in 4.1, but we'll consider doing it in 5.0
[22 Oct 2005 10:11]
Jean-Jacques ANDRE
Fixing that in 5.0 would be great and a good reason to upgrade to that version for a number of latin languages developers
[13 Jan 2006 17:40]
Sergey Vojtovich
This bug cannot be fixed in 5.0, as 5.0 is now stable. Will be fixed in 5.1.
[13 Jan 2006 19:34]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/1056
[16 Jan 2006 19:37]
Sergey Vojtovich
Fixed in 5.1.6.
[16 Jan 2006 22:17]
Mike Hillyer
Added to 5.1.6 changelog: <listitem> <para> Words with apostrophes are now matched in a FULLTEXT search against non-apostrophe words (i.e. a search for <literal>Jerry</literal> will match against the term <literal>Jerry's</literal>). (Bug #14194) </para> </listitem>
[17 Jan 2006 10:51]
Jean-Jacques ANDRE
Thanks guys for the 5.1.6 fix! One more reason to upgrade. I think not only me will appreciate that. JJ
[17 Jan 2006 10:57]
Sergey Vojtovich
Mike, I'm sorry, but I forgot one thing that deserves to be mentioned in changelog. Upgrade to version, that include this changeset, will most likely result in broken fulltext index, since parsing rules has been changed. REPAIR TABLE statement should be issued for tables having fulltext indexes. jj a: you're always welcome! :)
[17 Jan 2006 23:49]
Mike Hillyer
REPAIR TABLE warning added to changelog entry.