Bug #1506 ' is not seen as a spacing character in a full text search
Submitted: 9 Oct 2003 4:47 Modified: 12 Oct 2003 11:03
Reporter: Patrick LIENART
Status: Not a Bug
Category: MySQL Server: MyISAM storage engine Severity: S3 (Non-critical)
Version: 4.0.11a OS: Linux (Linux 2.4.21-0.13mdk)
Assigned to: CPU Architecture: Any

[9 Oct 2003 4:47] Patrick LIENART
Description:
A full-text search with the criterion "AGAINST('huile')" does not return records containing the words "l'huile" or "d'huile". The apostrophe "'" should be treated as a spacing character (and it is configured as one in the ctype array of the Latin1 character set in use).

How to repeat:
CREATE TABLE test (text TEXT NOT NULL, FULLTEXT (text));
INSERT INTO test VALUES ("huile"),("cuisine à l'huile"),("bouteille d'huile");
SELECT * FROM test where MATCH(text) AGAINST("huile");

Result : 
+-------+
| text  |
+-------+
| huile |
+-------+
1 row in set (0.00 sec)

Suggested fix:
Modify full-text index generation to honor the spacing characters defined in ctype.
[12 Oct 2003 11:03] Sergei Golubchik
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.mysql.com/documentation/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

according to the manual:

     A "word" is any sequence of characters consisting of letters, digits, "'", and "_"

Thus, it's not a bug but a deliberate design decision.
You can change it, though, by modifying the misc_word_char() macro in the ft_parser.c file and recompiling MySQL.
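
To illustrate the word rule quoted above, here is a minimal Python sketch of the splitting behaviour. It is not the code in ft_parser.c: the names tokenize(), search() and the misc_word_chars parameter are hypothetical, chosen only to mirror the role of misc_word_char(). It also ignores everything else the full-text engine does, such as minimum word length, stopword lists and relevance weighting.

```python
import re

def tokenize(text, misc_word_chars="'"):
    """Split text into lowercase words.

    A word is a run of letters, digits and "_" (matched by \\w), plus any
    character listed in misc_word_chars (hypothetical parameter).
    """
    pattern = re.compile(r"[\w" + re.escape(misc_word_chars) + r"]+")
    return [word.lower() for word in pattern.findall(text)]

def search(term, docs, misc_word_chars="'"):
    """Return the documents that contain term as a whole word."""
    return [d for d in docs if term in tokenize(d, misc_word_chars)]

docs = ["huile", "cuisine à l'huile", "bouteille d'huile"]

# With "'" treated as a word character, "l'huile" is one single word,
# so a search for "huile" only finds the first row.
print(tokenize(docs[1]))   # ['cuisine', 'à', "l'huile"]
print(search("huile", docs))   # ['huile']

# With "'" treated as a separator, all three rows contain the word "huile".
print(search("huile", docs, misc_word_chars=""))
```

Passing an empty misc_word_chars corresponds, in this sketch only, to the effect of changing the macro so that "'" no longer counts as part of a word.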
[20 Oct 2003 9:46] Patrick LIENART
Sergei,

Thanks for the advice. I modified misc_word_char(), recompiled with a 4.0.15 tarball, and it is working fine. In French, "'" really is a spacing character.