Bug #64131 Full-text searching
Submitted: 25 Jan 2012 23:15 Modified: 17 Feb 2012 17:30
Reporter: Edward Leigh Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: FULLTEXT search Severity:S2 (Serious)
Version:5.1.56-log OS:FreeBSD (v8.2)
Assigned to: CPU Architecture:Any
Tags: boolean mode, exact match, full-text, fulltext, stop words, stopwords

[25 Jan 2012 23:15] Edward Leigh
Description:
Full-text searches in boolean mode are returning no results for exact matches that include a stopword even where the phrase includes at least one non-stopword.

We are using a custom stopword list and ft_min_word_len = 1

This bug was not apparent in v.5.0.x and does not appear to have been fixed in later 5.1.x updates.

How to repeat:
The table 'assets' contains a full-text index:
FULLTEXT (keywords, title, caption, credit, location)

This returns no results:

SELECT a.* FROM assets AS a WHERE MATCH (a.keywords, a.title, a.caption, a.credit, a.location) AGAINST ('"history of rome"' IN BOOLEAN MODE);

Take out one of the fields (so the full-text index is not used), and results are found:

SELECT a.* FROM assets AS a WHERE MATCH (a.keywords, a.title, a.caption, a.credit) AGAINST ('"history of rome"' IN BOOLEAN MODE);

Suggested fix:
Ignore stopwords in exact match strings.
[26 Jan 2012 0:45] Edward Leigh
This also returns no results:

SELECT a.* FROM assets AS a WHERE MATCH (a.keywords, a.title, a.caption, a.credit, a.location) AGAINST ('+history +of +rome' IN BOOLEAN MODE);

But this does return results:

SELECT a.* FROM assets AS a WHERE MATCH (a.keywords, a.title, a.caption, a.credit, a.location) AGAINST ('+history of +rome' IN BOOLEAN MODE);
[26 Jan 2012 17:30] Valeriy Kravchuk
Please, send exact CREATE TABLE and small dataset for it to demonstrate the problem.
[26 Jan 2012 19:36] Edward Leigh
I created a test table and couldn't reproduce the problem. I had previously done a 'REPAIR TABLE tbl_name QUICK;' on the problem table, but this time I dropped the index and rebuilt it. The problem went away. So the bug isn't as reported, but rather an obscure corruption of the full-text index that was not detected by REPAIR TABLE or myisamchk.

For reference, this is an extract from the CREATE TABLE for the table in question:

CREATE TABLE `assets` (
  `assetID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `title` varchar(255) NOT NULL DEFAULT '',
  `credit` varchar(255) NOT NULL DEFAULT '',
  `location` varchar(255) NOT NULL DEFAULT '',
  `caption` text NOT NULL,
  `keywords` text NOT NULL,
  PRIMARY KEY (`assetID`),
  FULLTEXT KEY `general` (`keywords`,`title`,`caption`,`credit`,`location`)
) ENGINE=MyISAM AUTO_INCREMENT=340342 DEFAULT CHARSET=utf8
[27 Jan 2012 8:09] Valeriy Kravchuk
Do you have a copy of *.MYI and *.MYD files for the corrupted table, by chance?
[27 Jan 2012 13:30] Edward Leigh
Unfortunately I didn't make a backup of the data files before I rebuilt the index.
[17 Feb 2012 17:30] Sveta Smirnova
Thank you for the feedback.

In this case something wrong with the table. I close the report as "Can't repeat" for now. If you meet same issue again save table files, then reopen the report.