Bug #37888 fulltext boolean mode stop words in quotes with and operator doesn't work
Submitted: 4 Jul 2008 17:35 Modified: 4 Jul 2008 18:18
Reporter: Douglass Davis
Status: Not a Bug
Category:MySQL Server: FULLTEXT search Severity:S3 (Non-critical)
Version:14.7 Distrib 4.1.22, for Win32 (ia32) OS:Windows
Assigned to: CPU Architecture:Any
Tags: fulltext stopword boolean mode

[4 Jul 2008 17:35] Douglass Davis
Description:
When searching fulltext in boolean mode, if a stopword is used within quotes together with the + (required, i.e. AND) operator, no rows are matched, no matter what the rest of the query is.

How to repeat:
CREATE TABLE `content_index` (
  `docid` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `page_text` longtext,
  PRIMARY KEY  (`docid`),
  FULLTEXT KEY `FullTextIdx` (`page_text`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO content_index (page_text) VALUES ('Reset your password');
INSERT INTO content_index (page_text) VALUES ('sign in using your username and password');

The problem: This returns no rows:
SELECT * FROM content_index WHERE match(page_text) against(' password +"your"' IN BOOLEAN MODE);

Even though all of the following return 2 rows as expected:
SELECT * FROM content_index WHERE match(page_text) against(' password "your" ' IN BOOLEAN MODE);
SELECT * FROM content_index WHERE match(page_text) against(' password your' IN BOOLEAN MODE);
SELECT * FROM content_index WHERE match(page_text) against(' password +your' IN BOOLEAN MODE);

Suggested fix:
Even if the stopword is in quotes with a + operator, it should be treated the same way as if it were not in quotes: the stopword should be ignored and the rest of the words should be searched for.

[4 Jul 2008 18:18] Sveta Smirnova
Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://dev.mysql.com/doc/ and the instructions on how to report a bug at http://bugs.mysql.com/how-to-report.php

According to http://dev.mysql.com/doc/refman/4.1/en/fulltext-boolean.html:

# "

A phrase that is enclosed within double quote (“"”) characters matches only rows that contain the phrase literally, as it was typed. The full-text engine splits the phrase into words, performs a search in the FULLTEXT index for the words. The engine then performs a substring search for the phrase in the records that are found, so the match must include non-word characters in the phrase. For example, "test phrase" does not match "test, phrase".

If the phrase contains no words that are in the index, the result is empty. For example, if all words are either stopwords or shorter than the minimum length of indexed words, the result is empty.

and

# +

A leading plus sign indicates that this word *must* be present in each row that is returned.

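So in the first case you ask for the word 'password' and also require the quoted phrase. The quoted phrase contains only a stopword, so its result is empty, and therefore no rows match.

In the following case the quoted phrase is not required, so its empty result does not exclude any rows.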

So I am closing the report as "Not a Bug".
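
To make the reasoning above concrete, here is a minimal Python sketch that models the behaviour described in this report. It is not MySQL's implementation. The stopword set is a tiny illustrative subset, the minimum word length mirrors the default ft_min_word_len of 4, and the rule that a bare (unquoted) stopword is silently dropped is an assumption inferred from the results the reporter observed. All function names are hypothetical.

```python
import re

# Assumption: tiny illustrative subset, not MySQL's built-in stopword list.
STOPWORDS = {"your", "and"}
# Assumption: mirrors the default ft_min_word_len.
MIN_WORD_LEN = 4

DOCS = [
    "Reset your password",
    "sign in using your username and password",
]


def indexed_words(text):
    """Words that would be indexed: long enough and not stopwords."""
    words = re.findall(r"[A-Za-z0-9_']+", text.lower())
    return [w for w in words if len(w) >= MIN_WORD_LEN and w not in STOPWORDS]


def parse(query):
    """Split a boolean-mode query into (required, is_phrase, text) terms."""
    terms = []
    for m in re.finditer(r'(\+?)(?:"([^"]*)"|(\S+))', query):
        is_phrase = m.group(2) is not None
        text = m.group(2) if is_phrase else m.group(3)
        terms.append((m.group(1) == "+", is_phrase, text))
    return terms


def term_matches(is_phrase, text, doc):
    """Return True/False for a match, or None when the term is ignored."""
    words = indexed_words(text)
    if is_phrase:
        # Per the quoted manual text: a phrase containing no indexed
        # words produces an empty result, i.e. it matches nothing.
        if not words:
            return False
        # Simplified stand-in for the substring check on candidate rows.
        return text.lower() in doc.lower()
    if not words:
        # Assumption: a bare stopword is dropped from the query entirely.
        return None
    return words[0] in indexed_words(doc)


def search(query, docs):
    """Return the docs matching the query under this simplified model."""
    hits = []
    for doc in docs:
        results = [(req, term_matches(ph, txt, doc)) for req, ph, txt in parse(query)]
        results = [(req, ok) for req, ok in results if ok is not None]
        required_ok = all(ok for req, ok in results if req)
        any_ok = any(ok for _, ok in results)
        if required_ok and any_ok:
            hits.append(doc)
    return hits


for q in (' password +"your"', ' password "your" ', ' password your', ' password +your'):
    print(repr(q), len(search(q, DOCS)))
```

In this model only the first query returns no rows: the required phrase "your" evaluates to an empty result, and a required term that matches nothing excludes every row. In the other three queries the stopword is either optional or dropped, so 'password' alone decides the match.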