Bug #13747 Full text search depends on number of rows
Submitted: 4 Oct 2005 16:35
Modified: 4 Oct 2005 17:09
Reporter: Lars Beuster
Status: Not a Bug
Category: MySQL Server
Severity: S3 (Non-critical)
Version: 5.0.13-rc
OS: Windows (W2K)
Assigned to: MySQL Verification Team
CPU Architecture: Any

[4 Oct 2005 16:35] Lars Beuster
Description:
The result of a simple full-text search depends on the number of rows in the table, even if the additional rows don't belong to the search result.

An example makes this easier to understand :-)

- I insert a row that should be found by a simple match()
-> not found
- I insert another row that should *not* be found by my match()
-> the *first* row is found (as expected)

I have the default installation on Windows 2000 with the exe-installer.

How to repeat:
DROP TABLE IF EXISTS TEST10;
CREATE TABLE TEST10 (TITLE TEXT, FULLTEXT(TITLE)) ENGINE=MyISAM;

INSERT INTO TEST10 (TITLE) VALUES ('searchstring');
INSERT INTO TEST10 (TITLE) VALUES ('test1');

-- I should find one entry - but that's not the case
-- "Empty set (0.00 sec)"
SELECT * FROM TEST10 WHERE MATCH (TITLE) AGAINST ('searchstring');

-- after inserting this row (that has nothing to do with the search query)
-- I find my searched entry
-- "1 row in set (0.00 sec)"
INSERT INTO TEST10 (title) VALUES ('test2');
SELECT * FROM TEST10 WHERE MATCH (TITLE) AGAINST ('searchstring');
[4 Oct 2005 17:09] MySQL Verification Team
That behavior is explained in the manual:

http://dev.mysql.com/doc/mysql/en/fulltext-boolean.html

See the 50% threshold.

The sample you showed is a special case: full-text search in natural
language (NL) mode uses 'global' statistics to calculate relevance, so even
if you add an unrelated row to the table, the statistics change.
For example, the number of rows is part of the formula.
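
To make the threshold concrete for the reported case: with only the two rows 'searchstring' and 'test1', the word 'searchstring' occurs in 1 of 2 rows, which is exactly 50%, so it gets no weight in natural language mode. A minimal sketch of how this could be checked, assuming the TEST10 table from the report in that two-row state (the `score` alias is only illustrative):

```sql
-- Sketch only: assumes TEST10 holds just 'searchstring' and 'test1'.

-- Natural language mode (the default): 'searchstring' is in 50% of the
-- rows, so its relevance is 0 and the WHERE form of the query finds nothing.
SELECT TITLE, MATCH (TITLE) AGAINST ('searchstring') AS score FROM TEST10;

-- Boolean mode does not apply the 50% threshold, so the row should be
-- returned no matter how many other rows the table holds.
SELECT * FROM TEST10
WHERE MATCH (TITLE) AGAINST ('searchstring' IN BOOLEAN MODE);
```

Once a third, unrelated row is inserted, the word falls below 50% of the rows and the natural language query starts returning it, which matches what the report observed.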
[5 Oct 2005 8:10] Lars Beuster
Thanks for your reply - and sorry for the "bug". 

But I think the manual should stress this fact more. Even once I knew about the 50% threshold, it wasn't easy to find anything about it. This matters especially when you start working with full-text search: you add one row and search, then add another row and search again.

Thanks again
Lars