Bug #47930 MATCH IN BOOLEAN MODE returns too many results inside subquery
Submitted: 9 Oct 2009 7:50 Modified: 12 Mar 2010 17:17
Reporter: James Fryer Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: FULLTEXT search Severity:S3 (Non-critical)
Version:5.0.77, 5.1.31, 5.1.41, 6.0.14 OS:Linux
Assigned to: Georgi Kodinov
Tags: fulltext, subquery
Triage: Triaged: D2 (Serious)

[9 Oct 2009 7:50] James Fryer
Description:
Queries containing a NOT EXISTS subquery with a Boolean fulltext condition return results which should be excluded. 

Removing the fulltext search (replacing with LIKE) or removing/ignoring the fulltext index returns the correct results.

How to repeat:
Import the attached file.

These two queries return different results on 5.0/5.1. I see no reason why they should not return the same results. The query that uses the fulltext index returns results which should be excluded.

1. 
SELECT count(*) FROM LbcTitle t WHERE MATCH(t.title) AGAINST('+miners +strike' IN BOOLEAN MODE) AND not exists(SELECT * FROM Keyword kw JOIN TitleKeyword sk ON kw.id=sk.keyword_id WHERE sk.title_id=t.id AND MATCH(kw.title) AGAINST('scargill' IN BOOLEAN MODE)) order by t.date

2. 
SELECT count(*) FROM LbcTitle t WHERE MATCH(t.title) AGAINST('+miners +strike' IN BOOLEAN MODE) AND not exists(SELECT * FROM Keyword kw IGNORE INDEX (title_2) JOIN TitleKeyword sk ON kw.id=sk.keyword_id WHERE sk.title_id=t.id AND MATCH(kw.title) AGAINST('scargill' IN BOOLEAN MODE)) order by t.date
[9 Oct 2009 7:51] James Fryer
Sample data file for fulltext subquery bug

Attachment: fulltext_subquery_prob.sql (application/octet-stream, text), 5.17 KiB.

[9 Oct 2009 7:52] Valerii Kravchuk
I do not see any file attached, yet...
[9 Oct 2009 22:08] James Fryer
It is there now.
[10 Oct 2009 12:40] Valerii Kravchuk
Verified just as described, also with recent 5.1 and mysql-6.0-codebase from bzr:

77-52-242-160:6.0-codebase openxs$ bin/mysql -uroot test < ~/Downloads/fulltext_subquery_prob.sql 
77-52-242-160:6.0-codebase openxs$ bin/mysql -uroot testReading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 6.0.14-alpha-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SELECT count(*) FROM LbcTitle t WHERE MATCH(t.title) AGAINST('+miners +strike' IN BOOLEAN MODE) AND not exists(SELECT * FROM Keyword kw JOIN TitleKeyword sk ON kw.id=sk.keyword_id WHERE sk.title_id=t.id AND MATCH(kw.title) AGAINST('scargill' IN BOOLEAN MODE)) order by t.date;
+----------+
| count(*) |
+----------+
|       21 |
+----------+
1 row in set (0.00 sec)

mysql> SELECT count(*) FROM LbcTitle t WHERE MATCH(t.title) AGAINST('+miners +strike' IN BOOLEAN
    -> MODE) AND not exists(SELECT * FROM Keyword kw IGNORE INDEX (title_2) JOIN TitleKeyword sk
    -> ON kw.id=sk.keyword_id WHERE sk.title_id=t.id AND MATCH(kw.title) AGAINST('scargill' IN
    -> BOOLEAN MODE)) order by t.date;
+----------+
| count(*) |
+----------+
|       19 |
+----------+
1 row in set (0.00 sec)

mysql> explain SELECT count(*) FROM LbcTitle t WHERE MATCH(t.title) AGAINST('+miners +strike' IN BOOLEAN MODE) AND not exists(SELECT * FROM Keyword kw IGNORE INDEX (title_2) JOIN TitleKeyword sk ON kw.id=sk.keyword_id WHERE sk.title_id=t.id AND MATCH(kw.title) AGAINST('scargill' IN BOOLEAN MODE)) order by t.date\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: t
         type: fulltext
possible_keys: title
          key: title
      key_len: 0
          ref: 
         rows: 1
        Extra: Using where
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: sk
         type: ref
possible_keys: PRIMARY,keyword_id
          key: PRIMARY
      key_len: 4
          ref: test.t.id
         rows: 1
        Extra: Using index
*************************** 3. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: kw
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.sk.keyword_id
         rows: 1
        Extra: Using where
3 rows in set (0.00 sec)

mysql> explain SELECT count(*) FROM LbcTitle t WHERE MATCH(t.title) AGAINST('+miners +strike' IN BOOLEAN MODE) AND not exists(SELECT * FROM Keyword kw JOIN TitleKeyword sk ON kw.id=sk.keyword_id WHERE sk.title_id=t.id AND MATCH(kw.title) AGAINST('scargill' IN BOOLEAN MODE)) order by t.date\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: t
         type: fulltext
possible_keys: title
          key: title
      key_len: 0
          ref: 
         rows: 1
        Extra: Using where
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: kw
         type: fulltext
possible_keys: PRIMARY,title_2
          key: title_2
      key_len: 0
          ref: 
         rows: 1
        Extra: Using where
*************************** 3. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: sk
         type: eq_ref
possible_keys: PRIMARY,keyword_id
          key: PRIMARY
      key_len: 8
          ref: test.t.id,test.kw.id
         rows: 1
        Extra: Using index
3 rows in set (0.00 sec)

mysql> SELECT count(*) FROM LbcTitle t WHERE MATCH(t.title) AGAINST('+miners +strike' IN BOOLEAN MODE) AND not exists(SELECT * FROM Keyword kw JOIN TitleKeyword sk ON kw.id=sk.keyword_id WHERE sk.title_id=t.id AND kw.title LIKE '%scargill%') order by t.date\G
*************************** 1. row ***************************
count(*): 19
1 row in set (0.01 sec)
[27 Oct 2009 12:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/88339

3158 Georgi Kodinov	2009-10-27
      Bug #47930: MATCH IN BOOLEAN MODE returns too many results
        inside subquery
      
      Re-setting a fulltext index was a no-operation if not all
      the matches of a search were consumed by reading them.
      This was preventing a joined table using a fulltext index
      in a subquery that requires only 1 row of output (e.g. EXISTS) 
      from working correctly because the second execution of the 
      sub-query has the fulltext index cursor in a wrong state and
      was not finding results.
      Fixed by making the re-init code _ftb_init_index_search() 
      to re-set open cursors in addition to depleted ones.
[10 Nov 2009 13:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/89951

3195 Georgi Kodinov	2009-10-27
      Bug #47930: MATCH IN BOOLEAN MODE returns too many results
        inside subquery
      
      Re-setting a fulltext index was a no-operation if not all
      the matches of a search were consumed by reading them.
      This was preventing a joined table using a fulltext index
      in a subquery that requires only 1 row of output (e.g. EXISTS) 
      from working correctly because the second execution of the 
      sub-query has the fulltext index cursor in a wrong state and
      was not finding results.
      Fixed by making the re-init code _ftb_init_index_search() 
      to re-set open cursors in addition to depleted ones.
[2 Dec 2009 8:07] Bugs System
Pushed into 5.1.42 (revid:joro@sun.com-20091202080033-mndu4sxwx19lz2zs) (version source revid:joro@sun.com-20091027124312-2bfafz6ccdhzqhyd) (merge vers: 5.1.41) (pib:13)
[10 Dec 2009 2:19] Paul Dubois
Noted in 5.1.42 changelog.

MATCH IN BOOLEAN MODE searches could return too many results inside a
subquery. 

Setting report to NDI pending push to 5.5.x+.
[16 Dec 2009 8:37] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091216083311-xorsasf5kopjxshf) (version source revid:alik@sun.com-20091214191830-wznm8245ku8xo702) (merge vers: 6.0.14-alpha) (pib:14)
[16 Dec 2009 8:44] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091216082430-s0gtzibcgkv4pqul) (version source revid:alexey.kopytov@sun.com-20091124081906-6pqi7e7sajimog71) (merge vers: 5.5.0-beta) (pib:14)
[16 Dec 2009 8:50] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20091216083231-rp8ecpnvkkbhtb27) (version source revid:alik@sun.com-20091212203859-fx4rx5uab47wwuzd) (merge vers: 5.6.0-beta) (pib:14)
[18 Dec 2009 1:50] Paul Dubois
Noted in 5.5.1, 6.0.14 changelogs.
[12 Mar 2010 14:09] Bugs System
Pushed into 5.1.44-ndb-7.0.14 (revid:jonas@mysql.com-20100312135944-t0z8s1da2orvl66x) (version source revid:jonas@mysql.com-20100312115609-woou0te4a6s4ae9y) (merge vers: 5.1.44-ndb-7.0.14) (pib:16)
[12 Mar 2010 14:25] Bugs System
Pushed into 5.1.44-ndb-6.2.19 (revid:jonas@mysql.com-20100312134846-tuqhd9w3tv4xgl3d) (version source revid:jonas@mysql.com-20100312060623-mx6407w2vx76h3by) (merge vers: 5.1.44-ndb-6.2.19) (pib:16)
[12 Mar 2010 14:39] Bugs System
Pushed into 5.1.44-ndb-6.3.33 (revid:jonas@mysql.com-20100312135724-xcw8vw2lu3mijrhn) (version source revid:jonas@mysql.com-20100312103652-snkltsd197l7q2yg) (merge vers: 5.1.44-ndb-6.3.33) (pib:16)