| Bug #74686 | Wrong relevance ranking for InnoDB full text searches under certain conditions | ||
|---|---|---|---|
| Submitted: | 4 Nov 2014 15:29 | Modified: | 10 Mar 2015 17:52 |
| Reporter: | Tim McLaughlin | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: FULLTEXT search | Severity: | S2 (Serious) |
| Version: | 5.6.21 | OS: | Linux |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | full text, fulltext, innodb, unique index | ||
[4 Nov 2014 16:20]
MySQL Verification Team
C:\dbs>c:\dbs\5.6\bin\mysql -uroot --port=3560 --debug-info --prompt="mysql 5.6 > "
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.22-log Source distribution
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql 5.6 > CREATE DATABASE dx;
Query OK, 1 row affected (0.00 sec)
mysql 5.6 > USE dx
Database changed
mysql 5.6 > CREATE TABLE `characters` (
-> `id` int(11) NOT NULL AUTO_INCREMENT,
-> `first_name` varchar(50) NOT NULL,
-> `last_name` varchar(50) NOT NULL,
-> PRIMARY KEY (`id`),
-> UNIQUE KEY `idx_1` (`first_name`, `last_name`),
-> FULLTEXT KEY `idx_2` (`first_name`)
-> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.31 sec)
mysql 5.6 > INSERT INTO `characters` (`id`, `first_name`, `last_name`) VALUES
-> (1, 'Bart', 'Simpson'),
-> (2, 'Homer', 'Simpson'),
-> (3, 'Marge', 'Simpson'),
-> (4, 'Lisa', 'Simpson'),
-> (5, 'Maggie', 'Simpson'),
-> (6, 'Ned', 'Flanders'),
-> (7, 'Nelson', 'Muntz');
Query OK, 7 rows affected (0.03 sec)
Records: 7 Duplicates: 0 Warnings: 0
mysql 5.6 > # Perform a full text search for "Homer".
mysql 5.6 > # This returns the wrong results; Homer Simpson's score should be > 0
mysql 5.6 > SELECT id, first_name, last_name, MATCH(first_name) AGAINST('Homer' IN BOOLEAN MODE) AS score FROM characters;
+----+------------+-----------+-------+
| id | first_name | last_name | score |
+----+------------+-----------+-------+
| 1 | Bart | Simpson | 0 |
| 2 | Homer | Simpson | 0 |
| 4 | Lisa | Simpson | 0 |
| 5 | Maggie | Simpson | 0 |
| 3 | Marge | Simpson | 0 |
| 6 | Ned | Flanders | 0 |
| 7 | Nelson | Muntz | 0 |
+----+------------+-----------+-------+
7 rows in set (0.02 sec)
mysql 5.6 > ALTER TABLE characters ENGINE=MyISAM;
Query OK, 7 rows affected (0.30 sec)
Records: 7 Duplicates: 0 Warnings: 0
mysql 5.6 > SELECT id, first_name, last_name, MATCH(first_name) AGAINST('Homer' IN BOOLEAN MODE) AS score FROM characters;
+----+------------+-----------+-------+
| id | first_name | last_name | score |
+----+------------+-----------+-------+
| 1 | Bart | Simpson | 0 |
| 2 | Homer | Simpson | 1 |
| 3 | Marge | Simpson | 0 |
| 4 | Lisa | Simpson | 0 |
| 5 | Maggie | Simpson | 0 |
| 6 | Ned | Flanders | 0 |
| 7 | Nelson | Muntz | 0 |
+----+------------+-----------+-------+
7 rows in set (0.00 sec)
[4 Nov 2014 16:24]
MySQL Verification Team
Thank you for the bug report. Not repeatable on 5.7 source server.
C:\dbs>c:\dbs\5.7\bin\mysql -uroot --port=3570 --prompt="mysql 5.7 > "
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.6-m16 Source distribution
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql 5.7 > CREATE DATABASE dx;
Query OK, 1 row affected (0.00 sec)
mysql 5.7 > USE dx
Database changed
mysql 5.7 > CREATE TABLE `characters` (
-> `id` int(11) NOT NULL AUTO_INCREMENT,
-> `first_name` varchar(50) NOT NULL,
-> `last_name` varchar(50) NOT NULL,
-> PRIMARY KEY (`id`),
-> UNIQUE KEY `idx_1` (`first_name`, `last_name`),
-> FULLTEXT KEY `idx_2` (`first_name`)
-> ) ENGINE=InnoDB;
Query OK, 0 rows affected (1.72 sec)
mysql 5.7 > INSERT INTO `characters` (`id`, `first_name`, `last_name`) VALUES
-> (1, 'Bart', 'Simpson'),
-> (2, 'Homer', 'Simpson'),
-> (3, 'Marge', 'Simpson'),
-> (4, 'Lisa', 'Simpson'),
-> (5, 'Maggie', 'Simpson'),
-> (6, 'Ned', 'Flanders'),
-> (7, 'Nelson', 'Muntz');
Query OK, 7 rows affected (0.05 sec)
Records: 7 Duplicates: 0 Warnings: 0
mysql 5.7 > # Perform a full text search for "Homer".
mysql 5.7 > # This returns the wrong results; Homer Simpson's score should be > 0
mysql 5.7 > SELECT id, first_name, last_name, MATCH(first_name) AGAINST('Homer' IN BOOLEAN MODE) AS score FROM characters;
+----+------------+-----------+--------------------+
| id | first_name | last_name | score |
+----+------------+-----------+--------------------+
| 1 | Bart | Simpson | 0 |
| 2 | Homer | Simpson | 0.7141907215118408 |
| 3 | Marge | Simpson | 0 |
| 4 | Lisa | Simpson | 0 |
| 5 | Maggie | Simpson | 0 |
| 6 | Ned | Flanders | 0 |
| 7 | Nelson | Muntz | 0 |
+----+------------+-----------+--------------------+
7 rows in set (0.00 sec)
[10 Mar 2015 17:52]
Paul DuBois
Noted in 5.6.25, 5.7.7, 5.8.0 changelogs. For full-text searches, the optimizer could choose an index not appropriate for producing correct relevancy rankings.
[23 Jun 2015 15:26]
Laurynas Biveinis
commit 27362717a24d4563774f2a99caab7bf13546da93
Author: Chaithra Gopalareddy <chaithra.gopalareddy@oracle.com>
Date: Thu Feb 26 15:53:52 2015 +0530
Bug #19950568: WRONG RELEVANCE RANKING FOR INNODB FULL TEXT SEARCHES UNDER
CERTAIN CONDITIONS
Problem:
Optimizer chooses a wrong index to fetch the doc_id which is
present only in a full-text index
Analysis:
For the testcase presented, score can be got only when idx2 is
used. But optimizer checks that idx_1 is covering and sets
that index to be used. Since doc_id is needed to get score,
and as doc_id can only be got from idx_2, optimizer
should not set idx_1 as covering.
Solution:
This is backport from 5.7. As part of the solution a new field
is added in table structure and is set when FTS_DOC_ID field
is present. no_keyread is set to true in this case indicating
that no key can be set as covering.
[23 Jun 2015 15:32]
Laurynas Biveinis
commit 76568a2799764fff566a95edcd3b7c8aee9a075d
Author: Chaithra Gopalareddy <chaithra.gopalareddy@oracle.com>
Date: Mon Mar 2 13:47:00 2015 +0530
Bug#19950568 - WRONG RELEVANCE RANKING FOR INNODB FULL TEXT SEARCHES
UNDER CERTAIN CONDITIONS
Post push fix for test case failure

Description: An InnoDB boolean mode full text search produces the wrong relevancy ranking in some cases. It appears to happen when there's a separate unique index and full text index that includes the same field. How to repeat: DROP TABLE IF EXISTS `characters`; # Create a table with a fulltext and unique index that includes # the same column (i.e. first_name) CREATE TABLE `characters` ( `id` int(11) NOT NULL AUTO_INCREMENT, `first_name` varchar(50) NOT NULL, `last_name` varchar(50) NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `idx_1` (`first_name`, `last_name`), FULLTEXT KEY `idx_2` (`first_name`) ) ENGINE=InnoDB; INSERT INTO `characters` (`id`, `first_name`, `last_name`) VALUES (1, 'Bart', 'Simpson'), (2, 'Homer', 'Simpson'), (3, 'Marge', 'Simpson'), (4, 'Lisa', 'Simpson'), (5, 'Maggie', 'Simpson'), (6, 'Ned', 'Flanders'), (7, 'Nelson', 'Muntz'); # Perform a full text search for "Homer". # This returns the wrong results; Homer Simpson's score should be > 0 SELECT id, first_name, last_name, MATCH(first_name) AGAINST('Homer' IN BOOLEAN MODE) AS score FROM characters; +----+------------+-----------+-------+ | id | first_name | last_name | score | +----+------------+-----------+-------+ | 1 | Bart | Simpson | 0 | | 2 | Homer | Simpson | 0 | | 4 | Lisa | Simpson | 0 | | 5 | Maggie | Simpson | 0 | | 3 | Marge | Simpson | 0 | | 6 | Ned | Flanders | 0 | | 7 | Nelson | Muntz | 0 | +----+------------+-----------+-------+ # Run the same full text search as above except add "ORDER BY id". # This returns the expected results. SELECT id, first_name, last_name, MATCH(first_name) AGAINST('Homer' IN BOOLEAN MODE) AS score FROM characters ORDER BY id; +----+------------+-----------+--------------------+ | id | first_name | last_name | score | +----+------------+-----------+--------------------+ | 1 | Bart | Simpson | 0 | | 2 | Homer | Simpson | 0.7141907215118408 | | 3 | Marge | Simpson | 0 | | 4 | Lisa | Simpson | 0 | | 5 | Maggie | Simpson | 0 | | 6 | Ned | Flanders | 0 | | 7 | Nelson | Muntz | 0 | +----+------------+-----------+--------------------+ # Run the same full text search as above except add "ORDER BY score". # This returns the expected results. SELECT id, first_name, last_name, MATCH(first_name) AGAINST('Homer' IN BOOLEAN MODE) AS score FROM characters ORDER BY score; +----+------------+-----------+--------------------+ | id | first_name | last_name | score | +----+------------+-----------+--------------------+ | 1 | Bart | Simpson | 0 | | 3 | Marge | Simpson | 0 | | 4 | Lisa | Simpson | 0 | | 5 | Maggie | Simpson | 0 | | 6 | Ned | Flanders | 0 | | 7 | Nelson | Muntz | 0 | | 2 | Homer | Simpson | 0.7141907215118408 | +----+------------+-----------+--------------------+ # Repeat the original full text search # Wrong results; this time all scores are > 0 SELECT id, first_name, last_name, MATCH(first_name) AGAINST('Homer' IN BOOLEAN MODE) AS score FROM characters; +----+------------+-----------+--------------------+ | id | first_name | last_name | score | +----+------------+-----------+--------------------+ | 1 | Bart | Simpson | 0.7141907215118408 | | 2 | Homer | Simpson | 0.7141907215118408 | | 4 | Lisa | Simpson | 0.7141907215118408 | | 5 | Maggie | Simpson | 0.7141907215118408 | | 3 | Marge | Simpson | 0.7141907215118408 | | 6 | Ned | Flanders | 0.7141907215118408 | | 7 | Nelson | Muntz | 0.7141907215118408 | +----+------------+-----------+--------------------+ # Drop the unique index ALTER TABLE `characters` DROP INDEX `idx_1`; # The original full text search returns the expected results SELECT id, first_name, last_name, MATCH(first_name) AGAINST('Homer' IN BOOLEAN MODE) AS score FROM characters; +----+------------+-----------+--------------------+ | id | first_name | last_name | score | +----+------------+-----------+--------------------+ | 1 | Bart | Simpson | 0 | | 2 | Homer | Simpson | 0.7141907215118408 | | 3 | Marge | Simpson | 0 | | 4 | Lisa | Simpson | 0 | | 5 | Maggie | Simpson | 0 | | 6 | Ned | Flanders | 0 | | 7 | Nelson | Muntz | 0 | +----+------------+-----------+--------------------+ # The full text search with the "ORDER BY score" clause returns # the expected results as well. SELECT id, first_name, last_name, MATCH(first_name) AGAINST('Homer' IN BOOLEAN MODE) AS score FROM characters ORDER BY score; +----+------------+-----------+--------------------+ | id | first_name | last_name | score | +----+------------+-----------+--------------------+ | 1 | Bart | Simpson | 0 | | 3 | Marge | Simpson | 0 | | 4 | Lisa | Simpson | 0 | | 5 | Maggie | Simpson | 0 | | 6 | Ned | Flanders | 0 | | 7 | Nelson | Muntz | 0 | | 2 | Homer | Simpson | 0.7141907215118408 | +----+------------+-----------+--------------------+ DROP TABLE IF EXISTS `characters`;