Bug #102999 InnoDB Fulltext search doesn't show some records after rollback to savepoint
Submitted: 16 Mar 2021 11:26 Modified: 21 Apr 2021 18:54
Reporter: Frank Ma Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: FULLTEXT search Severity:S3 (Non-critical)
Version:5.7.33 OS:Any
Assigned to: CPU Architecture:Any

[16 Mar 2021 11:26] Frank Ma
Description:
Problem
--------
There is a bug with savepoint handling of InnoDB FULLL-TEXT search, if several savepoints with the same
name are made within a transaction, after transaction rollbacks to some other savepoint (with different
name) and commits, some records may be missing in the full text search result.

In the repro steps provided below, 

1) "SELECT * FROM articles;" shows two records with id value 1 and 2,
2) "SELECT * FROM articles WHERE MATCH(title) AGAINST('mysql');" only shows 1 record with id value 1,
   while 2 records are expected.

Workaround
----------
1. Make savepoints with different (unique) names

How to repeat:

-------------------- repro SQL scripts begin ----------------------------
CREATE TABLE articles (
    FTS_DOC_ID BIGINT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
    id INT, 
    title VARCHAR(200),
    FULLTEXT (title))
ENGINE= InnoDB;

BEGIN;

INSERT INTO articles(id, title) VALUES(1, 'mysql');

SAVEPOINT p0;
SAVEPOINT p1;

INSERT INTO articles(id, title) VALUES(2, 'mysql');

SAVEPOINT p2;
SAVEPOINT p1;

INSERT INTO articles(id, title) VALUES(3, 'mysql');

ROLLBACK TO SAVEPOINT p2;

COMMIT;

SELECT * FROM articles;

SELECT * FROM articles WHERE MATCH(title) AGAINST('mysql');

-------------------- repro SQL scripts end ----------------------------

-------------------- SQL scripts output begin ----------------------------

mysql> SELECT * FROM articles;
+------------+------+-------+
| FTS_DOC_ID | id   | title |
+------------+------+-------+
|          1 |    1 | mysql |
|          2 |    2 | mysql |
+------------+------+-------+
2 rows in set (0.00 sec)

mysql>
mysql> SELECT * FROM articles WHERE MATCH(title) AGAINST('mysql');
+------------+------+-------+
| FTS_DOC_ID | id   | title |
+------------+------+-------+
|          1 |    1 | mysql |
+------------+------+-------+
1 row in set (0.00 sec)

mysql> SET GLOBAL innodb_ft_aux_table="sbtest/articles";
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE;
+-------+--------------+-------------+-----------+--------+----------+
| WORD  | FIRST_DOC_ID | LAST_DOC_ID | DOC_COUNT | DOC_ID | POSITION |
+-------+--------------+-------------+-----------+--------+----------+
| mysql |            1 |           1 |         1 |      1 |        0 |
+-------+--------------+-------------+-----------+--------+----------+
1 row in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE;
Empty set (0.00 sec)

-------------------- SQL scripts output end ----------------------------

Suggested fix:
Analysis
--------
When taking a savepoint, if a savepoint with the same name already exist, it will be released firstly,
then a new savepoint is created based on current transaction status.

Therefore, in the repro scripts, when "SAVEPOINT p1;" is run for the second time, previously created
savepoint with name "p1" would be released, since that savepoint contains references to records with
id value 1 (copy from savepoint p1) and 2 (created when record with id value 2 is inserted), now no
older savepoint (like savepoint p0) contains reference to record with id value 2.

Later, when running "ROLLBACK TO SAVEPOINT p2;", both savepont p1 and newly created p1 are released,
and p2 is re-created based on the current previous saveponit p0, which only have reference to record
with id value 1, that's why only one record shows up in FTS index cache.

Suggested fix
-------------
Currently, when releasing a savepoint, either due to "RELEASE SAVEPOINT xxx" or "SAVEPOINT" command, 
if it's the last savepoint, save all its red-black trees to the previous savepoint.

This should be done to any savepoint to be released, not just the last one.
[17 Mar 2021 13:14] MySQL Verification Team
Hi Mr. Ma,

Thank you for your bug report.

However, this is not a bug. SQL standard specifies in the sub-chapter 5.4, that savepoint identifiers must be unique within a transaction. Hence, the best that we can do is to verify this report as a feature request. This would lead to new behaviour, which would return the general error on wrongly set identifiers within a transaction.

However, that feature will not be implemented that soon, since it would break a number of applications.

So, it is either a feature request or "Not a bug".
[19 Mar 2021 2:57] Frank Ma
Well, the facts are

1) using same savepoint name within a transaction is allowed, when a savepoint is taken, the existing one with the same name will be
   erased, this is what innodb does, as well as other db vendors 

2) innodb engine works OK, from the test output mentioned in the repro, stattement "SELECT * FROM articles;" shows two records, which
   is expected, this also confirms 1)

3) the problem here is only with full-text search, references to some records are lost when releasing the savepoint maintained internally
   in fts, this is the root cause.
[19 Mar 2021 13:15] MySQL Verification Team
Hi,

Please reply to the questions that we have asked in our previous comment.
[20 Apr 2021 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".