Description:
After a crash, InnoDB Full-Text Search (FTS) recovery may silently skip the first unsynced document, causing it to become invisible to full-text queries even though it exists in the base table.
Root cause:
fts_update_sync_doc_id() (fts0fts.cc line 3002) writes (doc_id + 1) to the FTS CONFIG table:
id_len = snprintf((char *)id, sizeof(id), FTS_DOC_ID_FORMAT, doc_id + 1);
During crash recovery, fts_init_index() (fts0fts.cc line 6302-6303) calls fts_cmp_set_sync_doc_id(table, 0, true, &start_doc) with read_only=true. In the read_only path (fts0fts.cc line 2902-2904), the function returns the raw value read from the CONFIG table WITHOUT subtracting 1:
if (read_only) {
goto func_exit; // returns doc_id + 1 as-is
}
Note that in the read_only=false path (fts0fts.cc line 2906-2907), the code correctly subtracts 1:
if (doc_id_cmp == 0 && *doc_id) {
cache->synced_doc_id = *doc_id - 1; // correct adjustment
}
The start_doc is then used as the lower bound for the recovery query (fts0fts.cc line 3964):
SELECT ... FROM $table_name WHERE FTS_DOC_ID > :doc_id
Example:
If 3 documents (FTS_DOC_ID = 1, 2, 3) are synced, the CONFIG table stores synced_doc_id = 4 (i.e. 3 + 1). Then a document with FTS_DOC_ID = 4 is inserted but not synced before a crash occurs.
- Without fix: start_doc = 4, recovery query is "FTS_DOC_ID > 4", document 4 is SKIPPED
- With fix: start_doc = 4 - 1 = 3, recovery query is "FTS_DOC_ID > 3", document 4 is RECOVERED
The impact is that users may observe full-text search results that are inconsistent with the base table data after a crash. A "SELECT ... WHERE MATCH(...) AGAINST(...)" will not find the document, while a regular "SELECT" shows it exists. This inconsistency persists until the next OPTIMIZE TABLE or a manual FTS index rebuild.
How to repeat:
Run the following test script to reproduce the issue. After crash recovery, the document with FTS_DOC_ID=4 remains present in the base table but is no longer discoverable through full-text search:
SELECT * FROM t1 WHERE MATCH(content) AGAINST('beginning')
--echo #
--echo # Test: fts_init_index synced_doc_id off-by-one fix
--echo # Verify that after sync + crash, the recovery path correctly
--echo # restores documents that were inserted after the last sync.
--echo # The bug was that fts_update_sync_doc_id writes (doc_id + 1) to
--echo # CONFIG table, but fts_init_index read it back without subtracting 1,
--echo # causing the first unsynced document to be skipped.
--echo #
CREATE TABLE t1 (
FTS_DOC_ID BIGINT UNSIGNED NOT NULL PRIMARY KEY,
content TEXT(500),
title VARCHAR(200)
) ENGINE=InnoDB;
CREATE FULLTEXT INDEX ft_content ON t1(content);
CREATE FULLTEXT INDEX ft_title ON t1(title);
INSERT INTO t1(FTS_DOC_ID, content, title) VALUES
(1, 'The quick brown fox jumps over the lazy dog', 'Fox Story'),
(2, 'A journey of a thousand miles begins with a single step', 'Journey'),
(3, 'To be or not to be that is the question', 'Hamlet');
--echo # Verify initial FTS search works
SELECT * FROM t1 WHERE MATCH(content) AGAINST('fox');
SELECT * FROM t1 WHERE MATCH(content) AGAINST('journey');
--echo # Force sync via OPTIMIZE TABLE to flush FTS cache to disk
--echo # After this, CONFIG table synced_doc_id = max_doc_id(3) + 1 = 4
SET GLOBAL innodb_optimize_fulltext_only=ON;
OPTIMIZE TABLE t1;
SET GLOBAL innodb_optimize_fulltext_only=OFF;
--echo # Now insert a new row with FTS_DOC_ID=4 - this one will NOT be synced
INSERT INTO t1(FTS_DOC_ID, content, title) VALUES
(4, 'In the beginning God created the heavens and the earth', 'Genesis');
--echo # Verify it exists in the table
SELECT * FROM t1 WHERE FTS_DOC_ID = 4;
--echo # Kill the server without giving FTS a chance to sync the new row
--source include/kill_mysqld.inc
--echo # Restart the server - fts_init_index will be triggered on first FTS query
--source include/start_mysqld.inc
--echo # After restart, fts_init_index reads synced_doc_id from CONFIG table.
--echo # CONFIG table has value 4 (= actual synced max_doc_id 3 + 1).
--echo # Without the fix: start_doc = 4, query "FTS_DOC_ID > 4" skips doc_id=4
--echo # With the fix: start_doc = 4 - 1 = 3, query "FTS_DOC_ID > 3" recovers doc_id=4
--echo # Verify previously synced documents are still searchable
SELECT * FROM t1 WHERE MATCH(content) AGAINST('fox');
SELECT * FROM t1 WHERE MATCH(content) AGAINST('journey');
--echo # Critical check: verify the unsynced document (FTS_DOC_ID=4) is recovered
SELECT * FROM t1 WHERE MATCH(content) AGAINST('beginning');
SELECT * FROM t1 WHERE MATCH(content) AGAINST('heavens');
--echo # Verify all rows are present
SELECT * FROM t1;
--echo # Cleanup
DROP TABLE t1;
Suggested fix:
In fts_init_index() (storage/innobase/fts/fts0fts.cc), after reading start_doc via fts_cmp_set_sync_doc_id() in read-only mode, subtract 1 to compensate for the +1 that fts_update_sync_doc_id() adds when persisting.
Description: After a crash, InnoDB Full-Text Search (FTS) recovery may silently skip the first unsynced document, causing it to become invisible to full-text queries even though it exists in the base table. Root cause: fts_update_sync_doc_id() (fts0fts.cc line 3002) writes (doc_id + 1) to the FTS CONFIG table: id_len = snprintf((char *)id, sizeof(id), FTS_DOC_ID_FORMAT, doc_id + 1); During crash recovery, fts_init_index() (fts0fts.cc line 6302-6303) calls fts_cmp_set_sync_doc_id(table, 0, true, &start_doc) with read_only=true. In the read_only path (fts0fts.cc line 2902-2904), the function returns the raw value read from the CONFIG table WITHOUT subtracting 1: if (read_only) { goto func_exit; // returns doc_id + 1 as-is } Note that in the read_only=false path (fts0fts.cc line 2906-2907), the code correctly subtracts 1: if (doc_id_cmp == 0 && *doc_id) { cache->synced_doc_id = *doc_id - 1; // correct adjustment } The start_doc is then used as the lower bound for the recovery query (fts0fts.cc line 3964): SELECT ... FROM $table_name WHERE FTS_DOC_ID > :doc_id Example: If 3 documents (FTS_DOC_ID = 1, 2, 3) are synced, the CONFIG table stores synced_doc_id = 4 (i.e. 3 + 1). Then a document with FTS_DOC_ID = 4 is inserted but not synced before a crash occurs. - Without fix: start_doc = 4, recovery query is "FTS_DOC_ID > 4", document 4 is SKIPPED - With fix: start_doc = 4 - 1 = 3, recovery query is "FTS_DOC_ID > 3", document 4 is RECOVERED The impact is that users may observe full-text search results that are inconsistent with the base table data after a crash. A "SELECT ... WHERE MATCH(...) AGAINST(...)" will not find the document, while a regular "SELECT" shows it exists. This inconsistency persists until the next OPTIMIZE TABLE or a manual FTS index rebuild. How to repeat: Run the following test script to reproduce the issue. After crash recovery, the document with FTS_DOC_ID=4 remains present in the base table but is no longer discoverable through full-text search: SELECT * FROM t1 WHERE MATCH(content) AGAINST('beginning') --echo # --echo # Test: fts_init_index synced_doc_id off-by-one fix --echo # Verify that after sync + crash, the recovery path correctly --echo # restores documents that were inserted after the last sync. --echo # The bug was that fts_update_sync_doc_id writes (doc_id + 1) to --echo # CONFIG table, but fts_init_index read it back without subtracting 1, --echo # causing the first unsynced document to be skipped. --echo # CREATE TABLE t1 ( FTS_DOC_ID BIGINT UNSIGNED NOT NULL PRIMARY KEY, content TEXT(500), title VARCHAR(200) ) ENGINE=InnoDB; CREATE FULLTEXT INDEX ft_content ON t1(content); CREATE FULLTEXT INDEX ft_title ON t1(title); INSERT INTO t1(FTS_DOC_ID, content, title) VALUES (1, 'The quick brown fox jumps over the lazy dog', 'Fox Story'), (2, 'A journey of a thousand miles begins with a single step', 'Journey'), (3, 'To be or not to be that is the question', 'Hamlet'); --echo # Verify initial FTS search works SELECT * FROM t1 WHERE MATCH(content) AGAINST('fox'); SELECT * FROM t1 WHERE MATCH(content) AGAINST('journey'); --echo # Force sync via OPTIMIZE TABLE to flush FTS cache to disk --echo # After this, CONFIG table synced_doc_id = max_doc_id(3) + 1 = 4 SET GLOBAL innodb_optimize_fulltext_only=ON; OPTIMIZE TABLE t1; SET GLOBAL innodb_optimize_fulltext_only=OFF; --echo # Now insert a new row with FTS_DOC_ID=4 - this one will NOT be synced INSERT INTO t1(FTS_DOC_ID, content, title) VALUES (4, 'In the beginning God created the heavens and the earth', 'Genesis'); --echo # Verify it exists in the table SELECT * FROM t1 WHERE FTS_DOC_ID = 4; --echo # Kill the server without giving FTS a chance to sync the new row --source include/kill_mysqld.inc --echo # Restart the server - fts_init_index will be triggered on first FTS query --source include/start_mysqld.inc --echo # After restart, fts_init_index reads synced_doc_id from CONFIG table. --echo # CONFIG table has value 4 (= actual synced max_doc_id 3 + 1). --echo # Without the fix: start_doc = 4, query "FTS_DOC_ID > 4" skips doc_id=4 --echo # With the fix: start_doc = 4 - 1 = 3, query "FTS_DOC_ID > 3" recovers doc_id=4 --echo # Verify previously synced documents are still searchable SELECT * FROM t1 WHERE MATCH(content) AGAINST('fox'); SELECT * FROM t1 WHERE MATCH(content) AGAINST('journey'); --echo # Critical check: verify the unsynced document (FTS_DOC_ID=4) is recovered SELECT * FROM t1 WHERE MATCH(content) AGAINST('beginning'); SELECT * FROM t1 WHERE MATCH(content) AGAINST('heavens'); --echo # Verify all rows are present SELECT * FROM t1; --echo # Cleanup DROP TABLE t1; Suggested fix: In fts_init_index() (storage/innobase/fts/fts0fts.cc), after reading start_doc via fts_cmp_set_sync_doc_id() in read-only mode, subtract 1 to compensate for the +1 that fts_update_sync_doc_id() adds when persisting.