Bug #117436 PCursor::move_to_next_block may skip records incorrectly
Submitted: 11 Feb 7:22 Modified: 12 Feb 3:46
Reporter: Ruyi Zhang (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: cursor parallel-scan

[11 Feb 7:22] Ruyi Zhang
Description:
Hello, Our team (Bytedance ndb) recently analyzed the Release Notes of 8.0.41 and are glad to see that some bugs affecting data correctness have been fixed. 

However, our analysis found that the following problem still seems to exist.

```
InnoDB: An ALTER TABLE operation that rebuilt an InnoDB table using the INPLACE algorithm potentially led to losing a row of data if a purge occurred concurrently on the altered table that contained a clustered or spatial index.
```

The corresponding commit: https://github.com/mysql/mysql-server/commit/1badccd9eabddb16a13e0816cd95ed102dcb303d.

In the function dberr_t `PCursor::move_to_next_block(dict_index_t *index)`, new handling for `save_previous_user_record_as_last_processed` and `restore_to_first_unprocessed` when `m_read_level`==0 has been added.

When this function is called, the cursor always points to the `supremum`. 

Therefore, if no records on the page have been deleted, the cursor will still point to the `supremum` after the `restore_position` call.

The abnormal case is: if SMO occurs before `restore_to_first_unprocessed`, that is, the records in the next block are merged into the current block. 

After `restore_to_first_unprocessed`, `PCursor::move_to_next_block` will incorrectly ignore the records moved during the SMO because the cursor still points to the `supremum`.

How to repeat:
Code analysis
[11 Feb 12:21] MySQL Verification Team
Hi Mr. Zhang,

Thank you for your bug report.

We have analysed your report closely.

We agree with your findings.

This is now a bug verified by code analysis.
[12 Feb 3:46] Ruyi Zhang
Later, we wrote an MTR to confirm this issue:

We modified the `Parallel_reader::add_scan` so that it always generates only one range. 

This will facilitate us to reproduce the issue in scenarios where the height of the BTREE is relatively low.

After record 9 is deleted, the records in leaf 6 will be merged to the left into leaf 7, and leaf 6 will be reclaimed. 

`restore_to_first_unprocessed` fails to properly handle the records merged during the `SMO`, resulting in the loss of these records during the scan.

```
diff --git a/mysql-test/suite/innodb/t/parallel_scan_miss_rec.test b/mysql-test/suite/innodb/t/parallel_scan_miss_rec.test
index e69de29bb2d..6b0d12bbe53 100644
--- a/mysql-test/suite/innodb/t/parallel_scan_miss_rec.test
+++ b/mysql-test/suite/innodb/t/parallel_scan_miss_rec.test
@@ -0,0 +1,50 @@
+# after insert 1,2,3,7,8,10
+#           root(4)
+# leaf(5)[1,2] - leaf(6)[3,7,8,10]
+
+# after insert 9
+#                root(4)
+# leaf(5)[1,2] - leaf(7)[3,7] - leaf(6)[8,9,10]
+
+# after delete 9
+#
+# leaf(5)[1,2] - leaf(7)[3,7,8,10]
+
+--source include/have_debug.inc
+--source include/have_debug_sync.inc
+
+--connect(con1, localhost, root,,)
+
+--connection default
+create table page_test(id int primary key, val varchar(3300));
+
+insert into page_test values (1, repeat('a', 3300));
+insert into page_test values (2, repeat('a', 3300));
+insert into page_test values (3, repeat('a', 3300));
+insert into page_test values (7, repeat('a', 3300));
+insert into page_test values (8, repeat('a', 3300));
+insert into page_test values (10, repeat('a', 3300));
+
+SET GLOBAL innodb_purge_stop_now = ON;
+insert into page_test values (9, repeat('a', 3300));
+delete from page_test where id=9;
+
+--connection con1
+set global debug="+d,pcursor_move_to_next_block_release_latches";
+set DEBUG_SYNC="pcursor_move_to_next_block_latches_released SIGNAL latches_released WAIT_FOR continue EXECUTE 2";
+send ALTER TABLE page_test ENGINE=InnoDB, ALGORITHM=INPLACE;
+--connection default
+set DEBUG_SYNC="now WAIT_FOR latches_released";
+set DEBUG_SYNC="now SIGNAL continue";
+
+set DEBUG_SYNC="now WAIT_FOR latches_released";
+SET GLOBAL innodb_purge_run_now = ON;
+--source include/wait_innodb_all_purged.inc
+set DEBUG_SYNC="now SIGNAL continue";
+
+--connection con1
+reap;
+--connection default
+set global debug="-d,pcursor_move_to_next_block_release_latches";
+select id from page_test;
+drop table page_test;
\ No newline at end of file
diff --git a/storage/innobase/row/row0pread.cc b/storage/innobase/row/row0pread.cc
index 46e1075763e..c471a866c0d 100644
--- a/storage/innobase/row/row0pread.cc
+++ b/storage/innobase/row/row0pread.cc
@@ -1549,6 +1549,11 @@ dberr_t Parallel_reader::add_scan(trx_t *trx,
     return (err);
   }

+ if (ranges.size() > 0) {
+   ranges.erase(ranges.begin() + 1, ranges.end());
+   ranges.back().second = std::make_shared<Parallel_reader::Scan_ctx::Iter>();
+ }
+
   err = scan_ctx->create_contexts(ranges);

   scan_ctx->index_s_unlock();
```

After the MTR runs the INPLACE DDL, records 8 and 10 will be missing from the new table.