Bug #115518 Lost a record while scanning in reverse order on cursor
Submitted: 5 Jul 3:26 Modified: 5 Jul 8:28
Reporter: Ruyi Zhang (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:8.0.38,5.7.44,8.4,9.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: Contribution, cursor

[5 Jul 3:26] Ruyi Zhang
Description:
We had a cursor consistency test for replicas(based on redo replication), but we found some inconsistent issues with reverse scans that affected MySQL. Our test program will perform the following checks on a frequently updated table:
create table t(id int primary key, val varchar(3000));
```
set session transaction isolation level repeatable read;
begin;
select id from t order by id asc; #id_set1
select id from t order by id desc; #id_set2
commit;
check if #id_set1 == #id_set2
```
There are two situations in the test where records will be lost during reverse scanning.
1. #114248, If the prev leaf is freed and the record is moved to the current page during move_backward_from_page, the moved records will be lost.
2. If the last user record on the prev leaf is pessimistically updated during move_backward_from_page, the record may be deleted from the prev leaf and reinserted into the current page. Scanning in reverse order will miss this record.

How to repeat:
Scenario 2 can be reproduced by the following MTR:
diff --git a/storage/innobase/btr/btr0pcur.cc b/storage/innobase/btr/btr0pcur.cc
index 8745ead0124..7a733348001 100644
--- a/storage/innobase/btr/btr0pcur.cc
+++ b/storage/innobase/btr/btr0pcur.cc
@@ -375,6 +375,12 @@ void btr_pcur_t::move_backward_from_page(mtr_t *mtr) {

   mtr_commit(mtr);

+  DBUG_EXECUTE_IF("desc_scan_debug", {
+    if (strcmp(index()->table_name, "test/c") == 0) {
+      DEBUG_SYNC_C("desc_scan_before_restore_position");
+    }
+  });
+
   mtr_start(mtr);

   restore_position(latch_mode2, mtr, UT_LOCATION_HERE);
```
--source include/have_debug.inc

connect (con1, localhost, root,,);
--connection default
create table c(id int primary key, val varchar(8000));
insert into c values (1,repeat('a',3000));
insert into c values (2,repeat('a',3000));
insert into c values (3,repeat('a',3000));
insert into c values (4,repeat('a',3000));
insert into c values (5,repeat('a',3000));

insert into c values (-1,repeat('a',2000));
insert into c values (-2,repeat('a',2000));
insert into c values (-3,repeat('a',2000));
insert into c values (-4,repeat('a',2000));
insert into c values (-5,repeat('a',2000));

--connection con1
select id,length(val) from c order by id desc;
set DEBUG="+d,desc_scan_debug";
set DEBUG_SYNC="desc_scan_before_restore_position SIGNAL desc_scan_before_restore_position_done WAIT_FOR continue";
send select id,length(val) from c order by id desc;
--connection default
set DEBUG_SYNC="now WAIT_FOR desc_scan_before_restore_position_done";
update c set val=repeat('b',6000) where id=2;
set DEBUG_SYNC="now SIGNAL continue";
--connection con1
reap;
set DEBUG="-d,desc_scan_debug";
--connection default
drop table c;
```
mtr's result
```
create table c(id int primary key, val varchar(8000));
insert into c values (1,repeat('a',3000));
insert into c values (2,repeat('a',3000));
insert into c values (3,repeat('a',3000));
insert into c values (4,repeat('a',3000));
insert into c values (5,repeat('a',3000));
insert into c values (-1,repeat('a',2000));
insert into c values (-2,repeat('a',2000));
insert into c values (-3,repeat('a',2000));
insert into c values (-4,repeat('a',2000));
insert into c values (-5,repeat('a',2000));
select id,length(val) from c order by id desc;
id	length(val)
5	3000
4	3000
3	3000
2	3000
1	3000
-1	2000
-2	2000
-3	2000
-4	2000
-5	2000
set DEBUG="+d,desc_scan_debug";
set DEBUG_SYNC="desc_scan_before_restore_position SIGNAL desc_scan_before_restore_position_done WAIT_FOR continue";
select id,length(val) from c order by id desc;
set DEBUG_SYNC="now WAIT_FOR desc_scan_before_restore_position_done";
update c set val=repeat('b',6000) where id=2;
set DEBUG_SYNC="now SIGNAL continue";
id	length(val)
5	3000
4	3000
3	3000
1	3000
-1	2000
-2	2000
-3	2000
-4	2000
-5	2000
set DEBUG="-d,desc_scan_debug";
drop table c;
```
The Record 2 lost in reverse scan.

Suggested fix:
If the last user record in the page is pessimistically updated, increase the modify_clock of the inserted page to prevent reverse scanning from optimistically accessing the page.
[5 Jul 8:28] MySQL Verification Team
Hello Zhang Ruyi,

Thank you for the report and test case.

regards,
Umesh
[11 Jul 14:37] Ruyi Zhang
In this scenario, we try to disable optimistic access to the page to avoid this problem.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: scan.patch (application/octet-stream, text), 4.95 KiB.