Bug #113751 | Non repeatable results in REPEATABLE-READ isolation level | ||
---|---|---|---|
Submitted: | 25 Jan 2024 7:03 | Modified: | 26 Jan 2024 10:40 |
Reporter: | Shlomi Noach (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S3 (Non-critical) |
Version: | 8.0 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | DML, locking, REPEATABLE-READ, transaction |
[25 Jan 2024 7:03]
Shlomi Noach
[25 Jan 2024 10:57]
MySQL Verification Team
Hi Mr. Noach, Thank you for our bug report. These are the results that we get: Connection 1: set transaction_isolation='REPEATABLE-READ'; Query OK, 0 rows affected (0.00 sec) drop table if exists some_table; Query OK, 0 rows affected, 1 warning (0.00 sec) create table some_table ( id bigint, col varchar(10), primary key (id), unique index(col) ); Query OK, 0 rows affected (0.01 sec) insert into some_table(id, col) values (3, null), (4, "5"); Query OK, 2 rows affected (0.02 sec) Records: 2 Duplicates: 0 Warnings: 0 start transaction; Query OK, 0 rows affected (0.00 sec) select col from some_table where col in (5) or id in (3) for update; +------+ | col | +------+ | NULL | | 5 | +------+ Connection 2: set transaction_isolation='REPEATABLE-READ'; Query OK, 0 rows affected (0.00 sec) start transaction; Query OK, 0 rows affected (0.00 sec) select col from some_table where col in (('5')) for update; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction start transaction; Query OK, 0 rows affected (0.00 sec) select col from some_table where col in (('5')) for update; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction rollback; Query OK, 0 rows affected (0.00 sec) select col from some_table where col in (('5')) for update; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction rollback; Query OK, 0 rows affected (0.00 sec) We did not manage to repeat your results with our 8.0.36 production binaries, downloaded from dev.mysql.com. We tried without re-starting transaction, without rollback and we still get "lock wait timeout". In any possible variant. Connection 1: ------------- delete from some_table where col in (5) or id in (3); Query OK, 2 rows affected (0.00 sec) insert into some_table(id, col) values (3, 5); Query OK, 1 row affected (0.00 sec) commit; Query OK, 0 rows affected (0.00 sec) Everything is OK here ...... Connection 2: -------------- mysql> select col from some_table where col in (('5')) for update; +------+ | col | +------+ | 5 | +------+ 1 row in set (0.01 sec) mysql> select col from some_table where col in (('5')) for update; +------+ | col | +------+ | 5 | +------+ 1 row in set (0.00 sec) Again, we are not able to repeat your results. Can't repeat.
[25 Jan 2024 11:14]
Shlomi Noach
I see. I suspect your lock wait timeout is pretty low. Could you try: - Increasing lock wait timeout - And running Connection1's delete+insert+commit all within the wait timeout margins, before Connection2 times out?
[25 Jan 2024 11:22]
MySQL Verification Team
Hi, Actually, no ....... Our innodb_lock_wait_timeout is at 100. And yes, we ran Connection1's delete+insert+commit all immediately one after the other (copy / paste) ... We ran them all within a single second. We also repeated the same procedure once again, with the same result. Can't repeat.
[25 Jan 2024 11:36]
Shlomi Noach
That is strange indeed. This reproduces for my right now on 8.0.32 on my laptop. I can test later on on 8.0.36. Otherwise I don't know why this would not reproduce for you.
[25 Jan 2024 21:56]
Jean-François Gagné
My result with 8.0.36 and scripting that makes timing somehow reproducible. As reported by Shlomi, the 1st select of t2 (t2a) returns nothing, but the 2nd (t2b) returns one row. dbdeployer deploy single mysql_8.0.36 ./use <<< " create database test_jfg; create table test_jfg.some_table ( id bigint, col varchar(10), primary key (id), unique index(col) ); insert into test_jfg.some_table(id, col) values (3, null), (4, 5);" ( ./use -N test_jfg <<< " start transaction; select 't1', now(), col from some_table where col in (5) or id in (3) for update; do sleep(3); delete from some_table where col in (5) or id in (3); insert into some_table(id, col) values (3, 5); select 't1 done', now(); commit;" & ./use -N test_jfg <<< " do sleep(1); set transaction_isolation='REPEATABLE-READ'; start transaction; select 't2 starting', now(); select 't2a', now(), col from some_table where col in (('5')) for update; select 't2b', now(), col from some_table where col in (('5')) for update; rollback;" ) t1 2024-01-25 21:55:41 NULL t1 2024-01-25 21:55:41 5 t2 starting 2024-01-25 21:55:42 t1 done 2024-01-25 21:55:44 t2b 2024-01-25 21:55:44 5
[25 Jan 2024 22:15]
Jean-François Gagné
Interestingly, when I change the query from "where col in (('5'))" to "where col in (5)", I get a result closer to what is expected. ./use <<< " drop database if exists test_jfg; create database test_jfg; create table test_jfg.some_table ( id bigint, col varchar(10), primary key (id), unique index(col) ); insert into test_jfg.some_table(id, col) values (3, null), (4, 5);" ( ./use -N test_jfg <<< " start transaction; select 't1', now(), col from some_table where col in (5) or id in (3) for update; do sleep(3); delete from some_table where col in (5) or id in (3); insert into some_table(id, col) values (3, 5); select 't1 done', now(); commit;" & ./use -N test_jfg <<< " do sleep(1); set transaction_isolation='REPEATABLE-READ'; start transaction; select 't2 starting', now(); select 't2a', now(), col from some_table where col in (5) for update; select 't2b', now(), col from some_table where col in (5) for update; rollback;" ) t1 2024-01-25 22:13:55 NULL t1 2024-01-25 22:13:55 5 t2 starting 2024-01-25 22:13:56 t1 done 2024-01-25 22:13:58 t2a 2024-01-25 22:13:56 5 t2b 2024-01-25 22:13:58 5
[26 Jan 2024 10:40]
MySQL Verification Team
Hi All, When I used the method as provided by Monsieur Gagne, I managed to reproduce the results as reported: t1 2024-01-26 11:48:14 NULL t1 2024-01-26 11:48:14 5 t2 starting 2024-01-26 11:48:16 t1 done 2024-01-26 11:48:19 t2b 2024-01-26 11:48:19 5 The other problem of using 5 instead of '5' should not be considered a bug, since using the wrong constant for the column type leads to more complex problems. Verified as reported.
[26 Jan 2024 15:48]
Richard Burnison
By modifying Jean-François Gagné's test case a bit, the same problem also shows up on an `UPDATE` statement. In the following script, a `version` column is added to the table. I believe that the expected results for t2 in all cases would be `col='5',version=1` followed by `col='5',version=2`, however, the bug yields `col='5',version=0` followed by `col='5',version=1`. ``` mysql -h 127.0.0.1 --port 35801 -u root <<< " drop database if exists test_jfg; create database test_jfg; create table test_jfg.some_table ( id bigint, col varchar(10), version int default 0, primary key (id), unique index(col) ); insert into test_jfg.some_table(id, col) values (3, null), (4, 5);" ( mysql -h 127.0.0.1 --port 35801 -u root test_jfg <<< " start transaction; select 't1', now(), col from some_table where col in ('5') or id in (3) for update; do sleep(3); delete from some_table where col in ('5') or id in (3); insert into some_table(id, col) values (3, '5'); select 't1 done', now(); commit;" & mysql -h 127.0.0.1 --port 35801 -u root test_jfg <<< " do sleep(1); set transaction_isolation='REPEATABLE-READ'; start transaction; select 't2 starting', now(); update some_table set version = 1 where col in ('5'); select 't2a', now(), col, version from some_table; update some_table set version = 2 where col in ('5'); select 't2b', now(), col, version from some_table; rollback;" ) ``` In this case, the expected result is ``` t1 now() col t1 2024-01-26 10:33:44 NULL t1 2024-01-26 10:33:44 5 t2 starting now() t2 starting 2024-01-26 10:33:45 t1 done now() t1 done 2024-01-26 10:33:47 t2a now() col version t2a 2024-01-26 10:33:47 5 1 t2b now() col version t2b 2024-01-26 10:33:47 5 2 ``` However, the actual result in MySQL 8.0.36 is ``` t1 now() col t1 2024-01-26 10:34:34 NULL t1 2024-01-26 10:34:34 5 t2 starting now() t2 starting 2024-01-26 10:34:35 t1 done now() t1 done 2024-01-26 10:34:37 t2a now() col version t2a 2024-01-26 10:34:37 5 0 t2b now() col version t2b 2024-01-26 10:34:37 5 2 ``` From my analysis of the code, it appears the problem stems from the lack of a gap lock on `col='5'`, which is allowing the `DELETE` and `INSERT` to insert `('5',3)` before the delete-marked record being found in transaction 2, `('5',4)`. In this fashion, changing the test case to use `id=5` also yields the expected result: ``` mysql -h 127.0.0.1 --port 35801 -u root <<< " drop database if exists test_jfg; create database test_jfg; create table test_jfg.some_table ( id bigint, col varchar(10), version int default 0, primary key (id), unique index(col) ); insert into test_jfg.some_table(id, col) values (3, null), (4, 5);" ( mysql -h 127.0.0.1 --port 35801 -u root test_jfg <<< " start transaction; select 't1', now(), col from some_table where col in ('5') or id in (3) for update; do sleep(3); delete from some_table where col in ('5') or id in (3); insert into some_table(id, col) values (5, '5'); select 't1 done', now(); commit;" & mysql -h 127.0.0.1 --port 35801 -u root test_jfg <<< " do sleep(1); set transaction_isolation='REPEATABLE-READ'; start transaction; select 't2 starting', now(); update some_table set version = 1 where col in ('5'); select 't2a', now(), col, version from some_table; update some_table set version = 2 where col in ('5'); select 't2b', now(), col, version from some_table; rollback;" ) ``` shows the expected results, ``` t1 now() col t1 2024-01-26 10:37:48 NULL t1 2024-01-26 10:37:48 5 t2 starting now() t2 starting 2024-01-26 10:37:49 t1 done now() t1 done 2024-01-26 10:37:51 t2a now() col version t2a 2024-01-26 10:37:51 5 1 t2b now() col version t2b 2024-01-26 10:37:52 5 2 ``` When stepping through a debugger, I'm led to believe that this is happening because of the `goto next_rec` in https://github.com/mysql/mysql-server/blob/49ef33f7edadef3ae04665e73d1babd40179a4f1/storag... skipping the delete-marked record and advancing the cursor _ahead_ to `('5',5)` rather than backwards to `('5',3)`.
[29 Jan 2024 10:53]
MySQL Verification Team
Thank you, Mr. Burnison ...... We managed to repeat your test case too .......... We shall copy it to your internal bugs database.