Bug #115995 | Inconsistent database states at the serializable isolation level | ||
---|---|---|---|
Submitted: | 4 Sep 2024 7:36 | Modified: | 7 Sep 2024 7:11 |
Reporter: | Huicong Xu | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 8.0.36 | OS: | Ubuntu (22.04) |
Assigned to: | CPU Architecture: | Any |
[4 Sep 2024 7:36]
Huicong Xu
[4 Sep 2024 14:57]
MySQL Verification Team
Thank you for the report
[4 Sep 2024 15:01]
MySQL Verification Team
Bug #115978 is marked as duplicate of this one
[4 Sep 2024 16:04]
Jakub Lopuszanski
You did not specify any ORDER BY clause for you SELECTs so the DB is free to choose any 2 rows. Can you reproduce the same scenarios with: a) adding ORDER BY c0 b) defining c0 to be the PRIMARY KEY If not, then it's not a bug, IMHO
[5 Sep 2024 4:14]
Huicong Xu
Thanks for your suggestion. According to the SQL standard, the serializable isolation level does not require any ORDER BY clause for SELECTs. But our results do violate the serializable isolation level. I want to know what is the correct execution result of this statement when we are not using ORDER BY clause. After all, this statement is able to execute. And in my testing, I found that the order of the read set is in insertion order and not a free selection. Thank you.
[5 Sep 2024 8:14]
Jakub Lopuszanski
The execution which you see is, IMO, perfectly consistent with the serialization order in which which T2 happens before T1. T2 inserts 2 rows. Then T1 inserts 1 more row (so the table has 3 rows) and performs SELECT with LIMIT 2 which doesn't specify the ORDER BY, so the query can be satisfied by returning any 2 of the 3 rows. It happened to chose one row from each of these INSERTs.
[6 Sep 2024 7:10]
Huicong Xu
These days, I have tested other possible scenarios for this test case and the problem still exists after using the primary key and ORDER BY. Here's my test case. /* init */ DROP TABLE t1, t2; CREATE TABLE t1 (c0 INT); CREATE TABLE t2 (c0 INT PRIMARY KEY AUTO_INCREMENT, c1 INT); INSERT INTO t1 VALUES (1),(2),(3); /* T1 */BEGIN; /* T2 */BEGIN; /* T1 */INSERT INTO t2(c1) VALUES(1); /* T2 */INSERT INTO t2(c1) VALUES(2), (3); /* T2 */COMMIT; /* T1 */UPDATE t1 SET c0 = 10 WHERE c0 in (SELECT c1 FROM (SELECT * FROM t2 LIMIT 2) AS t); /* T1 */COMMIT; /* Finally database state */ mysql> SELECT * FROM t1; +------+ | c0 | +------+ | 10 | | 10 | | 3 | +------+ 3 rows in set (0.00 sec) This case has two serial execution results, T1 ahead or T2 ahead. When the execution order is T1 and T2, the result is as follows. /* init */ DROP TABLE t1, t2; CREATE TABLE t1 (c0 INT); CREATE TABLE t2 (c0 INT PRIMARY KEY AUTO_INCREMENT, c1 INT); INSERT INTO t1 VALUES (1),(2),(3); /* T1 */BEGIN; /* T1 */INSERT INTO t2(c1) VALUES(1); /* T1 */UPDATE t1 SET c0 = 10 WHERE c0 in (SELECT c1 FROM (SELECT * FROM t2 ORDER BY c0 LIMIT 2) AS t); /* T1 */COMMIT; /* T2 */BEGIN; /* T2 */INSERT INTO t2(c1) VALUES(2), (3); /* T2 */COMMIT; /* Finally database state */ mysql> SELECT * FROM t1; +------+ | c0 | +------+ | 10 | | 2 | | 3 | +------+ 3 rows in set (0.00 sec) When the execution order is T2 and T1, the result is as follows. /* init */ DROP TABLE t1, t2; CREATE TABLE t1 (c0 INT); CREATE TABLE t2 (c0 INT PRIMARY KEY AUTO_INCREMENT, c1 INT); INSERT INTO t1 VALUES (1),(2),(3); /* T2 */BEGIN; /* T2 */INSERT INTO t2(c1) VALUES(2), (3); /* T2 */COMMIT; /* T1 */BEGIN; /* T1 */INSERT INTO t2(c1) VALUES(1); /* T1 */UPDATE t1 SET c0 = 10 WHERE c0 in (SELECT c1 FROM (SELECT * FROM t2 ORDER BY c0 LIMIT 2) AS t); /* T1 */COMMIT; mysql> SELECT * FROM t1; +------+ | c0 | +------+ | 1 | | 10 | | 10 | +------+ 3 rows in set (0.00 sec)
[6 Sep 2024 15:22]
Jakub Lopuszanski
your test case lacks ORDER BY: /* T1 */UPDATE t1 SET c0 = 10 WHERE c0 in (SELECT c1 FROM (SELECT * FROM t2 LIMIT 2) AS t);
[7 Sep 2024 7:11]
Huicong Xu
I'm very sorry. It was a clerical error. When I use this statement (/* T1 */UPDATE t1 SET c0 = 10 WHERE c0 in (SELECT c1 FROM (SELECT * FROM t2 ORDER BY c0 LIMIT 2) AS t);), the error still occurs. /* init */ DROP TABLE t1, t2; CREATE TABLE t1 (c0 INT); CREATE TABLE t2 (c0 INT PRIMARY KEY AUTO_INCREMENT, c1 INT); INSERT INTO t1 VALUES (1),(2),(3); /* T1 */BEGIN; /* T2 */BEGIN; /* T1 */INSERT INTO t2(c1) VALUES(1); /* T2 */INSERT INTO t2(c1) VALUES(2), (3); /* T2 */COMMIT; /* T1 */UPDATE t1 SET c0 = 10 WHERE c0 in (SELECT c1 FROM (SELECT * FROM t2 ORDER BY c0 LIMIT 2) AS t); /* T1 */COMMIT; /* Finally database state */ mysql> SELECT * FROM t1; +------+ | c0 | +------+ | 10 | | 10 | | 3 | +------+ 3 rows in set (0.00 sec)
[9 Sep 2024 8:55]
Jakub Lopuszanski
Posted by developer: You indeed added an ORDER BY clause, but instead of `ORDER BY c1` which would make it depend on the deterministic values the user provided, you've used `ORDER BY c0` which sorts by AUTO INCREMENT values assigned by InnoDB logic. InnoDB has freedom to assign these values in any way as long as they are unique. In particular if you interleave two transactions which generate AUTO INCREMENT assigned values, then it is well known, that the assigned values might interleave in a way which can not be explained by any serialization of the two transactions. This is the trade-off of making auto increment work fast. Unfortunately, even with --innodb_autoinc_lock_mode=0 the AUTO_INC lock is released at the end of statement (as opposed to: at the end of transaction), so it will not help you. The only way to make auto increment values "serializable" would be to keep the lock on the generator of auto increment values until a transaction finishes, which would then lead to people reporting a scalability/performance bug that two concurrent inserts have to wait for each other (which I know, because people report it anyway for particular subscenarios in which they get blocked on GAP lock at the end of the table). Basically, you are better of as a designer of application if you treat auto increment values as "random but unique". See https://dev.mysql.com/doc/refman/8.0/en/innodb-auto-increment-handling.html Note, that you can trivially construct "non-serializable examples" by using non-deterministic functions like RAND() or NOW(), using them as ORDER BY criteria. This is how I perceive sorting by auto increment values, too. Now, observe that if you change it to `ORDER BY c1` then you still get: ``` mysql> SELECT * FROM t1; +------+ | c0 | +------+ | 10 | | 10 | | 3 | +------+ ``` but now it is equivalent to the sequence of T2 before T1. And this was my point: that you first need to make sure your queries are deterministic, only then it is worth discussing serialization order.