MySQL Bugs: #102752: plain SELECT reads different results in the same transaction

Bug #102752	plain SELECT reads different results in the same transaction
Submitted:	26 Feb 2021 13:40	Modified:	2 Mar 2021 2:47
Reporter:	Yushan ZHANG	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S3 (Non-critical)
Version:	8.0	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
From the documentation https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html we see:

.... Consistent reads within the same transaction read the snapshot established by the first read. This means that if you issue several plain (nonlocking) SELECT statements within the same transaction, these SELECT statements are consistent also with respect to each other. ....

However, I could get different read results of the same query:

mysql> select pk from t0;
+----+
| pk |
+----+
|  0 |
|  1 |
+----+
2 rows in set (0.00 sec)

---- two rows, but actually three rows
mysql> update t0 set col0 = 1 where pk > 0; 
Query OK, 2 rows affected (0.00 sec)
Rows matched: 2  Changed: 2  Warnings: 0

mysql> select pk from t0; ---- different?
+----+
| pk |
+----+
|  0 |
|  1 |
|  2 |
+----+
3 rows in set (0.00 sec)

mysql> select * from t0; -- ???
+----+------+------+
| pk | col0 | col1 |
+----+------+------+
|  0 |    1 |    2 |
|  1 |    1 |    3 |
|  2 |    1 |    5 |
+----+------+------+
3 rows in set (0.00 sec)

How to repeat:
drop table if exists t0;

create table t0 (
  pk int primary key, 
  col0 int, 
  col1 int
); 

insert into t0 values (0,1,2), (1,2,3);

---- T1
start transaction;
select pk from t0;
insert into t0 value (2,3,5);
commit;

---- T2
start transaction;
select pk from t0;
insert into t0 value (2,3,5);
select pk from t0;
update t0 set col0 = 1 where pk > 0;
select pk from t0; ---- different?
commit;

-------- how to manually reproduce --------

-- T1 --			-- T2 --

start transaction;		
				start transaction; 
select pk from t0;		
insert into t0 value (2,3,5);		
				select pk from t0; ---- first read
				insert into t0 value (2,3,5); ---- failed
commit;
				select pk from t0; ---- second read (same)
				update t0 set col0 = 1 where pk > 0; ---- also updated (2,3,5) to (2,1,5)
				select pk from t0; ---- third read (different!??)
				commit;

Forgot to attach my settings:

mysql> show variables like "%isolation%";                                                                                                    +-----------------------+-----------------+                                                                                                  | Variable_name         | Value           |                                                                                                  +-----------------------+-----------------+                                                                                                  | transaction_isolation | REPEATABLE-READ |                                                                                                  +-----------------------+-----------------+                                                                                                  1 row in set (0.00 sec)
                                                                                                                                                                                                                                                                mysql> show variables like "%autocommit%";                                                                                                   +---------------+-------+                                                                                                                    | Variable_name | Value |                                                                                                                    +---------------+-------+                                                                                                                    
| autocommit    | OFF   |
+---------------+-------+
1 row in set (0.01 sec)

mysql> \s
--------------
mysql  Ver 14.14 Distrib 5.7.31, for Linux (x86_64) using  EditLine wrapper

Connection id:          28199
Current database:       test
Current user:           zhangys@172.17.0.1
SSL:                    Cipher in use is ECDHE-RSA-AES128-GCM-SHA256
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         8.0.23 MySQL Community Server - GPL
Protocol version:       10
Connection:             172.17.0.2 via TCP/IP
Server characterset:    utf8mb4
Db     characterset:    utf8mb4
Client characterset:    utf8
Conn.  characterset:    utf8
TCP port:               3306
Uptime:                 4 days 9 hours 37 min 56 sec

Threads: 3  Questions: 5115326  Slow queries: 0  Opens: 232715  Flush tables: 3  Open tables: 297  Queries per second avg: 13.451

Hi Mr. ZHANG,

Thank you for your bug report.

I have tested your test case with both repeatable-read isolation level and with no auto-commit.

I have managed to repeat your results. It is possible that standard has changed meanwhile, but I am not aware of it. I repeated it with both 8.0.22 and 8.0.23.

Hence, I am verifying this bug as reported.

I do not understand why this was verified as this is not a bug.

If the update is done within the same transaction then this is completely expected behavior.

The snapshot is made at the start of the transaction, and remains "consistent" - that is, it reflects its state plus all change done within the transaction.

It is consistent because it remains isolated from any changes committed by any *other* transaction.

So if the update was done from another transaction, then the result would indicate inconsistency.

One could maybe say the documentation is not fully complete to explicitly state that changes in this transaction will be visible.

To expound on the comments made by Roland:
In a repeatable-read transaction, DML statements that lock rows always see the LATEST VERSION of a row.  And any rows modified by DML statements become visible to the transaction that modified them.  A subsequent transaction won't see those new values until the prior transaction commits.  If the subsequent transaction attempts to lock the rows modified by the prior uncommitted transaction (with a locking read or a data modification statement) that has modified them, they will block.  Non locking reads in the subsequent transaction will see the old version of the row (consistent with the start of the transaction snapshot) even after the original transaction commits, but if they attempt to update those rows (after commit of prior transaction, which will have released the locks) they will not block, and they will see the version of the rows as they exist after the update completes.

Some databases lock all rows read in a repeatable-read transaction, which prevents phantom reads, but InnoDB uses GAP LOCKS in order to prevent that from happening.

If one wants to avoid the above (completely sensible) behavior, one can choose to use the SERIALIZEABLE isolation level, which will convert all reads into SELECT .. LOCK IN SHARE MODE, similar to how those other databases treat REPEATABLE-READ.

Thus, you could say that writes update the snapshot, and writes see rows that have been committed by other transactions, because writes are not reads, and are not affected by snapshot isolation.  

Consider the alternative.  If the newest version of the row was not visible to an update, and another transaction updated an order from OPEN => CLOSED, and then you ran an update in another transaction to update all OPEN records, you would overwrite the changes from the prior transaction, and that would violate ACID.

This should, perhaps, be a documentation bug.  The documentation should note that CONSECUTIVE non-locking SELECT statements return consistent results.  Data changes made by DML that modifies data are reflected in result sets after that DML is executed.  That should be obvious, that changes made in a transaction are visible in that transaction.  What is perhaps not obvious is that since writes always see the most recent committed data, rows that are committed after the start of a transaction that would not have been visible earlier in the transaction may be modified by DML statements and those changed rows become visible to the transaction itself, even if they were not visible earlier in the transaction.  Writes always see the latest committed data.  Writes that attempt to modify uncommitted data are blocked by locks until the lock is released.

I also think it is not a bug. The root reason is the UPDATE operation is not a snapshot read to prevent the lost update problem. The UPDATE operation modifies the transaction id on the row in t2 so makes it visible in the third SELECT operation.

I think there are several points that should be clarified before further discussions.

I modified this test case from the example in Figure 4 from the paper [1]: Generalized Isolation Level Definitions, ICDE 2000.
In the paper, the author has a formal definition of the isolation levels of the ANSI/SQL-92 standard.

Point 1: What is the repeatable read?
The author called it PL2.99 in the paper. For this level, Phenomena G1, G2-item [2] should be disallowed.
As for an informal description of the PL 2.99 level in the new definition:
...T_i is completely isolated from other transactions with respect to data items and has PL-2 guarantees for predicate-based reads...
And for PL-2:
...T_i has only read the updates of transactions that have committed by the time T commits (along with PL-1 guarantees)
For PL-1:
...T_i's writes are completely isolated from the writes of other transactions
The important message in this test case’s context for PL-2.99:
...T_i is completely isolated from other transactions with respect to data items...

Point 2: Why the test case should be verified?
If you agree with the important message, then the test case is verified because `T_i is NOT completely isolated from other transactions with respect to data items`. For our test case, it means the `SELECT` results are inconsistent.

[1] Adya, A., Liskov, B., & O’Neil, P. (2000). Generalized isolation level definitions. Proceedings - International Conference on Data Engineering, (March), 67–78. https://doi.org/10.1109/icde.2000.839388

[2] H. Berenson, P. Bernstein, J. Gray, J. Melton, E. O’Neil, and P. O’Neil. A Critique of ANSI SQL Isolation Levels. In Proc. Of SIGMOD, San Jose, CA, May 1995

InnoDB uses MVCC modeled after Oracle.  DML other than non-locking select is ALWAYS READ-UNCOMMITTED.  If a write tries to modify UNCOMMITTED data, then the write blocks on a lock.  If the transaction holding the lock rolls back, the version of the row BEFORE the update is seen by the second transaction, if it was committed, then the new version is seen.  If a row was inserted then committed in transaction a, while transaction b is active, and transaction b updates the row, it becomes visible.

Repeatable-read is repeatable read after write.  If you don't have any writes in a transaction, all reads are consistent.  As soon as you update data, your transaction is repeatable-read consistent to you updates.  This is necessary because of CONSTRAINTS and for ACID compliance.

Consider:
create table t1(c1 int primary key, qty bigint) engine=innodb;

--trx1 (at time0)
start transaction;
insert into t1 values (1,1),(2,2),(3,1);

--trx2 (at time 1)
start transaction;
select * from t1;
  EMPTY SET
insert into t1 values (1,1) -- BLOCKS

--trx1 (at time 2)
rollback

--trx2 (at time 2 [it happens as soon as trx1 rolls back because lock is released]
1 ROW INSERTED

--trx3 (at time 3)
start transaction;
select * from t1;
  EMPTY SET

--trx2 (at time 4)
select * from t1;
  1,1
commit;

--trx3 (at time 5)
select * from t1;
  EMPTY SET
update t1 set qty = qty + 1;
  1 ROW AFFECTED
select * from t1;
  1,2

--trx4 (at time 6)
start transaction;
select * from t1
  EMPTY SET

--trx3 (at time 7)
commit;

--trx4 (at time 8)
select * from t1;
  EMPTY SET
update t1 set qty = qty + 1;
  1 ROW UPDATED
select * from t1;
  1,3
commit;

Again: writes are READ-UNCOMTTIED.  If they encounter data that is not committed yet, they block on locks UNLESS SKIP LOCKED is used.  Any rows that are updated get a new version with the transaction id of the transaction that wrote them.  Insertions on a row with a primary key will block.

Consider that the scenario above is changed slightly such that trx1 commits, instead of rolls back:
create table t1(c1 int primary key, qty bigint) engine=innodb;

--trx1 (at time0)
start transaction;
insert into t1 values (1,1),(2,2),(3,1);

--trx2 (at time 1)
start transaction;
select * from t1;
  EMPTY SET
insert into t1 values (1,1) -- BLOCKS

--trx1 (at time 2)
commit;

--trx2 (at time 2 [it happens as soon as trx1 rolls back because lock is released]
  DUPLICATE KEY ERROR!
select * from t1;
  EMPTY SET
insert into t1 values (2,7);
  DUPLICATE KEY ERROR
insert into t1 values (4,4);
  1 ROW INSERTED

--trx3 (at time 3)
start transaction;
update t1 set qty = qty +1;
  BLOCKS

--trx2
commit;

--trx3 (at time 3 because locks are released)
4 ROWS UPDATED!
select * from t1;
1,2
2,3
3,1
4,5

I wish one could edit comments.  The second example should say:
--trx2 (at time 4)
commit;

--trx3 (at time 4 because locks are released)
4 ROWS UPDATED!
select * from t1;
1,2
2,3
3,1
4,5

As for your comments about isolation, as demonstrated writes are isolated from READS until COMMIT happens.  They remain isolated from reads in another transaction as long as they are not UPDATED.  Repeateable-read is about READS not about WRITES.  Repeatable-read really means repeatable-read after write, which makes sense.  Constraints like primary key would produce duplicates otherwise in some cases, and ACID would break in other cases.

I really hope that is clear, and while the documentation may need to be updated, this is not a bug in repeatable-read.

To make the latter point clear:
-- trx1
start transaction;
select * from t1;
...
4,5

--trx 2
Select * from t1;
...
 4,5

--trx 1
update t1 set qty = qty + 1

-- trx2
select * from t1;
...
4,5
update t1 set qty = 0 where qty = 4
BLOCKS!

--trx1
commit;

--trx2
ZERO ROWS AFFECTED
commit;

If this behavior was not enforced, then the trx2 would update a row and the row it updated would change to qty 5 and the update from trx1 WOULD BE LOST and the commit would not be durable.

Sorry, I am tying to fast..  anyway
that should have said where set qty = 0 where qty = 5 

and it blocks...  if it did not block, then the qty=0 update would be lost if both transactions commit.

I hope this makes sense, and you can see the standard is not violated with respect to ISOLATED WRITES with respect to READS but WRITES ARE NEVER ISOLATED FROM EACH OTEHR and any rows that are updated become visible when they are changed.

If it is what is IMPLEMENTED in the InnoDB storage engine, the documentation is really misleading:https://dev.mysql.com/doc/refman/8.0/en/innodb-consistent-read.html

...The snapshot of the database state applies to SELECT statements within a transaction, not necessarily to DML statements. **If you insert or modify some rows and then commit that transaction, a DELETE or UPDATE statement issued from another concurrent REPEATABLE READ transaction could affect those just-committed rows, //even though the session could not query them.//** If a transaction does update or delete rows committed by a different transaction, those changes do become visible to the current transaction. For example, you might encounter a situation like the following:

SELECT COUNT(c1) FROM t1 WHERE c1 = 'xyz';
-- Returns 0: no rows match.
DELETE FROM t1 WHERE c1 = 'xyz';
-- Deletes several rows recently committed by other transaction.

SELECT COUNT(c2) FROM t1 WHERE c2 = 'abc';
-- Returns 0: no rows match.
UPDATE t1 SET c2 = 'cba' WHERE c2 = 'abc';
-- Affects 10 rows: another txn just committed 10 rows with 'abc' values.
SELECT COUNT(c2) FROM t1 WHERE c2 = 'cba';
-- Returns 10: this txn can now see the rows it just updated...

And this example used in the documentation is not clear enough to explain the case, because it only shows the second read with a new predicate which is not using the snapshot created by the first snapshot.

It is quite explicit.  It says exactly what I have said above, just more succinctly:
first about committed rows from other transactions being modifyable:
If you insert or modify some rows and then commit that transaction, a DELETE or UPDATE statement issued from another concurrent REPEATABLE READ transaction could affect those just-committed rows, even though the session could not query them.
Then it says once modified they become visible:
If a transaction does update or delete rows committed by a different transaction, those changes do become visible to the current transaction.

It even gives examples.  The first is a SELECT COUNT(*) that finds no rows, but it is assumed that another transaction committed rows into that able, so the subsequent DELETE actually deletes rows.

The second example, and update, uses the same empty table, then runs update against the empty table, and 10 rows are updated, because they have be committed into the table between the COUNT(*) and the UPDATE. While the second statement above clarifies that SELECT COUNT(*) will now return 10, it doesn't provide the explicit example.  Perhaps that text might be added to clarify things, as well as changing the wording above to say consecutive read dml.  It could also be noted that repeatable-read really means repeatable-read after writes.

Perhaps the phrase 'could not query them' is confusing you?  It means "could not see them in non-locking select query results because of the current database transaction snapshot, as they had not been committed when the transaction was started."

ahh:And this example used in the documentation is not clear enough to explain the case, because it only shows the second read with a new predicate which is not using the snapshot created by the first snapshot.

The update affected 10 rows.  It updated them from 'abc' to 'cba' without a where clause.  The count(*) query (I missed that) will return 10 rows, because the transaction changed all 10 rows (that were not visible at the start of the transaction) and now they all match the new predicate!  So the second example does show that the rows became visible because they were updated!

Given that you mentioned the predicate is different, you appear to also confused about the nature of the snapshot.  A snapshot is not created for each statement.  The snapshot is created when the transaction first accesses a table after START TRANSACTION is issued.  The snapshot is valid for all queries in that transaction, regardless of their predicates.  Changes made to rows during another transaction, which were not visible because of snapshot isolation, become visible and are "added to the snapshot' because each row has a transaction_id associated with it. When the row is updated, the newest version of the row had a transaction_id that is equal to the current transaction id. Snapshots that have to show consistent data (because the transaction_id is greater than the current transaction id,use old versions of rows stored in ROLLBACK segments.  It finds the visible version of the row in the rollback segment, which is the one with a transaction_id <= the current transaction_id.

Last comment: If you want different semantics, then you have to use LOCK TABLE [READ|WRITE].  That decreases concurrency but fully isolates reads from writes.

HI All,

Most likely, this will be converted to the documentation bug.

Hi guys,

This is an interesting discussion about transaction isolation implementation.

While I think this is not a bug *by product*, as there is no mismatch between InnoDB implementation and documentation.

There are 3 kinds of read in InnoDB:
1. Consistent nonlocking read, or snapshot read
2. Locking read, or current read
3. Semi-consistent (used only for udpate under Read Commited, we can ignore it in this discussion)

The doc https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html says very clear that update will use *locking reads*. So update can see/feel other concurrent transactions' updates.

This update behaviour is implementation choice, and vary between RDBMS product. For example, PostgreSQL choose a totally different path:

> UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time.

https://www.postgresql.org/docs/13/transaction-iso.html#XACT-READ-COMMITTED

In PostgreSQL, the update also uses snapshot read to find target rows. So the newly inserted row by other concurrent transaction is invisible to current update.

This is how PostgreSQL works using the test case, I tested with current latest version (13.2):

=== prepare data ====

fungo=> create table t0 (
fungo(>   pk int primary key,
fungo(>   col0 int,
fungo(>   col1 int
fungo(> );
CREATE TABLE
fungo=> insert into t0 values (0,1,2), (1,2,3);
INSERT 0 2

==== transaction 1==

fungo=> begin transaction isolation level repeatable read;
BEGIN
fungo=*> select * from t0;
 pk | col0 | col1
----+------+------
  0 |    1 |    2
  1 |    2 |    3
(2 rows)

fungo=*> insert into t0 values (2,3,5);
INSERT 0 1
fungo=*> select * from t0;
 pk | col0 | col1
----+------+------
  0 |    1 |    2
  1 |    2 |    3
  2 |    3 |    5
(3 rows)

==== Then start transaction 2 =====

fungo=> begin transaction isolation level repeatable read;
BEGIN
fungo=*>  select * from t0;
 pk | col0 | col1
----+------+------
  0 |    1 |    2
  1 |    2 |    3
(2 rows)

fungo=*> update t0 set col0 = 1 where pk > 0; // there is no block, cause pk = 3 is invisble
UPDATE 1
fungo=*>  select * from t0;
 pk | col0 | col1
----+------+------
  0 |    1 |    2
  1 |    1 |    3
(2 rows)

And Oracle also use snapshot read (although there is some diffrence) for update, and will give similar result like PostgreSQL.

But whether this is a bug or not *by theory* or *by standard*, is not that clear... 
As the *ANSI/SQL-92 standard* for transaction isolation definfication is ambiguous, and leave too much space for vary implementations.

Locking reads are reads in read-committed where shared or exclusive locks may be set on row reads. 

SELECT ... LOCK IN SHARE MODE;
SELECT ... FOR UPDATE;

The first sets shared locks (read locks) and the second sets exclusive locks (write locks).  That is why the documentation specifically says 'non locking selects'.  

I swear there must be some great cosmic joke at play here.

Hi Justin,

> There are 3 kinds of read in InnoDB:
> 1. Consistent nonlocking read, or snapshot read
> 2. Locking read, or current read
> 3. Semi-consistent (used only for udpate under Read Commited, we can ignore it in this discussion)

> The doc https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html says very clear that update will use *locking reads*. So update can see/feel other concurrent transactions' updates.

I'm sorry for the above *locking reads* explaination, *locking read* in MySQL documentation is only for SELECT sql, as you said, thanks for correcting me :)

> says very clear that update will use *locking reads*. So update can see/feel other concurrent transactions' updates.

This is my mis-interpretation of the documentation :(

But the actual InnoDB implementation is like I explained, for *locking read* or update, MVCC is not taken into account, InnoDB behaves like there is only *one version* (the latest/current version).  So let's expand the definition *Locking read* to include update for a moment :)

And indeed, the current documentation need be maken more clear about this case.

Just came across an old bug report #102752, look like this isolation issue has been reported and discussed more than 10 years ago :)

I guess Kevin's (InnoDB developer) reply https://bugs.mysql.com/bug.php?id=57973#c403965 can also be applied to this bug report.

This could be thought as an official explanation for the InnoDB RR implementation regarding updates behavior.

BTW, the "WRITE COMMITTED version of REPEATABLE READ isolation level" statement is interesting :)

The bug num in my last reply should be #57973, got messed up with bug thread...