Bug #102913 REPEATABLE-READ: UPDATE EVENTS NOT MEET 'CURRENT-READ' RULE
Submitted: 11 Mar 2021 3:06 Modified: 12 Mar 2021 4:29
Reporter: Mips Lee Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0.18 OS:Any
Assigned to: CPU Architecture:Any
Tags: REPEATABLE-READ TRANSACTION

[11 Mar 2021 3:06] Mips Lee
Description:
Mysql is at the 'REPEATABLE-READ' isolation level: 
It is no problem to meet the read snapshots when it is not update or not lock (lock in share mode / for update);  if the committed transaction is started later than the current session transaction,   in the case of update and lock,  if you update to the committed row, you must follow the 'current-read' rules, but from the above example, if my update operation does not actually change the data row, it will only lock meeting condition rows at this time, but the current reading rules will not be met. In this case, unexpected data problems may occur in some cases, so I believe it is a bug.

How to repeat:
CREATE TABLE `xiaopangtest` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(30) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '',
  `age` int(10) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

INSERT INTO `xiaopangtest` (`id`, `name`, `age`)
VALUES (1, 'this is a name', 1);

# This table only has one row data : (1,'this is a name',0) corresponding column: (id(pk),name,age)

#session A.  Suppose the id of this transaction A started is 100
set session transaction_isolation = 'repeatable-read';

start transaction with consistent snapshot; # begin now

select name from xiaopangtest where id = 1; # result: this is a name 

#------- session B(tx_id = 103) auto-commit=1, execute SQL 'update xiaopangtest set name = 'is there a bug?' where id =1;', we can see update column `name` value -> 'is there a bug?'

select name from xiaopangtest where id = 1; # result: this is a name, expected.

update xiaopangtest set age = 0 where id = 1; # did not actually change the data, add a row-lock where id =1, expected.

select name from xiaopangtest where id = 1; # result: 'this is a name', unexpected!  Shouldn't the update operation be the 'current-read' at the repeatable-read isolation level? so I think it's should be gotted

update xiaopangtest set age = 1 where id = 1; # actually change the data, add a row-lock

select name from xiaopangtest where id = 1; # result: is there a bug? expected. Meet 'current-read' rule.

commit;

Suggested fix:
I suggest that even if the value of the data row has not actually changed under the update statement, the current read rules should be met, because you have already added row locks to the data.
[11 Mar 2021 13:27] MySQL Verification Team
Hi Mr. Lee,

Thank you for your bug report.

Your report is a duplicate of the following one:

https://bugs.mysql.com/bug.php?id=102752

If you disagree with this, then first, test with our latest 8.0 release and then explain your problem in much more clear wording and full test case.

Duplicate.
[11 Mar 2021 14:23] Mips Lee
I don't think it's similar to "https://bugs.mysql.com/bug.php?id=102752". My core opinion is :
The update statement updates a row of data during the update process, but since the updated value is equal to the current value, the number of affected rows is 0. But row locks still exist, so I think this is a semantic error.
[11 Mar 2021 14:38] MySQL Verification Team
Hi,

This is expected behaviour.

Row lock MUST exist, since it is taken before reading the row.

Not a bug.
[11 Mar 2021 15:11] Mips Lee
Hi friends, I understand what you mean. I didn't express clearly what I meant just now. At the repeatable-read level, each unlocked select statement determines the data rows visible in the current view based on the transaction id array at startup. What I want to express is: when update sql is updating met condition rows, it will add row locks, read the newest row-transaction-id version data, and update them. But in my example above, sessionA starts first, sessionB After starting, and updating the  field  "name"  by id=1 result : affected rows 1 , and finally submitting the transaction, then sessionA updated the age field of the row data according to the condition of id=1, but because the updated value is the same as the old value, SessionA The update result shows affected 0 rows. At this time, sessionA has the row lock of row id=1, so I think in the next select statement of sessionA, query the row with id=1 again, it should be able to see the modified name filed value of sessionB  submitted, but in fact it is not. If sessionA updates the age field, the updated value is different from the old value. When affected rows =1, the select statement can query the updated name value of sessionB. This is in line with our expectations, but when affected 0 rows, it does not meet my expectations~
[11 Mar 2021 17:57] Yang Sk
Tested with mysql-8.0.23

How to reproduce:

create table t(
pk int primary key,
col0 int,
col1 int
);

insert into t values (1,1,1);

-- T1 --			-- T2 --

begin;		
select * from t where pk=1;
				begin;	
				select * from t where pk=1; -- 1 1 1	
update t set col0=2 where pk=1;		
				update t set col1=1 where pk=1;
commit;
				select * from t where pk=1; -- 1 1 1
				commit;
				select * from t where pk=1; -- 1 2 1    

After T1 commits, the update statement made by T2 holds the lock on pk=1 and update(0 rows affected), but can't see the LATEST VERSION of the row(col0 = 2)

Mentioned by Justin Swanhart in https://bugs.mysql.com/bug.php?id=102752 "In a repeatable-read transaction, DML statements that lock rows always see the LATEST VERSION of a row.  And any rows modified by DML statements become visible to the transaction that modified them.". 

If what he said is accurate, T2 should see the LATEST VERSION of the row where col0 = 2, right?
[12 Mar 2021 2:02] Sunny Bains
REPEATABLE-READ is one of those things where the exact semantics are mostly implementation dependent. IInnoDB itself has historically claimed its REPEATABLE-READ as "OracleDB like". That's about as precise as it gets at the high level.

What Justin describes is correct but there is an exception. The exception is due to the mechanics under the hood.

 1. Server first does a read of the row from the table
 2. Server then updates the columns in the row and sends an update ro request to the SE (InnoDB)
 3. InnoDB then calculates the difference in the columns and creates an update delta, in the case for T2 it's empty.
 4. InnoDB then locks the table i IX mode etc. etc.
 5. There is nothing to apply in the delta and so it returns (it doesn't re-read the latest contents of the row).
 
In the following select before the commit you will see the same data because InnoDB hasn't actually read the latest data in step #5.
[12 Mar 2021 4:29] Mips Lee
What Sunny Bains said made me understand :

There is a mechanism of "creates an update delta" when InnoDB is updating.

But because of this mechanism, it does not meet our expectations when the number of updated rows is 0. And due to past experience, we thought we could read the latest submitted row.
[15 Mar 2021 6:04] Debarun Banerjee
Looks like the confusion is about

"locking the row"  Vs "updating the row"

[1] doesn't necessarily imply [2] and in the bug example it only acquires lock and that is not enough for the row to be visible in read-only statements in RR isolation currently in Innodb. 

It is quite inline with the comments from other bug ...

"Mentioned by Justin Swanhart in https://bugs.mysql.com/bug.php?id=102752 ..."

In a repeatable-read transaction, 

1. DML statements that lock rows always see the LATEST VERSION of a row.  

2. Any rows modified by DML statements become visible to the transaction that modified them.

[1] is not satisfied in this case because the statement in question is "read-only" and doesn't lock the row.

[2] is also not satisfied because the row is not modified.

The root cause is the base anomaly of our RR isolation that the "read-only" and "read-write" statements could read different data. e.g. "select for update" Vs "select".

--------------------------------------------------------------
mysql> create table t2(col1 int primary key, col2 int);
mysql> insert into t2 values(1, 10);

Trx-1: 
mysql>begin;
mysql> select * from t2;     // Creates the snapshot
+------+------+
| col1 | col2 |
+------+------+
|    1 |   10 |
+------+------+
1 row in set (0.00 sec)

Trx-2
------
mysql> update t2 set col2 = 100 where col1 = 1;  // 1,100

Trx-1
------
mysql> select * from t2 for update; // Locks the row and see latest
+------+------+
| col1 | col2 |
+------+------+
|    1 |  100 |
+------+------+
1 row in set (0.00 sec)

mysql> select * from t2;  // Still reads old row version although 
                          // the row is locked. 
+------+------+
| col1 | col2 |
+------+------+
|    1 |   10 |
+------+------+
1 row in set (0.00 sec)

mysql> update t2 set col2 = 200 where col1 = 1;  // 1,200

mysql> select * from t2;  // Sees the latest change as the row
                          // is modified by current transaction.
+------+------+
| col1 | col2 |
+------+------+
|    1 |  200 |
+------+------+
1 row in set (0.00 sec)