Bug #57973 UPDATE tries to lock rows not visible to read view
Submitted: 4 Nov 2010 11:37 Modified: 16 Aug 2013 19:25
Reporter: Stewart Smith Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server: InnoDB Plugin storage engine Severity:S2 (Serious)
Version:1.0.7, 1.0.13 OS:Any
Assigned to: Kevin Lewis CPU Architecture:Any

[4 Nov 2010 11:37] Stewart Smith
Description:
an UPDATE statement in the default isolation level (repeatable read) will try to get locks on rows that aren't visible (and never will be) to the transaction it's in.

Included below is a mysqltest script. the last UPDATE statement will hit a lock wait timeout.

This seems to be an unintended side effect of the following fix:

revid:marko.makela@oracle.com-20100426102725-as2vc44ddykg1786
  
Authors: Marko Makela <marko.makela@oracle.com>
commit message:
  row_search_for_mysql(): Never try semi-consistent read in unique searches.
  They are only useful in table scans. (Bug #52663)

How to repeat:
connect (a,localhost,root,,);
connect (b,localhost,root,,);
connection a;
#set storage_engine=storage_engine_api_tester;
CREATE TABLE `t1` (id serial,intcol1 INT ,intcol2 INT ,charcol1 VARCHAR(128),charcol2 VARCHAR(128),charcol3 VARCHAR(128));
drop table t1;
set autocommit=0;
CREATE TABLE `t1` (id serial,intcol1 INT ,intcol2 INT ,charcol1 VARCHAR(128),charcol2 VARCHAR(128),charcol3 VARCHAR(128));
INSERT INTO t1 VALUES (NULL,1804289383,846930886,'xvtvmC9127qJNm06sGB8R92q2j7vTiiITRDGXM9ZLzkdekbWtmXKwZ2qG1llkRw5m9DHOFilEREk3q7oce8O3BEJC0woJsm6uzFAEynLH2xCsw1KQ1lT4zg9rdxBLb','97RGHZ65mNzkSrYT3zWoSbg9cNePQr1bzSk81qDgE4Oanw3rnPfGsBHSbnu1evTdFDe83ro9w4jjteQg4yoo9xHck3WNqzs54W5zEm92ikdRF48B2oz3m8gMBAl11W','y50w46i58Giekxik0cYzfA8BZBLADEg3JhzGfZDoqvQQk0Akcic7lcJInYSsf9wqin6LDC1vzJLkJXKn5onqOy04MTw1WksCYqPl2Jg2eteqOqTLfGCvE4zTZwWvgM');
INSERT INTO t1 VALUES (NULL,105575579,964445884,'DPh7kD1E6f4MMQk1ioopsoIIcoD83DD8Wu7689K6oHTAjD3Hts6lYGv8x9G0EL0k87q8G2ExJjz2o3KhnIJBbEJYFROTpO5pNvxgyBT9nSCbNO9AiKL9QYhi0x3hL9','WlwRHuWm4HE8leYmg66uGYIp6AnAr0BDd7YmuvYqCfqp9EbhKZRSymA4wx6gpHlJHI53DetH9j7Ixar90Jey5outd1ZIAJdJTjMaD7rMiqYXHFhHaB7Xr1HKuqe51G','GMrfw4egILWisfxPwOc3nJx4frnAwgI539kr5EXFbupSZelM2MHqZEmD6ZNuEZzHib8fqYuHQbdrDND8lXqIdcNbAeWOBLZlpZOX5AoNlQFzpK7QjxcLP0wbWIriYG');
INSERT INTO t1 VALUES (NULL,157272379,1104627321,'qksnug3YyANnWWDEJiRkiFC4a3e6KyJ2i3hSjksiuFLHlRXw9yhjDtnfoQd0OouyrcIbCB9zQWG4pf0yTZhaIT67nj7BY21FWJqaWrZxEh13Kt2hRbGl4MsrxsuLmv','d8FJid3GaHpRC2L6jgirPm5AW3uGGgCloJ5Ww0eNHSiLWvS5bAxto23AxxR6TXr9qofeoAtxWcJsXnxzxmsdhvoekFc5mSES8tyxvsuPK5Hjs7ihtaJaLz5xEh2s1G','CA22zxutF6rOqjXYHHzSrKRwAhWCPXTdhNXYKQIRO9sEkFf1YeTGqw40Ta5u6QNfpvC1DWTTXDkFSFHtte9bbDSwgZjmryHglLhqjAKEF4MkJfT49eXcjzZNOG1F6B');
INSERT INTO t1 VALUES (NULL,530788967,773446912,'YTi8X2EnycNH7sDHMltxcILtQE0ZPoPq9zyg24J0hiAgQNpg8jedtrWK5WtXIALR9B03FJ4ou6TCTAtWtN7fETzBzkiAmvTv6LrEZn2RtNfMaOkJfjytCp54ZfEJbb','7ZpqnMdipW5KkXdTjGCh2PNzLoeR0527frpQDQ8uw67Ydk1K06uuNHtkxYBxT5w8plb2BbpzhwYBgPNYX9RmICWGkZD6fAESvhMzH3yqzMtXoH4BQNylbK1CmEIPGY','lC6zBN3152Gza4GW7atxJKACYwJqDbFynLxqc0kh30YTwgz3FktQ43XTrqJ4PQ25frn7kXhfXD8RuzN1j8Rf3y8ugKy6es3IbqPJM6ylCyD6xS7YcQCfHKZxYNvB7y');
INSERT INTO t1 VALUES (NULL,1946710049,599529154,'hmu3lkoxjtvgLu5xKHSTTtJuGE5F5QqmCcppCTmvFZScRZQgim93gSxwb24gKmIPEzEQStMjQiCu7WapGbkw4ilXch3xRLMhKSzgLDOovSi2qGj6rKvnuYAWDDJgaZ','Du24bNIrBDBl81tjzdvuOpQRCXgX37xGtzLKEXBIcE3k7xK7aFtqxC99jqYnpTviK83bf6lGDgsKd4R3KLmHPnI8TqnIKj1gjw7N2sXFZNS2Svyg8cpZN7atxL39w4','igspvnaQuzhMt1IrZIJMkbLAKBNNKKK2sCknzI5uHeGAgQuDd5SLgpN0smODyc7qorTo1QaI5qLl97qmCIzl0Mds81x7TxpIoJyqlY0iEDRNKA1PS0AKEn5NhuMAr3');
INSERT INTO t1 VALUES (NULL,203845520,2143493320,'EIMsqMa5SuKo4M5OM7ldvisSc6WK9rsG9E8sSixocHdgfa5uiiNTGFxkDJ4EAwWC2e4NL1BpAgWiFRcp1zIH6F1BayPdmwphatwnmzdwgzWnQ6SRxmcvtd6JRYwEKd','vuWri8qB5MAR1caTvNFXB00HKShGa1uToGCMP2ZMHjpABFg6fnTioTz8pZGNQAkEJwckr9y0kfAvvq1QWM8IkzIZOaprdPxwMaY5bW5C977wy1MKNMe7EwZeWET7fR','v1tRyKpvYr3YntZ2DoGrwWfL91bW9Epw8iO6vDuR4xrkqOe3Dum1PPEQQpwRvmO3Kg2FtobqlQQRj4Woq8hFZvhrkWBX8bRSFLc09BiuT0L44jsvrZ2GKczwyAS6dv');
INSERT INTO t1 VALUES (NULL,1367743294,441971924,'GAlPDJBbKOfG7O3qp66Aa5yY8QelD4WL77eKIpneDJnCbq2Oo0QM6Pm6jOrxLDASCpgXWS1cnooMXjeJj5ahMCnxTXuqkxIxmyK8QLk52PJZ8ykjvAP9N46sbtiKq0','9SqTcHiLEkea9eOuLJweKJjQ3IJlW1dXTSs6dbJjJpHTRZFgkBJ4XuNQ4iCRchy51r3WQEEdvNyMxDvZHEeg0164bINBsM9l54GNBM06lrTa4G8LWe1Mf8I6IiA24J','o1FwOQIGZv7uM4AMHY0BmXJ4TTmfOiqCj68zYIGxDFjbJlpd9pgvCr0iKmDcT4o4Ao4ayCOCR7fftT8uAoqFfqprS2Sx68u8pyjnA7rT6N0rk8mmwc1RIiJmlfkjne');
select id from t1;
connection b;
set autocommit=0;
select id from t1;
update t1 set intcol1=42, intcol2=44 where id=7;

Suggested fix:
not try to get the locks.
[8 Nov 2010 18:12] Valeriy Kravchuk
Verified with current 5.1.53 from bzr (mysql-5.1 tree) with test case provided, as follows:

macbook-pro:mysql-test openxs$ ./mtr --mysqld=--ignore-builtin-innodb --mysqld=--plugin-load=innodb=ha_innodb_plugin.so --mysqld=--default-storage-engine=innodb bug57973
Logging: ./mtr  --mysqld=--ignore-builtin-innodb --mysqld=--plugin-load=innodb=ha_innodb_plugin.so --mysqld=--default-storage-engine=innodb bug57973
101108 20:07:51 [Warning] Setting lower_case_table_names=2 because file system for /var/folders/dX/dXCzvuSlHX4Op1g-o1jIWk+++TI/-Tmp-/tSp61kX06p/ is case insensitive
101108 20:07:51 [Note] Plugin 'FEDERATED' is disabled.
101108 20:07:51 [Note] Plugin 'ndbcluster' is disabled.
MySQL Version 5.1.53
Using default engine 'innodb'
Checking supported features...
 - skipping ndbcluster
 - SSL connections supported
 - binaries are debug compiled
Collecting tests...
vardir: /Users/openxs/dbs/5.1/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/Users/openxs/dbs/5.1/mysql-test/var'...
Installing system database...
Using server port 60617

==============================================================================

TEST                                      RESULT   TIME (ms)
------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
main.bug57973                            [ fail ]
        Test ended at 2010-11-08 20:08:46

CURRENT_TEST: main.bug57973
mysqltest: At line 29: query 'update t1 set intcol1=42, intcol2=44 where id=7' failed: 1205: Lock wait timeout exceeded; try restarting transaction

As one can see version 1.0.13 of plugin is used:

macbook-pro:mysql-test openxs$ cat var/log/mysqld.1.err 
CURRENT_TEST: main.bug57973
101108 21:07:54 [Warning] '--default-character-set' is deprecated and will be removed in a future release. Please use '--character-set-server' instead.
101108 21:07:54 [Warning] Setting lower_case_table_names=2 because file system for /Users/openxs/dbs/5.1/mysql-test/var/mysqld.1/data/ is case insensitive
101108 21:07:54 [Note] Plugin 'FEDERATED' is disabled.
101108 21:07:54 [Note] Plugin 'ndbcluster' is disabled.
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins
InnoDB: Compressed tables use zlib 1.2.3
InnoDB: The first specified data file ./ibdata1 did not exist:
InnoDB: a new database to be created!
101108 21:07:54  InnoDB: Setting file ./ibdata1 size to 10 MB
InnoDB: Database physically writes the file full: wait...
101108 21:07:54  InnoDB: Log file ./ib_logfile0 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile0 size to 5 MB
InnoDB: Database physically writes the file full: wait...
101108 21:07:55  InnoDB: Log file ./ib_logfile1 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile1 size to 5 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
101108 21:07:55 InnoDB Plugin 1.0.13 started; log sequence number 0
101108 21:07:55 [Note] Event Scheduler: Loaded 0 events
101108 21:07:55 [Note] /Users/openxs/dbs/5.1/libexec/mysqld: ready for connections.
Version: '5.1.53-debug-log'  socket: '/Users/openxs/dbs/5.1/mysql-test/var/tmp/mysqld.1.sock'  port: 13000  Source distribution
[10 Nov 2010 13:53] Marko Mäkelä
Sorry, what is the bug? The table has rows id=1 .. id=7. The table has a primary key on id (an implicit conversion caused by the 'serial' data type, it seems). One row matches the UPDATE condition. The two UPDATEs must lock out each other.
[10 Nov 2010 14:00] Marko Mäkelä
I tried a variation of the test, which looks correct as well:

-- table t1 is empty
connection 1> begin; update t1 set intcol1=42, intcol2=44 where id=7;
connection 2> begin; insert into t1 values(...);

Connection 2 would block, because connection 1 obtains a gap lock that happens to lock the entire table. (The upper end of the gap would be the successor of the record id=7, which is the B-tree root page supremum in this case, and the lower end of the gap would be the requested record (not inclusive), or its predecessor if the requested record is not found.)
[11 Nov 2010 12:05] Stewart Smith
with my sequence of events, the insert happens before, with the UPDATE starting after the rows have been inserted (but not committed).

so the UPDATE is "finding" rows (to lock, not to update) that it will never see in its read view (as they are not committed).

So you have decreased concurrency at least because the UPDATE will wait for locks for rows it will never see.
[16 Mar 2011 13:44] Marko Mäkelä
Stewart, sorry that I did not notice your reply to this bug.

There are two kinds of record locks in InnoDB. Newly inserted records are implicitly locked by the creating transaction. They are identified by the DB_TRX_ID field matching a transaction in trx_sys->trx_list, the list of active transactions. If another transaction waits for a record lock of an implictly locked transaction, it will create an explicit record lock struct for the creating transaction, and another explicit record lock for its wait request.

Explicit record locks in the InnoDB memory structures are identified by physical address (space_id,page_no,heap_no). These will move around when b-trees or b-tree pages are restructured.

I am not convinced that it would be OK to skip non-matching, non-committed records. We might already have something that applies in that case: semi-consistent reads (Bug #3300). We will have to think about this.
[30 Jan 2013 15:23] Justin Swanhart
In repeatable read this is necessary to protect the binary log.  If B is granted the lock on row seven, and both A and B commit, and A commits before B, then on the master the row will reflect the state of B and on the slave, the row will reflect the state B, because B's update will affect the row on the slave but was ignored on the master.
[30 Jan 2013 15:24] Justin Swanhart
I meant that on the master it will have the state of A and on the slave will have the state of B.
[16 Aug 2013 19:23] Kevin Lewis
Rejecting this bug because InnoDB is working as designed for the following reason;

This bug indicates that a record inserted and committed after a Repeatable Read transaction has started will never be seen by the open transaction. In other words, the bug report assumes that InnoDB Repeatable Read Isolation will not see PHANTOM records which are visible at one point and not at another.  That is not true for InnoDB which implements a sort of WRITE COMMITTED version of REPEATABLE READ isolation level.  

The ANSI standard defines the standard Isolation Levels in terms of phenomena: Dirty Reads, Non-Repeatable Reads, and Phantoms. The standard 4 isolation levels are often summarized with this chart;

Isolation Level   Dirty Read    Nonrepeatable Read  Phantom Read
================  ============  ==================  ==============
Read uncommitted  Possible      Possible            Possible
Read committed    Not possible  Possible            Possible
Repeatable read   Not possible  Not possible        Possible
Serializable      Not possible  Not possible        Not possible

InnoDB holds to this description for transactions that are just reading data. In fact, InnoDB is even more restrictive than the chart above allows for Repeatable Read transactions in that it will not show Phantom records while a transaction is just reading the database.  The transaction is truly a snapshot of the database at the time that the transaction is first registered with InnoDB.  A START TRANSACTION by itself does not register itself with InnoDB because MySQL does not know at that point which engine or tables will be used.  A START TRANSACTION WITH CONSISTENT SNAPSHOT will register the transaction immediately with InnoDB even though no tables have been touched so that the transaction start time can be set by the user. Any SELECT, INSERT, UPDATE or DELETE will also register the transaction with InnoDB.

But when InnoDB Repeatable Read transactions modify the database, it is possible to get phantom reads added into the static view of the database, just as the ANSI description allows.  Moreover, InnoDB relaxes the ANSI description for Repeatable Read isolation in that it will also allow non-repeatable reads during an UPDATE or DELETE.  Specifically, it will write to newly committed records within its read view.  And because of gap locking, it will actually wait on other transactions that have pending records that may become committed within its read view.  So not only is an UPDATE or DELETE affected by pending or newly committed records that satisfy the predicate, but also 'SELECT … LOCK IN SHARE MODE' and 'SELECT … FOR UPDATE'.

This WRITE COMMITTED implementation of REPEATABLE READ is not typical of any other database that I am aware of.  But it has some real advantages over a standard 'Snapshot' isolation.  When an update conflict would occur in other database engines that implement a snapshot isolation for Repeatable Read, an error message would typically say that you need to restart your transaction in order to see the current data. So the normal activity would be to restart the entire transaction and do the same changes over again.  But InnoDB allows you to just keep going with the current transaction by waiting on other records which might join your view of the data and including them on the fly when the UPDATE or DELETE is done.  This WRITE COMMITTED implementation combined with implicit record and gap locking actually adds a serializable component to Repeatable Read isolation.
[16 Aug 2013 19:24] Kevin Lewis
Consider two transactions that are shifting money into a checking account.  Each takes the money from separate accounts and adds it to a common account.  If they occur concurrently, most databases using Repeatable Read would require the second transaction that commits to abort and restart because the account balance being added to is different from what it was when the transaction started.  But with InnoDB, the newly committed record will be pulled into the read view when the account is added to.  This makes the two transactions apply serially without having to abort and restart a trnsaction.

### The following example illustrates a transaction that writes to non-repeatable/newly-changed records

mysql> create table checking (name char(20) key, balance int) engine InnoDB;
Query OK, 0 rows affected (0.03 sec)

mysql> insert into checking values ("Tom", 1000), ("Dick", 2000), ("John", 1500);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

Client #1                               Client #2
=====================================   =====================================
mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from checking;
+------+---------+
| name | balance |
+------+---------+
| Dick |    2000 |
| John |    1500 |
| Tom  |    1000 |
+------+---------+
3 rows in set (0.00 sec)

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> update checking
   set balance = balance - 250
   where name = "Dick";
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update checking
   set balance = balance + 250
   where name = "Tom";
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select * from checking;
+------+---------+
| name | balance |
+------+---------+
| Dick |    1750 |
| John |    1500 |
| Tom  |    1250 |
+------+---------+
3 rows in set (0.02 sec)
                                        mysql> begin;
                                        Query OK, 0 rows affected (0.00 sec)

                                        mysql> select * from checking;
                                        +------+---------+
                                        | name | balance |
                                        +------+---------+
                                        | Dick |    2000 |
                                        | John |    1500 |
                                        | Tom  |    1000 |
                                        +------+---------+
                                        3 rows in set (0.00 sec)

                                        mysql> update checking
                                           set balance = balance - 200
                                           where name = "John";
                                        Query OK, 1 row affected (0.00 sec)
                                        Rows matched: 1  Changed: 1  Warnings: 0

                                        mysql> update checking
                                           set balance = balance + 200
                                           where name = "Tom";

                                        ### Client 2 waits on the locked record
mysql> commit;
Query OK, 0 rows affected (0.00 sec)
                                        Query OK, 1 row affected (19.34 sec)
                                        Rows matched: 1  Changed: 1  Warnings: 0
mysql> select * from checking;
+------+---------+
| name | balance |
+------+---------+
| Dick |    1750 |
| John |    1500 |
| Tom  |    1250 |
+------+---------+
3 rows in set (0.00 sec)
                                        mysql> select * from checking;
                                        +------+---------+
                                        | name | balance |
                                        +------+---------+
                                        | Dick |    2000 |
                                        | John |    1300 |
                                        | Tom  |    1450 |
                                        +------+---------+
                                        3 rows in set (0.00 sec)

                                        mysql> commit;
                                        Query OK, 0 rows affected (0.00 sec)

mysql> select * from checking;
+------+---------+
| name | balance |
+------+---------+
| Dick |    1750 |
| John |    1300 |
| Tom  |    1450 |
+------+---------+
3 rows in set (0.02 sec)

### Now take a tax from each account - This illustrates writing to phantoms.

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from checking;
+------+---------+
| name | balance |
+------+---------+
| Dick |    1750 |
| John |    1300 |
| Tom  |    1450 |
+------+---------+
3 rows in set (0.00 sec)
                                        ### Add a new account;
                                        mysql> insert into checking values ("Harry", 1000);
                                        Query OK, 1 row affected (0.00 sec)

                                        mysql> select * from checking;
                                        +-------+---------+
                                        | name  | balance |
                                        +-------+---------+
                                        | Dick  |    1750 |
                                        | Harry |    1000 |
                                        | John  |    1300 |
                                        | Tom   |    1450 |
                                        +-------+---------+
                                        4 rows in set (0.00 sec)

mysql> select * from checking for update;
+-------+---------+
| name  | balance |
+-------+---------+
| Dick  |    1750 |
| Harry |    1000 |
| John  |    1300 |
| Tom   |    1450 |
+-------+---------+
4 rows in set (0.00 sec)

mysql> select * from checking;
+------+---------+
| name | balance |
+------+---------+
| Dick |    1750 |
| John |    1300 |
| Tom  |    1450 |
+------+---------+
3 rows in set (0.00 sec)

mysql> update checking set balance = balance - 111;
Query OK, 4 rows affected (0.00 sec)
Rows matched: 4  Changed: 4  Warnings: 0

mysql> select * from checking;
+-------+---------+
| name  | balance |
+-------+---------+
| Dick  |    1639 |
| Harry |     889 |
| John  |    1189 |
| Tom   |    1339 |
+-------+---------+
4 rows in set (0.01 sec)

mysql> commit;
Query OK, 0 rows affected (0.00 sec)
                                        
The example above shows an anomaly in that the 'select * from checking' continues to show the snapshot view even when 'select * from checking for update" shows the phantom record that is also locked and ready to be updated.  After the update, the phantom has become part of the static view.
[16 Aug 2013 20:55] MySQL Verification Team
Kevin, please anaylze http://bugs.mysql.com/bug.php?id=69979 (starting at last testcase).
[19 Aug 2013 17:37] Kevin Lewis
This WRITE_COMMITTED feature to InnoDB's REPEATABLE READ does not have anything to do with bug#69979.  That is a timing related wrong behavior of REPLACE when there are multiple unique fields.
[4 Mar 2021 14:27] Fungo Wang
Just came across an old bug report #102752, look like this isolation issue has been reported and discussed more than 10 years ago :)

I guess Kevin's (InnoDB developer) reply https://bugs.mysql.com/bug.php?id=57973#c403965 can also be applied to this bug report.

This could be thought as an official explanation for the InnoDB RR implementation regarding updates behavior.

BTW, the "WRITE COMMITTED version of REPEATABLE READ isolation level" statement is interesting :)
[4 Mar 2021 14:30] Fungo Wang
Sorry, pls ignore my last reply, post to wrong bug thread, should be bug #102752...