Bug #31310 | Locked rows silently skipped in read-committed isolation level | ||
---|---|---|---|
Submitted: | 1 Oct 2007 9:46 | Modified: | 31 Oct 2007 0:02 |
Reporter: | Domas Mituzas | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S1 (Critical) |
Version: | 5.1-bk,5.1.22,5.1.20 | OS: | Any |
Assigned to: | Gleb Shchepa | CPU Architecture: | Any |
Tags: | bfsm_2007_10_18, corruption, regression |
[1 Oct 2007 9:46]
Domas Mituzas
[1 Oct 2007 11:15]
Heikki Tuuri
This is a critical bug. In 5.1, READ COMMITTED removes most of the gap locking and locking of nonmatching rows if there is a WHERE condition. But it should not skip locked rows!
[1 Oct 2007 11:46]
Daniel Fischer
Also verified with 5.1.22.
[1 Oct 2007 12:47]
Heikki Tuuri
I can repeat on the latest 5.1.23/InnoDB-zip.
[1 Oct 2007 13:05]
Domas Mituzas
As asked by Jeffrey, repeated on 5.1.20
[1 Oct 2007 14:30]
Heikki Tuuri
The idea in a 'semi-consistent read' is that if a row is locked, then InnoDB returns the latest committed version, so that MySQL can decide if the row matches the WHERE condition in the UPDATE. sql_update.cc: " while (!(error=info.read_record(&info)) && !thd->killed) { if (!(select && select->skip_record())) { if (table->file->was_semi_consistent_read()) continue; /* repeat the read of the same row if it still exists */ " If the row matches the WHERE condition, MySQL reads the same row again, and InnoDB this time locks it, or waits for a lock on the row. The bug here is that MySQL's read_multi_range_next() keeps its own information about the cursor position, and refuses to read the row again because it thinks that the cursor is already at the end of the potential result set. Breakpoint 3, QUICK_RANGE_SELECT::get_next (this=0x198c7a0) at opt_range.cc:8262 8262 if (in_ror_merged_scan) Current language: auto; currently c++ (gdb) next 8273 if (in_range) (gdb) 8276 result= file->read_multi_range_next(&mrange); (gdb) step handler::read_multi_range_next (this=0x19b2d28, found_range_p=0x440c8280) at handler.cc:3130 3130 DBUG_ENTER("handler::read_multi_range_next"); (gdb) next 3133 DBUG_ASSERT(multi_range_curr < multi_range_end); (gdb) 3138 if (multi_range_curr->range_flag != (UNIQUE_RANGE | EQ_RANGE)) (gdb) 3152 result= HA_ERR_END_OF_FILE; (gdb) next 3156 for (multi_range_curr++; (gdb) 3135 do (gdb) 3173 *found_range_p= multi_range_curr; (gdb) 3174 DBUG_PRINT("exit",("handler::read_multi_range_next: result %d", result)); (gdb) 3175 DBUG_RETURN(result); (gdb) 3176 } (gdb) QUICK_RANGE_SELECT::get_next (this=0x198c7a0) at opt_range.cc:8277 8277 if (result != HA_ERR_END_OF_FILE) (gdb) 8281 uint count= min(multi_range_length, ranges.elements - (gdb) 8283 if (count == 0) (gdb) 8286 in_range= FALSE; (gdb) 8287 if (in_ror_merged_scan) (gdb) 8289 DBUG_RETURN(HA_ERR_END_OF_FILE); (gdb) How to fix this: MySQL code should be aware that it may need read the same row twice, even in a unique equality search on an index. A quick fix is to disable InnoDB's semi-consistent read optimization in the unique equality search. I do not know yet if the same bug exists for a non-unique range search.
[1 Oct 2007 14:43]
Heikki Tuuri
Classifying this as a MySQL Server bug. We want to preserve the 'semi-consistent' optimization of UPDATE ... WHERE ... MySQL code must be aware it may need to re-read the same row twice, also in a unique equality search on an index. opt_range.cc and other relevant files should be checked for this assumption. Or, could we fix this inside InnoDB? That is very hard because it may happen that the last committed row version that we returned to MySQL is already obsolete. We really need to re-read the row to know if it matches the WHERE condition.
[1 Oct 2007 14:54]
Heikki Tuuri
This code in handler.cc assumes that the same row cannot be read twice: int handler::read_multi_range_next(KEY_MULTI_RANGE **found_range_p) { int result; DBUG_ENTER("handler::read_multi_range_next"); /* We should not be called after the last call returned EOF. */ DBUG_ASSERT(multi_range_curr < multi_range_end); do { /* Save a call if there can be only one row in range. */ if (multi_range_curr->range_flag != (UNIQUE_RANGE | EQ_RANGE)) { result= read_range_next(); /* On success or non-EOF errors jump to the end. */ if (result != HA_ERR_END_OF_FILE) break; } else { /* We need to set this for the last range only, but checking this condition is more expensive than just setting the result code. */ result= HA_ERR_END_OF_FILE; }
[1 Oct 2007 20:55]
Sergey Petrunya
> If the row matches the WHERE condition, MySQL reads the same row again, > and InnoDB this time locks it, or waits for a lock on the row. A question from someone seeing all this for the first time: why is the API designed so we need to "read the same row again" and not have a separate call handler->lock_row() that would mean "the last returned row has passed the WHERE clause, please lock it"? The "read the same row again" solution seems like making one API call serve two different purposes?
[1 Oct 2007 22:22]
Sergey Petrunya
Whoever merges this into 5.2 will need to check how this can be made to work with DS-MRR, or verify that DS-MRR is disabled for such UPDATE queries
[3 Oct 2007 15:52]
Heikki Tuuri
Sergey, " A question from someone seeing all this for the first time: why is the API designed so we need to "read the same row again" and not have a separate call handler->lock_row() that would mean "the last returned row has passed the WHERE clause, please lock it"? The "read the same row again" solution seems like making one API call serve two different purposes? " the returned row is the last committed version. It may have changed by the time we decide to lock the row. Therefore, it is not enough just to lock the cursor row. The row also has to be refetched so that the MySQL interpreter can test the WHERE conditions. Regards, Heikki
[5 Oct 2007 18:26]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/35009 ChangeSet@1.2580, 2007-10-05 23:27:12+05:00, gshchepa@gleb.loc +3 -0 Fixed bug #31310. Locked rows of the InnoDB storage was silently skipped in the read-committed isolation level. QUICK_RANGE_SELECT lacks second (blocking) read of record that was read semi-consistently and just skip it. The handler::read_multi_range_next method has been modified to retry previous range if the previous read was semi-consistent.
[8 Oct 2007 12:07]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/35095 ChangeSet@1.2580, 2007-10-08 17:09:38+05:00, gshchepa@gleb.loc +3 -0 Fixed bug #31310. Locked rows of the InnoDB storage was silently skipped in the read-committed isolation level. QUICK_RANGE_SELECT lacks second (blocking) read of record that was read semi-consistently and just skip it. The handler::read_multi_range_next method has been modified to retry reading of the previous record after the semi-consistent read.
[8 Oct 2007 20:04]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/35139 ChangeSet@1.2587, 2007-10-09 01:07:15+05:00, gshchepa@gleb.loc +3 -0 Fixed bug #31310. Locked rows of the InnoDB storage was silently skipped in the read-committed isolation level. QUICK_RANGE_SELECT for unique ranges lacks second (blocking) read of the record that was read semi-consistently and just skip it. The handler::read_multi_range_next method has been modified to retry previous unique range if the previous read was semi-consistent.
[8 Oct 2007 20:24]
Gleb Shchepa
We use ha_innobase::index_read() with previously used key to re-read the record after the semi-consistent read of this record.
[29 Oct 2007 8:46]
Bugs System
Pushed into 5.1.23-beta
[29 Oct 2007 8:50]
Bugs System
Pushed into 6.0.4-alpha
[31 Oct 2007 0:02]
Paul DuBois
Noted in 5.1.23, 6.0.4 changelogs. For InnoDB tables with READ COMMITTED isolation level, UPDATE statements skipped rows locked by another transaction, rather than waiting for the locks to be released.
[17 Dec 2007 16:43]
Susanne Ebrecht
Bug #33282 is set as duplicate of these bug here.