Bug #57232 | Possible corruption during UNDO phase at startup after crash recovery | ||
---|---|---|---|
Submitted: | 4 Oct 2010 22:44 | Modified: | 30 Nov 2010 23:38 |
Reporter: | Sunny Bains | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
Version: | All | OS: | Any |
Assigned to: | Sunny Bains | CPU Architecture: | Any |
[4 Oct 2010 22:44]
Sunny Bains
[4 Oct 2010 22:49]
Sunny Bains
This could have an impact on how we fix Bug #56655. This bug will only affect users if they connect to the server and start DML/DDL before the UNDO phase of recovered transactions has completed.
[5 Oct 2010 5:38]
Marko Mäkelä
A simpler fix: While trx_rollback_or_clean_recovered() is running, defer all DROP TABLE and TRUNCATE TABLE to the background queue. Disadvantage: If the server was taken offline because of running out of disk space during a bulk operation, the DBA would be unable to reclaim disk space until after the bulk operation has been rolled back. The record-lock reference counting idea looks good, provided that it will not cause contention. One idea is to protect the reference count by atomic memory access primitives instead of a mutex or rw-lock.
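Illustration (not part of the original comment or the committed patch): the atomic-primitive idea above could be sketched roughly as follows. All names here are hypothetical, and C11 <stdatomic.h> is assumed only for the sake of the example; the actual InnoDB code uses its own primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, stripped-down table descriptor (not InnoDB's dict_table_t). */
typedef struct {
    atomic_ulong n_rec_locks; /* number of record locks held on the table */
} table_atomic_sketch_t;

/* Count one more record lock on the table, without taking a mutex. */
static void rec_lock_ref(table_atomic_sketch_t *table)
{
    atomic_fetch_add(&table->n_rec_locks, 1UL);
}

/* Count one record lock fewer; the counter must not underflow. */
static void rec_lock_unref(table_atomic_sketch_t *table)
{
    unsigned long prev = atomic_fetch_sub(&table->n_rec_locks, 1UL);
    assert(prev > 0);
    (void) prev; /* silence unused-variable warnings under NDEBUG */
}

/* True when no record locks remain, i.e. a drop could proceed. */
static bool rec_locks_drained(table_atomic_sketch_t *table)
{
    return atomic_load(&table->n_rec_locks) == 0;
}
```

The counter itself stays consistent without a mutex, but a check such as rec_locks_drained() is only a snapshot; a caller would still need some way to stop new locks from being added after the check.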
[5 Oct 2010 5:44]
Sunny Bains
I don't think TRUNCATE is affected by this because we reuse the dict_table_t* instance but drop and create a new tablespace. How can the rec lock count cause contention? It will simply be protected by the lock/kernel mutex. I see no reason to use atomics. We increment when we add a record lock to the hash table and decrement when we remove from the hash table. We already have the lock/kernel mutex for both operations.
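Illustration (not part of the original comment or the committed patch): a minimal sketch of the counting scheme described above, where the counter is changed only while a mutex that is already held for the hash table update is owned. The pthread mutex and all names are hypothetical stand-ins for the lock/kernel mutex and the real InnoDB structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lock/kernel mutex. */
static pthread_mutex_t lock_sys_mutex_sketch = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical, stripped-down table descriptor (not InnoDB's dict_table_t). */
typedef struct {
    unsigned long n_rec_locks; /* record locks currently in the hash table */
} table_sketch_t;

/* Called where a record lock is added to the lock hash table. */
static void rec_lock_added(table_sketch_t *table)
{
    pthread_mutex_lock(&lock_sys_mutex_sketch);
    /* ... the real code would insert the lock into the hash table here ... */
    table->n_rec_locks++;
    pthread_mutex_unlock(&lock_sys_mutex_sketch);
}

/* Called where a record lock is removed from the lock hash table. */
static void rec_lock_removed(table_sketch_t *table)
{
    pthread_mutex_lock(&lock_sys_mutex_sketch);
    assert(table->n_rec_locks > 0);
    /* ... the real code would remove the lock from the hash table here ... */
    table->n_rec_locks--;
    pthread_mutex_unlock(&lock_sys_mutex_sketch);
}

/* True when no record locks remain on the table. */
static bool table_has_no_rec_locks(table_sketch_t *table)
{
    pthread_mutex_lock(&lock_sys_mutex_sketch);
    bool none = (table->n_rec_locks == 0);
    pthread_mutex_unlock(&lock_sys_mutex_sketch);
    return none;
}
```

Because the mutex is taken for the hash table update anyway, the increment and decrement add no extra synchronization in this sketch.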
[20 Oct 2010 10:35]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/121290
[20 Oct 2010 10:35]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/121291
[20 Oct 2010 12:22]
Marko Mäkelä
Sorry, my two commits are not fixing this bug, but fixing bugs in the patch that Sunny committed earlier. Sunny's patch was primarily addressing Bug #20877.
[13 Nov 2010 16:22]
Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113152450-2zzcm50e7i4j35v7) (merge vers: 5.6.1-m4) (pib:21)
[13 Nov 2010 16:30]
Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:alexander.nozdrin@oracle.com-20101113152540-gxro4g0v29l27f5x) (pib:21)
[16 Dec 2010 22:26]
Bugs System
Pushed into mysql-5.5 5.5.9 (revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (version source revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (merge vers: 5.5.9) (pib:24)
[23 Dec 2010 11:41]
James Day
What is the text of the assertion error message? We need something that we can see and use to recognise this bug as the cause that was fixed.
[10 Jan 2011 23:27]
Sunny Bains
James, it is this code in 5.6, in row_drop_table_for_mysql():

    if (table->n_ref_count == 0) {
        lock_remove_all_on_table(table, TRUE);
        ut_a(table->n_rec_locks == 0);