Bug #66819 | InnoDB change buffering is not entirely crash-safe | ||
---|---|---|---|
Submitted: | 14 Sep 2012 13:37 | Modified: | 16 Oct 2012 23:33 |
Reporter: | Alexey Kopytov | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S3 (Non-critical) |
Version: | 5.1, 5.5, 5.6 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[14 Sep 2012 13:37]
Alexey Kopytov
[15 Sep 2012 10:59]
Alexey Kopytov
In fact for an IBUF_OP_DELETE operation ibuf_merge_or_delete_for_page() commits mtr even before calling ibuf_delete_rec(): /* Because ibuf_delete() will latch an insert buffer bitmap page, commit mtr before latching any further pages. Store and restore the cursor position. */ ut_ad(rec == btr_pcur_get_rec(&pcur)); ut_ad(page_rec_is_user_rec(rec)); ut_ad(ibuf_rec_get_page_no(&mtr, rec) == page_no); ut_ad(ibuf_rec_get_space(&mtr, rec) == space); btr_pcur_store_position(&pcur, &mtr); ibuf_btr_pcur_commit_specify_mtr(&pcur, &mtr); So in this case ibuf_delete_rec() does not even have to resort to pessimistic removal to trigger the problem.
[17 Sep 2012 16:28]
Marko Mäkelä
Thank you for the observation! It seems to me that MySQL versions before 5.5 should be safe, because they only buffer inserts. Buffered inserts can be applied in an arbitrary order for a given page, and the merge operation is idempotent, because we do not check for uniqueness violations. If the record is already in the index page, we simply discard the buffered record. Delete-buffering, which was introduced in 5.5, changed this. We must apply the operations in the original order, and we must ensure that the page does not become empty as a result of purging records from the page. This emptiness check can obviously be broken if the server is killed after applying some buffered operations. There could be an easy fix that involves flagging the processed change buffer records in the same mini-transaction that changed the secondary index.
[17 Sep 2012 16:59]
Alexey Kopytov
Marko, >It seems to me that MySQL versions before 5.5 should be safe, because they only buffer inserts. Buffered inserts can be applied in an arbitrary order for a given page, and the merge operation is idempotent, because we do not check for uniqueness violations. If the record is already in the index page, we simply discard the buffered record. I'm not sure, there's the same debug assertion in ibuf_insert_to_index_page() in 5.0 and 5.1: ut_ad(rec_get_deleted_flag(rec, page_is_comp(page))); I.e. if the record to be merged is already on the page, we assert it must have been deleted previously. And it's the same assertion which fails in debug 5.5 and 5.6 builds in our tests. Anyway, even if versions before 5.5 discard duplicate buffered inserts in release builds, the main problem here is corruption rather than a failing assertion triggered by that corruption. And it looks like versions before 5.5 are vulnerable to this corruption as well. Most of the similar bug reports mention a recent upgrade from 5.1 to 5.5. Indeed, the corruption might be hidden in 5.1 due to duplicate buffered records being discarded, and revealed after an upgrade to 5.5.
[18 Sep 2012 10:14]
Marko Mäkelä
I closed Bug#65443 as a duplicate of this one. This bug should also cover most (if not all) cases of Bug#58751. This may be the root cause of Bug#61104. Thanks Alexey, I will try to see if I can repeat the failure without delete-buffering. If I can, that should explain the 5.1 failure that was reported in Bug#58751.
[18 Sep 2012 10:31]
Marko Mäkelä
Alexey, you are right. When fixing Bug#56680, I added the delete-mark debug assertion for the insert merge. I was able to trigger that assertion failure with a modified test, on an instrumented server.
[8 Oct 2012 19:09]
Marko Mäkelä
Sorry, I meant that Bug#58571 is a duplicate of this one, not Bug#58751.
[16 Oct 2012 23:33]
John Russell
Added to changelog for 5.1.67, 5.5.29, 5.6.8, 5.7.0: If the server crashed at the moment specific point when a change buffer entry was being merged into a buffer pool page, the transaction log and the change buffer were left in an inconsistent state. After a restart, MySQL could crash after reading the corresponding secondary index page. The problem was more likely to occur in MySQL 5.5 or later, where the original insert buffering mechanism was generalized to cover other operations.
[17 Jan 2013 7:49]
Valeriy Kravchuk
Based on code review I wonder how this fix takes into account the problem explained in the following comment: "[15 Sep 2012 10:59] Alexey Kopytov" From what I see, the code is not changed in this part.
[17 Jan 2013 13:01]
Marko Mäkelä
Valeriy, you are right. Sorry, my bad. When we are merging a buffered purge operation (IBUF_OP_DELETE), we must commit the mini-transaction before we can actually delete the just-applied IBUF_OP_DELETE record. Before committing the mini-transaction, we should flag the IBUF_OP_DELETE record processed. This remaining bug affects purge buffering only. The original fix fixed problems in insert buffering and delete-mark buffering.
[20 Apr 2013 13:59]
Laurynas Biveinis
5.5$ bzr log -r 4177 ------------------------------------------------------------ revno: 4177 committer: Satya Bodapati <satya.bodapati@oracle.com> branch nick: mysql-5.5 timestamp: Mon 2013-01-28 20:13:44 +0530 message: Bug#16183892 - INNODB PURGE BUFFERING IS NOT CRASH-SAFE With innodb_change_buffering enabled, Innodb buffers all modifications to secondary index leaf pages when the leaf pages are not in buffer pool. Crash InnoDB while an IBUF_OP_DELETE is being applied. Restart and note that the same record can be applied again which may lead to crash. Mark the change buffer record processed, so that it will not be merged again in case the server crashes between the following mtr_commit() and the subsequent mtr_commit() of deleting the change buffer record. Testcase: No testcase because it is difficult to get the timing right with the two asyncronous task purge and change buffering Approved by Marko. rb#1893
[24 Apr 2013 17:15]
Frank Zhou
Our DBs crash very often recently Error like The error log shows similar error: 130415 13:05:30 InnoDB: Assertion failure in thread 139661618816768 in file ibuf0ibuf.c line 4124 InnoDB: Failing assertion: page_get_n_recs(page) > 1 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. 20:05:30 UTC - mysqld got signal 6 ; We are using 5.5.20, change_buffering = all. Is there any patch we can use to prevent the crash? Thanks!