Bug #70087 | InnoDB can not use the doublewrite buffer properly | ||
---|---|---|---|
Submitted: | 20 Aug 2013 2:11 | Modified: | 19 Dec 2013 17:50 |
Reporter: | Nizameddin Ordulu | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
Version: | 5.6 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | doublewrite, recovery |
[20 Aug 2013 2:11]
Nizameddin Ordulu
[20 Aug 2013 17:58]
MySQL Verification Team
Careful analysis of the code clearly shows that if the necessary conditions are met such that fil_validate_single_table_tablespace() is called during the execution of fil_load_single_table_tablespace(), then the contents of the doublewrite buffer will not be used to recover a table. This would have been a feature request, were it not for the fact that it affects the recovery of data after a crash. As durability is a very important attribute of the InnoDB storage engine, this is therefore a bug. I also think that the title of the bug should end with " for the recovery".
[20 Aug 2013 19:23]
Marko Mäkelä
This sounds plausible. I think that this should be repeatable with a workload like this:

    CREATE TABLE t(b BLOB)ENGINE=InnoDB;
    BEGIN; INSERT INTO t VALUES(REPEAT('blob ',12345)); ROLLBACK;
    BEGIN; INSERT INTO t VALUES(REPEAT('blob ',12345)); ROLLBACK;
    ...

and killing the server during the workload in such a way that a torn write to page 0 of the *.ibd file happens. (Alternatively, you can artificially corrupt the first page and see if it can be recovered from the doublewrite buffer.)

The first page of each InnoDB tablespace holds the allocation bitmap for pages 0 through page_size-1. The bitmap would be updated for allocating and freeing the BLOB pages in the above SQL.

Earlier this year, I noticed that some access to the first page of the *.ibd file was ignoring the page checksum. On checksum mismatch, we should really always try to recover the page from the doublewrite buffer if it is available.
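The "artificially corrupt the first page" alternative could be sketched roughly as follows. This is a minimal illustration, not a tool from this report: the function name `corrupt_first_page`, the 16 KiB default page size, and the byte range that gets overwritten are all assumptions, and it is meant to be run only against a copy of an *.ibd file while the server is stopped.

```python
import os


def corrupt_first_page(path, page_size=16384, offset=100, length=64):
    """Overwrite `length` bytes inside page 0 of the file with random bytes.

    Hypothetical helper: page_size, offset and length are illustrative
    defaults, not values taken from the bug report. Returns the bytes that
    were overwritten so the caller can restore them afterwards.
    """
    if offset + length > page_size:
        raise ValueError("corruption range must stay inside page 0")
    if os.path.getsize(path) < page_size:
        raise ValueError("file is shorter than one page")
    with open(path, "r+b") as f:
        f.seek(offset)
        original = f.read(length)
        garbage = os.urandom(length)
        # Make sure the page content really changes.
        while garbage == original:
            garbage = os.urandom(length)
        f.seek(offset)
        f.write(garbage)
    return original
```

After corrupting the copy, one would put it in place of the original file and start the server to observe whether recovery restores page 0 from the doublewrite buffer. That only works if a copy of page 0 is actually present in the doublewrite buffer at that point, which this sketch does not arrange.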
[21 Aug 2013 12:08]
MySQL Verification Team
Marko, thank you for confirming my verification and for the additional comments, which will considerably help to improve the recovery procedures.
[26 Aug 2013 7:13]
Guangpu Feng
Is this a duplicate of Bug#69623?
[26 Aug 2013 7:23]
Guangpu Feng
I want to know in which version this bug was introduced. I have checked the code of 5.5.18, which is the version we use in production, and found no function named *fil_validate_single_table_tablespace*. Does it mean that 5.5.18 is safe from this bug?
[26 Aug 2013 13:27]
MySQL Verification Team
Hi, No, this is not a duplicate of bug #69623. Also, 5.5 has its own recovery process, which also did not consult the doublewrite buffer. The code is organized differently, but it also failed to check what is available there.
[27 Aug 2013 9:25]
Guangpu Feng
Marko, I tried your method against Percona Server 5.5.18, but I can't repeat it. The script follows (the sleep time is shorter than the execution time of the *for* loop, to make sure the kill is performed during the queries):
---------------------------------------------------------------
$vim bug70087.sh
#!/bin/bash
mysql="mysql -uroot -S /tmp/mysql.sock"
`$mysql -e 'DROP TABLE IF EXISTS test.t; CREATE TABLE test.t(b BLOB)ENGINE=INNODB;'`
(
sleep 5
kill -9 `pidof mysqld`
) &
for i in i{1..2000}
do
`$mysql -e "BEGIN; INSERT INTO test.t VALUES(REPEAT('blob ',12345)); ROLLBACK;"`
done
---------------------------------------------------------------
Can anybody provide a test case that can definitely repeat this bug?
[19 Dec 2013 17:50]
Daniel Price
Fixed as of 5.6.16, 5.7.4: "If the first page (page 0) of file-per-table tablespace data file was corrupt, recovery would be halted even though the doublewrite buffer contained a clean copy of the page." Thank you for the bug report.
[3 Feb 2014 11:50]
Laurynas Biveinis
5.6$ bzr log -r 5703
------------------------------------------------------------
revno: 5703
committer: Annamalai Gurusami <annamalai.gurusami@oracle.com>
branch nick: mysql-5.6
timestamp: Thu 2013-12-19 13:20:50 +0530
message:
  Bug #17335427 INNODB CAN NOT USE THE DOUBLEWRITE BUFFER PROPERLY

  Problem:

  If the first page (page 0) of the single table tablespace is corrupted in the data file then our recovery doesn't progress even if there is a clean copy of the same available in the double write buffer.

  Analysis:

  During recovery, our first step is to process the double write buffer. We look at the pages in the double write buffer and determine its (space_id, page_no) details. Each of the page in the double write buffer corresponds to a page in the .ibd data file. Using the space_id information we need to map the page in the double write buffer to the corresponding ibd file. This is done by reading the space_id information from the first page of the single table tablespace. If the first page of the single table tablespace is corrupted, then we are unable to determine the data file to which a particular page in the double write buffer belongs to. So we need to explore and see if we can determine the space_id in other means.

  Solution:

  Assume a particular page size. Read N number of pages from the ibd file. Ignore the corrupted pages and determine the (space_id, page_size and zip_size) information. Repeat this for all supported page sizes. Using this approach determine the correct (space_id, page_size and zip_size) of the ibd file.

  rb#4025 approved by Yasufumi.
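The idea in the "Solution" paragraph above can be illustrated with a rough sketch. This is not the committed InnoDB code. The function name `guess_space_id`, the list of candidate page sizes, and the limit of 64 pages are assumptions made for illustration. The sketch also replaces the real checksum-based corruption test with a much weaker stand-in: a page is treated as usable only if the page number stored in its header matches its position in the file. The header offsets used (page number at byte 4, space id at byte 34) follow the standard InnoDB page header layout.

```python
import struct
from collections import Counter

FIL_PAGE_OFFSET = 4       # 4-byte big-endian page number in the page header
FIL_PAGE_SPACE_ID = 34    # 4-byte big-endian space id in the page header
# Assumption: candidate page sizes to try; the real fix iterates over the
# page and compressed-page sizes supported by the server.
CANDIDATE_PAGE_SIZES = (1024, 2048, 4096, 8192, 16384)


def guess_space_id(data, max_pages=64):
    """Return (space_id, page_size) backed by the most pages, or None.

    Simplification: a page counts as "not corrupted" if its stored page
    number equals its index at the assumed page size. The server would
    verify the page checksum instead.
    """
    best = None  # (votes, space_id, page_size)
    for page_size in CANDIDATE_PAGE_SIZES:
        votes = Counter()
        n_pages = min(max_pages, len(data) // page_size)
        for page_no in range(n_pages):
            start = page_no * page_size
            page = data[start:start + page_size]
            (stored_no,) = struct.unpack_from(">I", page, FIL_PAGE_OFFSET)
            if stored_no != page_no:
                continue  # ignore pages that look corrupted at this size
            (space_id,) = struct.unpack_from(">I", page, FIL_PAGE_SPACE_ID)
            votes[space_id] += 1
        if votes:
            space_id, count = votes.most_common(1)[0]
            if best is None or count > best[0]:
                best = (count, space_id, page_size)
    return None if best is None else (best[1], best[2])
```

With this approach a damaged page 0 no longer prevents identification, because the remaining pages still agree on the space id. On ties the sketch keeps the smallest page size, so very small files may be misidentified; the checksum-based test in the real fix is much stricter.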
[3 Feb 2014 11:53]
Laurynas Biveinis
5.6$ bzr log -r 5707
------------------------------------------------------------
revno: 5707
committer: Annamalai Gurusami <annamalai.gurusami@oracle.com>
branch nick: mysql-5.6
timestamp: Fri 2013-12-20 12:05:46 +0530
message:
  BUG 17335427 - INNODB CAN NOT USE THE DOUBLEWRITE BUFFER PROPERLY

  Problem:

  Fixing a memory issue in my original fix. This was identified from PB2 failures. If the page is uncompressed, then its size must be equal to UNIV_PAGE_SIZE. The buf_page_is_corrupted() assumes the size of the uncompressed pages as equal to UNIV_PAGE_SIZE.

  Solution:

  Call buf_page_is_corrupted() for uncompressed pages only if page size is equal to UNIV_PAGE_SIZE.

  approved by Yasufumi over IM.
[28 Mar 2014 19:23]
Laurynas Biveinis
5.6$ bzr log -r 5776 -n0
------------------------------------------------------------
revno: 5776
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: mysql-5.6
timestamp: Tue 2014-01-28 12:02:37 +0200
message:
  Bug#17335427 INNODB CAN NOT USE THE DOUBLEWRITE BUFFER PROPERLY

  Clean up the test a little.
[22 Aug 2014 15:40]
Jeremy Cole
This fix appears to have introduced a regression for legitimate zero-checksum pages which are now seen as corrupt. I filed a bug for the regression here: http://bugs.mysql.com/bug.php?id=73689