Bug #67718 | InnoDB drastically under-fills pages in certain conditions | ||
---|---|---|---|
Submitted: | 27 Nov 2012 0:54 | Modified: | 14 Jul 2014 12:23 |
Reporter: | Jeremy Cole (Basic Quality Contributor) (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S3 (Non-critical) |
Version: | All | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | innodb |
[27 Nov 2012 0:54]
Jeremy Cole
[27 Nov 2012 5:20]
Davi Arnaut
A possible way to fix this is to restrict this type of split to the rightmost page, which still preserves the optimization for ascending key inserts. This seems to be the approach taken in other databases.
[27 Nov 2012 16:36]
MySQL Verification Team
Jeremy, Is it same for all ROW_FORMATs ??? Just let me know whether you have tested it or have time to test it. I guess it is, but I am not sure for the COMPRESSED format ....
[27 Nov 2012 17:18]
Jeremy Cole
Sinisa: It should apply to all table formats, since the logic for this decision is completely independent of the table format, but I have not tested it because the exact number of rows required to exactly fill a page without splitting it will differ (so the test case will differ a bit).
[27 Nov 2012 17:20]
Jeremy Cole
Additionally, I had a working theory that this problem also applies to non-leaf pages, and although I have not produced a test case to duplicate it, I found instances of its "pattern" in our production tables. It would require quite a lot of inserts to produce a test case for that, since the B-tree needs to grow to a minimum of 3 levels deep.
[30 Nov 2012 12:16]
Valeriy Kravchuk
Sinisa, I wonder if you had used Oracle's 5.5.20 to verify and, if you had, how exactly you verified the bug? The test case used here refers to I_S.INNODB_BUFFER_PAGE to prove the problem, while the table does not exist in Oracle's 5.5.27 (and even less in 5.5.20): mysql> show variables like 'ver%'; +-------------------------+------------------------------+ | Variable_name | Value | +-------------------------+------------------------------+ | version | 5.5.27-log | | version_comment | MySQL Community Server (GPL) | | version_compile_machine | x86 | | version_compile_os | Win32 | +-------------------------+------------------------------+ 4 rows in set (0.00 sec) mysql> use information_schema Database changed mysql> show tables like 'innodb_b%'; Empty set (0.34 sec) mysql> show tables like 'innodb%'; +----------------------------------------+ | Tables_in_information_schema (innodb%) | +----------------------------------------+ | INNODB_CMP_RESET | | INNODB_TRX | | INNODB_CMPMEM_RESET | | INNODB_LOCK_WAITS | | INNODB_CMPMEM | | INNODB_CMP | | INNODB_LOCKS | +----------------------------------------+ 7 rows in set (0.05 sec) Surely, MySQL manual says the table is there in 5.5, see http://dev.mysql.com/doc/refman/5.5/en/innodb-i_s-tables.html. Do I miss anything, like a good way to study buffer pool content (or data pages by type on disk) in Oracle's MySQL 5.5.x?
[30 Nov 2012 14:33]
MySQL Verification Team
Valeriy, You are correct. I made mistake. I verified two bugs , one with 5.5.20 and this one with 5.6.7 and made a typo. So, verified with 5.6.7-rc. Not with 5.5.20.
[30 Nov 2012 18:44]
Jeremy Cole
Valeriy: The innodb_buffer_page table was introduced in MySQL 5.5.28.
[30 Nov 2012 18:44]
MySQL Verification Team
There are no plans yet on when will this behavior be changed. It is definitely not on the current roadmap, but it will be re-considered in the future.
[1 Dec 2012 23:51]
Jeremy Cole
Valeriy: If you have a working Ruby installation you should be able to install my innodb_ruby gem (gem install innodb_ruby), which will allow you to print the tablespace contents for any "compact" table (the old "redundant" format is not yet supported). $ sudo innodb_space -f /usr/local/mysql/data/test/page_split_test.ibd space-index-pages-summary page index level data free records 3 157 1 85 16167 5 4 157 0 7378 8872 7 5 157 0 14756 1492 14 6 157 0 1054 15198 1 7 157 0 1054 15198 1 8 157 0 1054 15198 1 9 0 0 0 16384 0 See: http://rubygems.org/gems/innodb_ruby See: https://github.com/jeremycole/innodb_ruby
[22 Jan 2013 19:27]
Davi Arnaut
Link to fix: https://github.com/twitter/mysql/commit/48996cad34e0f2e11068f92f86e1eb7eda1d87fe Apparently, "contributions can be accepted to Open bugs only".
[23 Jan 2013 13:41]
MySQL Verification Team
Davi, Thank you for your contribution. It will be very seriously analyzed and considered.
[24 Jan 2013 15:09]
Calvin Sun
The cause of the problem is from this change: http://lists.mysql.com/internals/12980?f=plain.
[30 Jan 2013 1:52]
Calvin Sun
Heikki proposed the following changes to address the issue: A better heuristic: if the number of records on the page is "big" (say, there are > 10 records), then we can require PAGE_N_DIRECTION > 5 to believe this is a sequential insert pattern. Something like this for btr_page_get_split_rec_to_left(): if ((page_header_get_ptr(page, PAGE_LAST_INSERT) == page_rec_get_next(insert_point)) && ((page_get_n_recs(page) < 10) || ((page_header_get_field(page, PAGE_DIRECTION) == PAGE_LEFT) && (page_header_get_field(page, PAGE_N_DIRECTION) >= 5) ) ) ) { /* Ok, it is fairly sure this really is consecutive inserts */ This change would make the split algorithm more conservative. Can this cause bad space utilization under some other workloads? Unfortunately, yes.
[21 Apr 2013 3:57]
Davi Arnaut
Here is a little bit simpler variation that exploits the way the root page is split: CREATE TABLE t1 (a BIGINT PRIMARY KEY, b VARCHAR(4096)) ENGINE=InnoDB; # Build b-tree so that the leftmost leaf page has records 0, 1 and 2. INSERT INTO t1 VALUES (0, REPEAT('a', 4096)); INSERT INTO t1 VALUES (1000, REPEAT('a', 4096)); INSERT INTO t1 VALUES (1001, REPEAT('a', 4096)); INSERT INTO t1 VALUES (1002, REPEAT('a', 4096)); INSERT INTO t1 VALUES (1, REPEAT('a', 4096)); INSERT INTO t1 VALUES (2, REPEAT('a', 4096)); # Each insert/record down from 999 will end up on its on page. INSERT INTO t1 VALUES (999, REPEAT('a', 4096)); INSERT INTO t1 VALUES (998, REPEAT('a', 4096));
[22 Apr 2013 13:46]
MySQL Verification Team
Davi, Thanks for your contribution.
[28 May 2014 9:51]
Yasufumi Kinoshita
Basically, InnoDB always attempts to insert record to the end of page rather than the first of the next page, if the insert record is between the pages. I think the better solution is... "If the insert point is the last of the page, try to insert to the next page before split the page." The changing splitting strategy affects to the other cases and might cause bad effect. But this fix doesn't affect to the other cases.
[23 Jun 2014 14:41]
Daniel Price
Fixed as of the upcoming 5.6.20, 5.7.5 release, and here's the changelog entry: Inserting a record into an "InnoDB" table with a key that falls between the maximum key of a full page and the minimum key of the "next" page could result in unnecessary page splits and under-filled pages. If the insert point is at the end of a page, "InnoDB" now attempts to insert to the next page before splitting the page. Thank you for the bug report.
[2 Jul 2014 11:24]
Daniel Price
The fix for "Bug #67718 InnoDB drastically under-fills pages in certain conditions" has been reverted due to a recently encountered regression. The fix does not appear in 5.6.20, as previously documented.
[2 Jul 2014 16:41]
Jeremy Cole
Ouch. Thanks for the try anyway. Looking forward to seeing the patch in the future. :)
[14 Jul 2014 12:23]
Daniel Price
The issue with the previous patch has been addressed. Fixed as of the upcoming 5.6.21, 5.7.5 release, and here's the changelog entry: Inserting a record into an "InnoDB" table with a key that falls between the maximum key of a full page and the minimum key of the "next" page could result in unnecessary page splits and under-filled pages. If the insert point is at the end of a page, "InnoDB" now attempts to insert to the next page before splitting the page. Thank you for the bug report.
[7 Aug 2014 6:46]
Laurynas Biveinis
$ bzr log -n0 -r 6008 ------------------------------------------------------------ revno: 6008 committer: Yasufumi Kinoshita <yasufumi.kinoshita@oracle.com> branch nick: mysql-5.6 timestamp: Fri 2014-06-20 08:48:48 +0200 message: Bug#15923864 : Bug#67718 : INNODB DRASTICALLY UNDER-FILLS PAGES IN CERTAIN CONDITIONS InnoDB should try to insert to the next page before split the page, if the insert record for split_and_insert is last of the page. Otherwise, the (reverse) sequential inserts to the middle of the index might cause 1 record per 1 page. Approved by Marko in rb#5563
[7 Aug 2014 6:58]
Laurynas Biveinis
$ bzr log -n0 -r 6050 ------------------------------------------------------------ revno: 6050 committer: Marko M?kel? <marko.makela@oracle.com> branch nick: mysql-5.6.20-release timestamp: Tue 2014-07-01 11:41:45 +0300 message: Revert Bug#15923864 fix (revno 6008) from MySQL 5.6.20.
[24 Sep 2014 17:15]
Daniel Price
bzr log -n0 -r 6073 ------------------------------------------------------------ revno: 6073 committer: Yasufumi Kinoshita <yasufumi.kinoshita@oracle.com> branch nick: mysql-5.6 timestamp: Mon 2014-07-14 15:06:43 +0900 message: This is the second attempt to fix the bug Bug#15923864 : Bug#67718 : INNODB DRASTICALLY UNDER-FILLS PAGES IN CERTAIN CONDITIONS InnoDB should try to insert to the next page before split the page, if the insert record for split_and_insert is last of the page. Otherwise, the (reverse) sequential inserts to the middle of the index might cause 1 record per 1 page. ------- The previous attempt had the bug and fixed already at this push. Bug#19045042 : ASSERT MACH_READ_FROM_8()==INDEX->ID RECV_RECOVERY_IS_ON() IS_INSIDE_IBUF() The new function btr_insert_into_right_sibling() misses to treat ibuf bitmap for the page which the record inserted.