Bug #61736 multiple unnecessary page reads done in latching code
Submitted: 4 Jul 2011 11:34 Modified: 27 Nov 2013 16:29
Reporter: Domas Mituzas Email Updates:
Status: Closed Impact on me:
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.1, ... OS:Any
Assigned to: CPU Architecture:Any

[4 Jul 2011 11:34] Domas Mituzas
the page latching algorithm for btree locks neighbor pages as well, and for that it does preads:

		/* x-latch also brothers from left to right */
		left_page_no = btr_page_get_prev(page, mtr);

		if (left_page_no != FIL_NULL) {
			get_block = btr_block_get(space, zip_size,
						  RW_X_LATCH, mtr);
			get_block->check_index_page_at_flush = TRUE;

		get_block = btr_block_get(space, zip_size, page_no,
					  RW_X_LATCH, mtr);
		get_block->check_index_page_at_flush = TRUE;

		right_page_no = btr_page_get_next(page, mtr);

		if (right_page_no != FIL_NULL) {
			get_block = btr_block_get(space, zip_size,
						  RW_X_LATCH, mtr);

this not only prolongs dictionary lock (Bug#61735), but also causes unnecessary I/O and bloats buffer pool.

How to repeat:
Run intensive write workload on a dataset that doesn't fit in memory. 

Suggested fix:
Don't read in pages.
[5 Jul 2011 6:41] Shane Bester
[13 Nov 2013 20:43] Daniel Price
Noted in 5.7.4 changelog:

"The page latching algorithm for b-trees would lock sibling leaf
pages, prolonging dictionary locks. The bug fix implements
prefectching of sibling leaf pages to reduce index lock holding

Thank you for the bug report.
[13 Nov 2013 20:52] Domas Mituzas
well, the bug also mentions that those reads are unnecessary, as they won't be used for reading data. putting them outside of lock scope does not answer the problem.
[14 Nov 2013 8:07] Yasufumi Kinoshita
http://bugs.mysql.com seems not to be affected by the oracle bug system correctly.

Title was changed to

And my comments not to be affected to mysql.com are...

[30 Jun 2013 20:40] Yasufumi Kinoshita (YAKINOSH)

At first, the reads are necessary. (to keep latching order, to update page
link in the level, to confirm which page should be merged with....)

I think the both side of the page should be prefetched (already async
requested) at the previous failing optimistic operation, because already we
know the pages are needed at the time.

It will help also Bug#12734247

[7 Nov 2013 19:02] Yasufumi Kinoshita (YAKINOSH)

The problem was read IO for sibling pages might be caused while
BTR_MODIFY_TREE having index->lock (X or SX).
At the moment of the previous BTR_MODIFY_LEAF failed, the siblings can be
started to read asynchronously.
This should reduce the index->lock (X or SX) holding time.

This fix is just performance related issue, and fixed only for the newest
branch for now.
If no counter effect will be shown for mysql-5.7, can be backported.
[27 Nov 2013 9:34] Erlend Dahl
Re-closing after Yasufumi's comments.
[27 Nov 2013 15:18] Mark Callaghan
Why is this closed? 

While the implementation might require them, that is inefficient. Lets be green and conserve our disk reads. Can we keep this open as a performance feature request?
[27 Nov 2013 16:29] Domas Mituzas
Indeed, original bug intent is to point out that there're inefficiencies in current model. Making inefficiencies faster does not eliminate them.
[29 Nov 2013 6:10] Yasufumi Kinoshita
Anyway reading is needed for most of the cases of modifying tree structure to update link for the page at the previous page and next page. And whether really needed or not is not able to be judged accurately before actual operation.

And to obey latching order, right direction latch on the level is only allowed not to cause deadlock. And needs to latch the pages at the same time to keep consistency. So, needs both sibling pages latch is needed before starting modifying the page which should cause modify_tree. And also for consistency of the index tree structure, we need to latch the leaf pages before release index->lock or equivalent.

If you reported here about "read io during modify_tree", this reads are needed in the end, but can be requested before modify_tree to avoid IO request serialization. So optimized for improvement latch holding time and scalability.

The latch is necessary, reads also necessary for most of the cases.
(left page update is needed for "delete page" and "insert page left")

So just only description "unnecessary page reads" is wrong or vague.

Please open another specific request with details again, if not satisfied with.

e.g. "modify_tree and update_blob should be separated to be optimized sibling latches/read"
     or "left page read and latch is not needed for right page insert"

[29 Nov 2013 16:18] Mark Callaghan
One example where the reads are not required but still done.