| Bug #70899 | unnecessary buf_flush_list() during recovery | ||
|---|---|---|---|
| Submitted: | 13 Nov 2013 16:39 | Modified: | 7 May 2014 19:24 |
| Reporter: | Inaam Rana (OCA) | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
| Version: | 5.6.14 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[13 Nov 2013 17:49]
Inaam Rana
I should clarify that I believe recv_needed_recovery check is safe because in trx_sys_create_rsegs() we create rsegs iff recv_needed_recovery is not set.
[14 Nov 2013 18:10]
Sveta Smirnova
Thank you for the report. Verified as described using code analysis.
[7 May 2014 19:24]
Daniel Price
Fixed as of 5.6.20, 5.7.5 and here's the changelog entry: The fix for Bug#16418661 added superfluous "buf_flush_list()" logic to "InnoDB" startup code. Thank you for the bug report.
[6 Aug 2014 17:14]
Laurynas Biveinis
$ bzr log -n0 -r 5922
------------------------------------------------------------
revno: 5922
committer: Satya Bodapati <satya.bodapati@oracle.com>
branch nick: mysql-5.6
timestamp: Wed 2014-05-07 19:36:11 +0530
message:
BUG#17798076 - BUG#16418661 CLEANUP: REMOVE UNNECESSARY BUF_FLUSH_LIST()
DURING RECOVERY
Why and when the extra buf_flush_list() was added?
--------------------------------------------------
The extra buf_flush_list() was added as part of rb#2293.
The testcase for rb#2293 failed randomly reporting inconsistent number
of undo tablespaces. MTR didn't provide a way to use custom
innodb_data_file_path, innodb_undo_tablespaces. See Bug#17059436 So
the workaround is to use "--exec $MYSQLD --args my.cnf" on a empty data
directory. This will create necessary files and aborts after that.
The extra flush made sure that correct the number of undo tablespaces
reported. This is not correct because the changes are already covered by redo.
So the the ideal solution would be have been to run "--exec $MYSQLD on
the datadir" again to apply redo or allow the bootstrapping to finish
properly and exit.
Fix:
----
Remove uncessary flush. Fix the inconsistent number of undo tablespaces
issue by passing '--skip-grant-tables --innodb-unkown-parameter' to $MYSQLD.
This would allow the bootstrapping to finish and properly exit.
Approved by Marko, Kevin. rb#5234, rb#5320

Description: In InnoDB startup code innobase_start_or_create_for_mysql() after creating rsegs we flush the entire buffer pool. The intent is to force trx_sys page to the disk. We can reach this code path after doing recovery. In this case we can potentially have millions of dirty pages in the buffer pool. This can seriously increase the recovery time. 2554 srv_available_undo_logs = trx_sys_create_rsegs( 2555 srv_undo_tablespaces, srv_undo_logs); 2556 2557 if (srv_available_undo_logs == ULINT_UNDEFINED) { 2558 /* Can only happen if force recovery is set. */ 2559 ut_a(srv_force_recovery >= SRV_FORCE_NO_TRX_UNDO 2560 || srv_read_only_mode); 2561 srv_undo_logs = ULONG_UNDEFINED; 2562 } 2563 2564 /* Flush the changes made to TRX_SYS_PAGE by trx_sys_create_rsegs()*/ 2565 if (!srv_force_recovery && !srv_read_only_mode) { 2566 bool success = buf_flush_list(ULINT_MAX, LSN_MAX, NULL); 2567 ut_a(success); 2568 buf_flush_wait_batch_end(NULL, BUF_FLUSH_LIST); 2569 } 2570 How to repeat: Crash the server with large enough buffer pool to have millions of dirty pages. Allow recovery to proceed. Try to connect to the server after redo scan and redo apply phase are finished. The server will be inaccessible and a lot of flushing will be happening. In my case I had a buffer pool size of 64G, redo logs 4G and around 2.3 million dirty pages when I crashed the server. During recovery: redo scan: nearly 2 minutes redo apply: nearly 2 minutes server stuck in flushing: nearly 11 minutes and the stack during flushing looked like: (gdb) where #0 0x00000000009cde99 in ut_fold_binary (page=<value optimized out>) at /home/irana/workspace/mysql-5.6/storage/innobase/include/ut0rnd.ic:208 #1 buf_calc_page_new_checksum (page=<value optimized out>) at /home/irana/workspace/mysql-5.6/storage/innobase/buf/buf0checksum.cc:99 #2 0x00000000009cf4d3 in buf_flush_init_for_writing (page=0x7f1254794000 "\377r<\006", page_zip_=0x0, newest_lsn=382565654961) at /home/irana/workspace/mysql-5.6/storage/innobase/buf/buf0flu.cc:790 #3 0x00000000009d0451 in buf_flush_write_block_low (buf_pool=0x242f798, bpage=0x7f122ca6d1d0, flush_type=BUF_FLUSH_LIST, sync=false) at /home/irana/workspace/mysql-5.6/storage/innobase/buf/buf0flu.cc:908 #4 buf_flush_page (buf_pool=0x242f798, bpage=0x7f122ca6d1d0, flush_type=BUF_FLUSH_LIST, sync=false) at /home/irana/workspace/mysql-5.6/storage/innobase/buf/buf0flu.cc:1060 #5 0x00000000009d10b6 in buf_flush_try_neighbors (bpage=<value optimized out>, flush_type=BUF_FLUSH_LIST, n_to_flush=18446744073709551614, count=0x7fff9ea9ba50) at /home/irana/workspace/mysql-5.6/storage/innobase/buf/buf0flu.cc:1271 #6 buf_flush_page_and_try_neighbors (bpage=<value optimized out>, flush_type=BUF_FLUSH_LIST, n_to_flush=18446744073709551614, count=0x7fff9ea9ba50) at /home/irana/workspace/mysql-5.6/storage/innobase/buf/buf0flu.cc:1349 #7 0x00000000009d15be in buf_do_flush_list_batch (buf_pool=0x242f798, flush_type=<value optimized out>, min_n=18446744073709551614, lsn_limit=18446744073709551615) at /home/irana/workspace/mysql-5.6/storage/innobase/buf/buf0flu.cc:1596 #8 buf_flush_batch (buf_pool=0x242f798, flush_type=<value optimized out>, min_n=18446744073709551614, lsn_limit=18446744073709551615) at /home/irana/workspace/mysql-5.6/storage/innobase/buf/buf0flu.cc:1660 #9 0x00000000009d1f1e in buf_flush_list (min_n=18446744073709551614, lsn_limit=18446744073709551615, n_processed=0x0) at /home/irana/workspace/mysql-5.6/storage/innobase/buf/buf0flu.cc:1897 #10 0x000000000097debd in innobase_start_or_create_for_mysql () at /home/irana/workspace/mysql-5.6/storage/innobase/srv/srv0start.cc:2566 #11 0x00000000008d6a92 in innobase_init (p=<value optimized out>) at /home/irana/workspace/mysql-5.6/storage/innobase/handler/ha_innodb.cc:3375 #12 0x000000000054fca8 in ha_initialize_handlerton (plugin=0x23ca680) at /home/irana/workspace/mysql-5.6/sql/handler.cc:663 #13 0x00000000006b12eb in plugin_initialize (plugin=0x23ca680) at /home/irana/workspace/mysql-5.6/sql/sql_plugin.cc:1126 #14 0x00000000006b5636 in plugin_init (argc=0x126ecec, argv=0x236d718, flags=<value optimized out>) at /home/irana/workspace/mysql-5.6/sql/sql_plugin.cc:1417 #15 0x00000000005480d8 in init_server_components () at /home/irana/workspace/mysql-5.6/sql/mysqld.cc:4849 #16 0x000000000054a86f in mysqld_main (argc=47, argv=0x236d718) at /home/irana/workspace/mysql-5.6/sql/mysqld.cc:5449 #17 0x00007f1cc1219994 in __libc_start_main () from /lib64/libc.so.6 #18 0x0000000000540ce9 in _start () Suggested fix: I think the fix is trivial. Add check for recv_needed_recovery to the if condition. Also probably a good idea to move this whole buf_flush_list() logic inside trx_sys_create_rsegs(). Makes for a more error proof code.