Description:
Logging on startup after a failover took a long time without saying anything.
140224 18:05:02 mysqld_safe Starting mysqld daemon with databases from /path/to/datadir
2014-02-24 18:05:03 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2014-02-24 18:05:03 27734 [Note] Plugin 'FEDERATED' is disabled.
2014-02-24 18:05:03 27734 [Note] InnoDB: The InnoDB memory heap is disabled
2014-02-24 18:05:03 27734 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2014-02-24 18:05:03 27734 [Note] InnoDB: Compressed tables use zlib 1.2.3
2014-02-24 18:05:03 27734 [Note] InnoDB: Using Linux native AIO
2014-02-24 18:05:03 27734 [Note] InnoDB: Using CPU crc32 instructions
2014-02-24 18:05:03 27734 [Note] InnoDB: Initializing buffer pool, size = 120.0G
2014-02-24 18:05:10 27734 [Note] InnoDB: Completed initialization of buffer pool
2014-02-24 18:05:11 27734 [Note] InnoDB: Highest supported file format is Barracuda.
2014-02-24 18:05:12 27734 [Note] InnoDB: The log sequence numbers 42094814939579 and 42094814939579 in ibdata files do not match the log sequence number 52468941157960 in the ib_logfiles!
2014-02-24 18:05:12 27734 [Note] InnoDB: Database was not shutdown normally!
2014-02-24 18:05:12 27734 [Note] InnoDB: Starting crash recovery.
2014-02-24 18:05:12 27734 [Note] InnoDB: Reading tablespace information from the .ibd files... <<=======
2014-02-24 19:10:07 27734 [Note] InnoDB: Restoring possible half-written data pages <<=======
2014-02-24 19:10:07 27734 [Note] InnoDB: from the doublewrite buffer...
2014-02-24 19:10:57 27734 [Note] InnoDB: 128 rollback segment(s) are active.
2014-02-24 19:10:57 27734 [Note] InnoDB: Waiting for purge to start
2014-02-24 19:10:57 27734 [Note] InnoDB: 5.6.15 started; log sequence number 52468941157960
2014-02-24 19:10:57 7f6da358e700 InnoDB: Loading buffer pool(s) from .//ib_buffer_pool
2014-02-24 19:10:57 27734 [Note] Server hostname (bind-address): '0.0.0.0'; port: 3306
2014-02-24 19:10:57 27734 [Note] - '0.0.0.0' resolves to '0.0.0.0';
2014-02-24 19:10:57 27734 [Note] Server socket created on IP: '0.0.0.0'.
2014-02-24 19:10:58 27734 [Note] Event Scheduler: Loaded 0 events
2014-02-24 19:10:58 27734 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.15-log' socket: 'mysql.sock' port: 3306 MySQL Community Server (GPL)
2014-02-24 19:10:58 27734 [Note] Event Scheduler: scheduler thread started with id 1
2014-02-24 19:29:07 7f6da358e700 InnoDB: Buffer pool(s) load completed at 140224 19:29:07
Notice the 1 hour 5 minute delay while the recovery is taking place that's completely silent.
When I first saw this I thought mysqld was "stuck" and killed it only to try again. It was only afterwards I realised it was actually doing something. Note: CPU and I/O was not 100% so it's not clear to me exactly what mysqld was doing. The server has several cores yet this process seems to be single threaded.
How to repeat:
Note: this server had a large number of databases and tables:
[root@myserver ~]# cd $datadir
[root@myserver data]# find . -type d | wc -l
10779
[root@ data]# find . -name "*.ibd" | wc -l
167647
[root@ data]#
The total instance size is 33 TB.
The issue was caused by a storage problem which required mysqld to be killed and the filesystem remounted. While this is not MySQL's fault under other recovery scenarios recovery is quite verbose, so it's clear that something is happening. In this case there was no visibility to the process.
Suggested fix:
(1) Make the logging more verbose. I suggest that if nothing has been logged in 10 seconds then indicate the current state of progress, and if known how much more work needs to be done.
(2) If this process can be made to run faster that would also be good as 1 hour and 5 minutes to scan the InnoDB tables seems like a long period. Had the server been busy with large transactions that would have invoked an even longer recovery time. In this case that did not happen.
As databases and storage use grows it's likely that the number of files or databases may increase too.