Bug #77214 | InnoDB data lost if fast_shutdown=1 and trx_commit=0 | ||
---|---|---|---|
Submitted: | 1 Jun 2015 18:42 | Modified: | 26 Apr 2017 14:45 |
Reporter: | Rene' Cannao' | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
Version: | 5.6 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | innodb |
[1 Jun 2015 18:42]
Rene' Cannao'
[1 Jun 2015 18:48]
Rene' Cannao'
This bug is present in MySQL 5.6 , and probably on older versions too.
[3 Jun 2015 10:41]
Rene' Cannao'
Editing the title from trx_commit!=1 to trx_commit=0
[3 Jun 2015 10:46]
Rene' Cannao'
After more digging into the source code, only a combination of innodb_fast_shutdown=1 and innodb_flush_log_at_trx_commit=0 triggers data lost during a normal graceful shutdown. innodb_fast_shutdown=1 and innodb_flush_log_at_trx_commit=2 seems to be safe, because the content of the redo buffer is copied into the redo log file in trx_flush_log_if_needed() if innodb_flush_log_at_trx_commit=2 .
[3 Jun 2015 17:07]
Justin Swanhart
innodb_fast_shutdown does a sharp checkpoint. It flushes all dirty buffers, but does not flush the change buffer (it persists in tablespace 0). Because of the sharp checkpoint, there can't be anything in the log buffer, thus it need not be copied. If innodb_fast_shutdown=1 + innodb_flush_log_at_trx_commit=0 caused data loss, vast numbers of slaves would be out of sync with masters constantly, because they are frequently restarted, and they frequently have innodb_flush_log_at_trx_commit=0. So I do not think this is a bug.
[3 Jun 2015 17:09]
Justin Swanhart
I meant innodb_fast_shutdown=1 does sharp checkpoint. innodb_fast_shutdown=2 does not flush dirty buffers and recovery is needed, but nothing committed at that point is lost, because the log buffer is flushed. The log simply needs to replay over the pages changed since the last checkpoint, and if the LSN on the page has already been updated, it won't update the page again.
[3 Jun 2015 17:45]
MySQL Verification Team
trx_commit=0 is so designed that you can loose last transaction !!! This is in the manual !!!
[3 Jun 2015 18:25]
Riccardo Pizzi
Sinisa. the manual says that you can lose some transactions *in case of mysql crash*, we are talking about a clean shutdown here...
[3 Jun 2015 22:39]
Rene' Cannao'
@Justin, this is my understanding: In a sharp checkpoint, all the pages are flushed to disk for that specific checkpoint LSN . During shutdown, from what I see, all the flushing is done by buf_flush_page_cleaner_thread(), while the checkpoint itself is done by the background master thread in srv_master_do_shutdown_tasks() calling log_checkpoint() . log_checkpoint() calls log_buf_pool_get_oldest_modification() to determine the lsn to checkpoint, and log_buf_pool_get_oldest_modification() returns the oldest modified block lsn in the buffer pool, or log_sys->lsn if none exists. As checkpoint and flushing is done in two different threads, race conditions may happens. To be more specific, I believe it is possible that while buf_flush_page_cleaner_thread() is still flushing pages (so there are still dirty pages in buffer pool), the background master thread exists without making a checkpoint up to log_sys->lsn . In fact, reading the comments in buf_flush_page_cleaner_thread(), it is expected that the master and the purge threads exit while buf_flush_page_cleaner_thread() it is still running. I believe that, if the master thread exits before the buffer pool has been flushed completely, log_checkpoint() shouldn't be enough to ensure that the latest transactions were copied from redo buffer into redo log. I still believe that srv_sync_log_buffer_in_background() should be called even if innodb_fast_shutdown=1 . @Sinisa: I assume a clean shutdown never causes data lost, no matter the value of innodb_flush_log_at_trx_commit .
[4 Jun 2015 4:42]
Inaam Rana
InnoDB follows WAL (write ahead logging) which implies that no dirty page can ever be written to disk before the relevant redo entries have made it to the disk. This is true whatever innodb_flush_log_at_trx_commit setting. There is a call in buf_flush_write_block_low() to log_write_up_to() which ensures that. Given above, when page_cleaner goes through the entire buffer pool writing down all pages during shutdown it is guaranteed that redo log will get written to the disk. The final checkpoint is not made in the master thread. It is made in the thread triggering the shutdown after the page_cleaner has exited. IIRC after the checkpoint is made and flushed to disk the checkpoint lsn is stamped on the system tablespace header page as well.
[26 Apr 2017 14:45]
MySQL Verification Team
My dear Rene, As you well know we need a fully repeatable case, in order to be able to verify the bug. Also, this bug is not a duplicate of #85585.