MySQL Bugs: #23710: crash_commit_before fails if innodb_file_per

Bug #23710	crash_commit_before fails if innodb_file_per_table=1
Submitted:	27 Oct 2006 8:54	Modified:	18 Jun 2010 23:08
Reporter:	Marko Mäkelä	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S2 (Serious)
Version:	5.1-bk	OS:	Linux (Linux)
Assigned to:	Inaam Rana	CPU Architecture:	Any
Tags:	innodb crash corruption

Description:
All other tests in the 5.1 test suite pass if innodb_file_per_table is forced on, but crash_commit_before fails if it is not the first test started by ./mysql-test-run.pl.

How to repeat:
cd mysql-test
echo '--skip-stack-trace --skip-core-file --innodb_file_per_table=1'> t/crash_commit_before-master.opt
./mysql-test-run.pl count_distinct crash_commit_before

Verified just as described, with ChangeSet@1.2320, 2006-10-20 11:15:54+02:00 on Linux.

The test fails with this:

mysqltest: At line 26: query 'SHOW CREATE TABLE t1' failed: 2013: Lost connection to MySQL server during query

In mysql-test/var/log/master.err, I see this:

Version: '5.1.13-beta-debug-log'  socket: '/home/msmakela/innobase/dev/mysql-5.1-build/mysql-test/var/tmp/master.sock'  port: 9306  Source distribution
061027 14:53:16 [Note] SCHEDULER: Loaded 0 events
061027 14:53:16  InnoDB: Assertion failure in thread 3038460848 in file btr0btr.c line 136
InnoDB: Failing assertion: (ibool)!!page_is_comp(root) == dict_table_is_comp(index->table)

This looks like an InnoDB bug. The row_format=compact flag in SYS_TABLES disagrees with the format of the B-tree.

The assertion in btr_root_get() fails, because the page will be filled with zeroes.

This looks like a race condition to me. If I load the SQL statements from crash_commit_before.test, everything works as it is supposed to. If I load count_distinct.test and then crash_commit_before.test in the same mysqld instance, the assertion will fail after crash recovery when table t1 is accessed. Quoting count_distinct.test:

CREATE TABLE t1(a int) engine=innodb;
START TRANSACTION;
insert into t1 values(9);
SET SESSION debug="d,crash_commit_before";
COMMIT; -- this is where the intentional crash occurs
SHOW CREATE TABLE t1;
SELECT * FROM t1;

I'm assigning this to Heikki, because he implemented innodb_file_per_table=1 and knows it best.

The reason probably is the long-standing bug in InnoDB: it does not make a checkpoint until after a few seconds of mysqld uptime. If mysqld crashes very quickly after the startup, InnoDB does not apply the redo log, which causes the database to be corrupt.

A probable fix: force a checkpoint immediately in the startup.

(gdb) run
Starting program: /home/heikki/mysql-5.1/sql/mysqld
[Thread debugging using libthread_db enabled]
[New Thread 1075701536 (LWP 6774)]
061124 12:18:07 [Warning] Changed limits: max_open_files: 1024  max_connections: 886  table_cache: 64
[New Thread 1129331632 (LWP 6775)]
[New Thread 1137720240 (LWP 6776)]
[New Thread 1146108848 (LWP 6777)]
[New Thread 1154497456 (LWP 6778)]
InnoDB: Warning: we did not need to do crash recovery, but log scan
InnoDB: progressed past the checkpoint lsn 0 794564253 up to lsn 0 794567249
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 1 row operations to undo
InnoDB: Trx id counter is 0 176640
[New Thread 1167666096 (LWP 6779)]
[New Thread 1176054704 (LWP 6780)]
[New Thread 1184443312 (LWP 6781)]
[New Thread 1192831920 (LWP 6782)]
061124 12:18:07  InnoDB: Started; log sequence number 0 794564253
/home/heikki/mysql-5.1/sql/mysqld: Table 'mysql.plugin' doesn't exist
061124 12:18:07 [ERROR] Can't open the mysql.plugin table. Please run the mysql_upgrade script to create it.
061124 12:18:07 [Note] Recovering after a crash using 1
061124 12:18:07 [Note] Starting crash recovery...
061124 12:18:07 [Note] Crash recovery finished.
InnoDB: Starting in background the rollback of uncommitted transactions
061124 12:18:07  InnoDB: Rolling back trx with id 0 176129, 1 rows to undo
InnoDB: Page directory corruption: infimum not pointed to
061124 12:18:07  InnoDB: Page dump in ascii and hex (16384 bytes):
 len 16384; hex 0000000000[New Thread 1201421232 (LWP 6783)]
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

Hmm... InnoDB does make a checkpoint right in the startup. But the problem is that the lsn in the checkpoint is the same as in the data file header. In the next startup, InnoDB sees the lsn the same in the checkpoint and in the data file header, which makes InnoDB to think that there was no crash.

Fix: do a little modifying mtr in the startup and AFTER that make a new checkpoint.

Hmm... a problem is that we need to FLUSH a few buf pool pages before the lsn in the checkpoint is advanced. Maybe better to fix the bug by decrementing the lsn stamp in ibdata1 in a startup?

Bug #28283 was marked as a duplicate of this.

Assigning this to Inaam. This bug has fundamental importance, because if InnoDB crashes very soon after a startup, its crash recovery can fail.

A fix might be to make sure that the log sequence number and the checkpoint are immediately increased when InnoDB starts. A way is to modify something in the data files and ask InnoDB to flush a few pages from the buffer pool and do a checkpoint immediately after a startup. But is there a more elegant solution?

--Heikki

http://bugs.mysql.com/bug.php?id=29221 may be another duplicate of this.

Inaam has a patch for this.

Queued to 5.1-maint

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31647

ChangeSet@1.2555, 2007-07-24 19:34:31-06:00, tsmith@ramayana.hindu.god +25 -0
  Apply snapshot innodb-51-ss1644
  
  Fixes:
  - Bug #23710: crash_commit_before fails if innodb_file_per_table=1
  - Bug #29097: fsp_get_available_space_in_free_extents() is capped at 4TB
  - Bug #29155: Innodb "Parallel recovery" is not prevented

Pushed into 5.1.21-beta

Noted in 5.1.21 changelog.

If MySQL/InnoDB crashed very quickly after starting up, it would
not force a checkpoint.  In this case, InnoDB would skip crash
recovery at next startup, and the database would become corrupt.
Fix: If the redo log scan at InnoDB startup goes past the last
checkpoint, force crash recovery.

Queued to 5.0-maint

Pushed into 5.0.48

Pushed into 5.1.22-beta

Noted in 5.0.48 changelog.

Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.

Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)

Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.

Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)