MySQL Bugs: #61972: InnoDB assertion failure in page_is_comp(next

Bug #61972	InnoDB assertion failure in page_is_comp(next_page) in btr0pcur.c line 423
Submitted:	25 Jul 2011 19:15	Modified:	27 Oct 2011 17:44
Reporter:	Jeff Bilyk	Email Updates:
Status:	Duplicate	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S2 (Serious)
Version:	5.5.11, 5.5.14, 5.5.15	OS:	Linux (Alpine Linux 2.2.2)
Assigned to:		CPU Architecture:	Any

Description:
When suffering from this bug, InnoDB will crash with the log below.  This issue increases in frequency as time goes on, eventually crashing InnoDB every 20 seconds or so.  From what I can tell, a corrupted row in one table takes down the server.  Repair table isn't able to fix the damage.  The only way of fixing it as it stands is to stop mysql, remove all contents of /var/lib/mysql/, setup the server again, restore from backup, then replay the binlogs to get back to where we were (this doesn't seem to recreate the corruption).  While the issue is taking place, we're not able to backup the database using mysqldump, since it'll crash in the corrupted row.  Running mysqld with the innodb_force_recovery option set to either 4 or 6 does not help to allow a mysqldump.

The platform that this is running on:
AMD Athlon 4450B
3GB Ram
Alpine Linux 2.2.2 (/ is mounted as tmpfs, but /var/lib/mysql is mounted on an ext3 formatted md0 raid 1 array)
Linux Kernel: 2.6.38.2-grsec

The my.cnf for the server:
[client]
port            = 3306
socket          = /var/run/mysqld/mysqld.sock
[mysqld]
port            = 3306
socket          = /var/run/mysqld/mysqld.sock
skip-external-locking
key_buffer_size = 16M
max_allowed_packet = 1M
table_open_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
auto_increment_increment = 2
auto_increment_offset = 1
skip-slave-start
log-bin=mysql-bin
binlog_format=mixed
server-id       = 1
[mysqldump]
quick
max_allowed_packet = 16M
[mysql]
no-auto-rehash
[myisamchk]
key_buffer_size = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M
[mysqlhotcopy]
interactive-timeout

The section of the .err log from one of the crashes:
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:320, should be 0:1824!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:321, should be 0:1825!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:322, should be 0:1826!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:323, should be 0:1827!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:324, should be 0:1828!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:326, should be 0:1829!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:327, should be 0:1830!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:328, should be 0:1831!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:329, should be 0:1832!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:330, should be 0:1833!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:331, should be 0:1834!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:332, should be 0:1835!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:333, should be 0:1836!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:334, should be 0:1837!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:335, should be 0:1838!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:336, should be 0:1839!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:337, should be 0:1840!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:338, should be 0:1841!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:339, should be 0:1842!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:340, should be 0:1843!
110724 11:47:57  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:349, should be 0:1844!
110724 11:47:57  InnoDB: Assertion failure in thread 2818595744 in file
btr0pcur.c line 423
InnoDB: Failing assertion: page_is_comp(next_page) == page_is_comp(page)
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB:
http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
110724 11:47:57 - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this
binary
or one of the libraries it was linked against is corrupt, improperly
built,
or misconfigured. This error can also be caused by malfunctioning
hardware.
We will try our best to scrape up some info that will hopefully help
diagnose
the problem, but since we have already crashed, something is definitely
wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=262144
max_used_connections=1
max_threads=151
thread_count=1
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads =
133434 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0xb7e8aef0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0xa8005408 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x46)[0xb734d855]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0xb7ef89c0): SELECT /*!40001 SQL_NO_CACHE */ * FROM
`Notification`
Connection ID (thread ID): 2
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html
contains
information that should help you find out what is causing the crash.
110724 11:47:57 mysqld_safe Number of processes running now: 0
110724 11:47:57 mysqld_safe mysqld restarted
110724 11:47:57 InnoDB: The InnoDB memory heap is disabled
110724 11:47:57 InnoDB: Mutexes and rw_locks use GCC atomic builtins
110724 11:47:57 InnoDB: Compressed tables use zlib 1.2.5
110724 11:47:57 InnoDB: Using Linux native AIO
110724 11:47:57 InnoDB: Initializing buffer pool, size = 128.0M
110724 11:47:57 InnoDB: Completed initialization of buffer pool
110724 11:47:57 InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
110724 11:47:57  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Last MySQL binlog file position 0 5875, file name
./mysql-bin.000972
110724 11:47:57  InnoDB: Waiting for the background threads to start
110724 11:47:58 InnoDB: 1.1.8 started; log sequence number 12188408244
110724 11:47:58 [Note] Recovering after a crash using mysql-bin
110724 11:47:58 [Note] Starting crash recovery...
110724 11:47:58 [Note] Crash recovery finished.
110724 11:47:58 [Note] Event Scheduler: Loaded 0 events
110724 11:47:58 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.14-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306
Source distribution

If there is further information that's needed to track down the bug, please let know.

How to repeat:
Unknown at this point, I'm unable to replicate.

Check your hardware (hard drive) for failures.

Short SMART tests on the HDDs (which we run regularly through Nagios) weren't showing anything, but one has some DMA failures on the extended tests, that seems to be the culprit here...  If possible, could the ticket stay open (maybe at a lower prio) for a week or two?  Since it's now happened twice in 2 weeks, that should be long enough to confirm that the hardware was the only issue.

Also, just to check for all possible reasons, let me ask: did you move files when the database was running, or remove ib_logfiles when recovery was necessary?

Valeriy, we did not move files while mysqld was running.  When attempting recovery,we tried the following (but didn't touch ib_logfiles):

- table repair (failed)
- restart mysql with innodb_force_recovery set first to 4 then 6 and then attempt table repair (failed)
- still with innodb_force_recovery on, tried mysqldump to dump tables, views, and stored procs (failed)
- tried mysqldump without innodb_force_recovery on (failed)
- stopped mysql, removed contents of /var/lib/mysql, re-ran mysql setup, then restored DB from a previous mysqldump, then replayed binlogs by dumping mysqlbinlog output to a file, then importing that content to the DB

That last step seemed to do the trick to restore, but then this morning it looks like there is still some corruption since mysqld crashed again.  Could it be that the InnoDB corruption got re-introduced after restoring the mysqldump backup file?

This issue seems to have been resolved by the hard drive replacement.  Thank you all for you quick help.

Reopened this bug since the same issue has re-occured after swapping out all hardware on the box (including both hard drives).  Both hard drive pass a SMART test.  Error from the last crash is as output below.  Please let me know if more information is needed.

110904 17:46:50  InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 0:4636, should be 0:14074!
110904 17:46:50  InnoDB: Assertion failure in thread 2818890656 in file btr0pcur.c line 423
InnoDB: Failing assertion: page_is_comp(next_page) == page_is_comp(page)
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
110904 17:46:50 - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=262144
max_used_connections=3
max_threads=151
thread_count=2
connection_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 133434 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0xb8373598
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0xa804d408 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x46)[0xb73c6529]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0xb81f5d00): DELETE
                        FROM EmailQueue
                        WHERE CreationDateTime <  NAME_CONST('_CutOffDate',_latin1'2010-09-04 17:46:47' COLLATE 'latin1_swedish_ci')
Connection ID (thread ID): 247
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
110904 17:46:50 mysqld_safe Number of processes running now: 0
110904 17:46:50 mysqld_safe mysqld restarted
110904 17:46:50 InnoDB: The InnoDB memory heap is disabled
110904 17:46:50 InnoDB: Mutexes and rw_locks use GCC atomic builtins
110904 17:46:50 InnoDB: Compressed tables use zlib 1.2.5
110904 17:46:50 InnoDB: Using Linux native AIO
110904 17:46:50 InnoDB: Initializing buffer pool, size = 128.0M
110904 17:46:50 InnoDB: Completed initialization of buffer pool
110904 17:46:50 InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 40267028985
110904 17:46:50  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 40268328083
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 9 row operations to undo
InnoDB: Trx id counter is 141C1C00
110904 17:46:50  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 993686007, file name ./mysql-bin.000020
InnoDB: Starting in background the rollback of uncommitted transactions
110904 17:46:51  InnoDB: Rolling back trx with id 141C1AAA, 9 rows to undo
110904 17:46:51  InnoDB: Waiting for the background threads to start

InnoDB: Rolling back of trx id 141C1AAA completed
110904 17:46:51  InnoDB: Rollback of non-prepared transactions completed
110904 17:46:52 InnoDB: 1.1.8 started; log sequence number 40268328083
110904 17:46:52 [Note] Recovering after a crash using mysql-bin
110904 17:47:10 [Note] Starting crash recovery...
110904 17:47:10 [Note] Crash recovery finished.
110904 17:47:10 [Note] Event Scheduler: Loaded 0 events
110904 17:47:10 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.15-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  Source distribution

Ticket #62634 (http://bugs.mysql.com/bug.php?id=62634) has been created that has updated information on specific errors and configuration that are being used, so closing this ticket as a duplicate.