MySQL Bugs: #65443: Failure when applying insert buffer changes during crash recovery

Bug #65443	Failure when applying insert buffer changes during crash recovery
Submitted:	28 May 2012 21:18	Modified:	18 Sep 2012 10:07
Reporter:	Nizameddin Ordulu	Email Updates:
Status:	Duplicate	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S1 (Critical)
Version:	5.6	OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	crash recovery, debug assertion, insert buffer

Description:
One of our stress tests for mysql 5.6 that test the crash recovery fails giving the following stack trace:

120527 8:41:52 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 391725843
InnoDB: Transaction 273144 was in the XA prepared state.
InnoDB: 18 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 49 row operations to undo
InnoDB: Trx id counter is 273664
120527 8:41:52 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 120527 8:41:52 InnoDB: Assertion failure in thread 140022514702080 in file ibuf0ibuf.cc line 3833
InnoDB: Failing assertion: rec_get_deleted_flag(rec, page_is_comp(page))
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
05:41:52 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=0
max_threads=151
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 61057 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/data/users/jenkins/workspace/mysql-5.6-nightly/BUILD_TYPE/Debug/TEST_SET/StressLong/build-5.6/sql/mysqld(my_print_stacktrace+0x2e)[0x917eee]
/data/users/jenkins/workspace/mysql-5.6-nightly/BUILD_TYPE/Debug/TEST_SET/StressLong/build-5.6/sql/mysqld(handle_fatal_signal+0x484)[0x686c34]
/usr/local/fbcode/gcc-4.6.2-glibc-2.13/lib/libpthread.so.0(+0x105a0)[0x7f5993c8d5a0]
/usr/local/fbcode/gcc-4.6.2-glibc-2.13/lib/libc.so.6(gsignal+0x3e)[0x7f5992b1764e]
/usr/local/fbcode/gcc-4.6.2-glibc-2.13/lib/libc.so.6(abort+0x17b)[0x7f5992b1901b]
/data/users/jenkins/workspace/mysql-5.6-nightly/BUILD_TYPE/Debug/TEST_SET/StressLong/build-5.6/sql/mysqld[0x9d8d9b]
/data/users/jenkins/workspace/mysql-5.6-nightly/BUILD_TYPE/Debug/TEST_SET/StressLong/build-5.6/sql/mysqld[0xb14178]
/data/users/jenkins/workspace/mysql-5.6-nightly/BUILD_TYPE/Debug/TEST_SET/StressLong/build-5.6/sql/mysqld[0xb77be8]
/data/users/jenkins/workspace/mysql-5.6-nightly/BUILD_TYPE/Debug/TEST_SET/StressLong/build-5.6/sql/mysqld[0xaa3a08]
/usr/local/fbcode/gcc-4.6.2-glibc-2.13/lib/libpthread.so.0(+0x7f85)[0x7f5993c84f85]
/usr/local/fbcode/gcc-4.6.2-glibc-2.13/lib/libc.so.6(clone+0x6d)[0x7f5992bc563d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file

The assertion that fails is at ibuf0ibuf.cc:3833. This is a debug-only assert that did not exist in earlier versions of 5.1. (Although it seems to exist on the latest branch on 5.1)

Can you comment on what this assertion is supposed to do and why it's debug-only?

How to repeat:
working on it.

This is likely a duplicate of Bug #58571. Check last comments from Marko there, they may give some hints.

For 5.1, the assert in ibuf_insert_to_index_page is new with 

3351.14.276 marko.m | 
                    |           /* This is based on
                    |           row_ins_sec_index_entry_by_modify(BTR_MODIFY_LEAF). */
                    |           ut_ad(rec_get_deleted_flag(rec, page_is_comp(page)));
                    | 

And that comes from

revno: 3513 [merge]
committer: Georgi Kodinov <Georgi.Kodinov@Oracle.com>
branch nick: merge-5.1-security
timestamp: Fri 2010-10-29 14:02:49 +0300
message:
  merge to 5.1-security
    ------------------------------------------------------------
    revno: 3452.1.96 [merge]
    committer: Vasil Dimov <vasil.dimov@oracle.com>
    branch nick: mysql-5.1-bugteam
    timestamp: Fri 2010-10-29 10:45:11 +0300
    message:
      Merge mysql-5.1-innodb -> mysql-5.1-bugteam
        ------------------------------------------------------------
        revno: 3351.14.276
        committer: Marko Mäkelä <marko.makela@oracle.com>
        branch nick: 5.1-innodb
        timestamp: Tue 2010-10-19 09:04:15 +0300
        message:
          Bug #56680 wrong InnoDB results from a case-insensitive covering index
          
          row_search_for_mysql(): When a secondary index record might not be
          visible in the current transaction's read view and we consult the
          clustered index and optionally some undo log records, return the
          relevant columns of the clustered index record to MySQL instead of the
          secondary index record.
          
          ibuf_insert_to_index_page_low(): New function, refactored from
          ibuf_insert_to_index_page().
          
          ibuf_insert_to_index_page(): When we are inserting a record in place
          of a delete-marked record and some fields of the record differ, update
          that record just like row_ins_sec_index_entry_by_modify() would do.
          
          btr_cur_update_alloc_zip(): Make the function public.
          
          mysql_row_templ_t: Add clust_rec_field_no.
          
          row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
          flag rec_clust, for returning data at clust_rec_field_no instead of
          rec_field_no. Resurrect the debug assertion that the record not be
          marked for deletion. (Bug #55626)
          
          [UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
          buf_flush_page_try():
          Implement innodb_change_buffering_debug=1 for evicting pages from the
          buffer pool, so that change buffering will be attempted more
          frequently.

This looks like a logical duplicate of
Bug#58571 ut_ad assert rec_get_deleted_flag(rec, page_is_comp(page))

So far, we have failed to reproduce this with a simple test case.

Reproduction could become easier with 5.6.6, where we sometimes can trigger this bug in row0log.cc, in online ALTER TABLE that involves rebuilding the table.

This seems to be a duplicate of Bug#66819. My test case for this bug is not failing with the patch applied, and I was able to trigger this very assertion with instrumented code (fault injection).