Bug #43043 Crash on BLOB delete operation
Submitted: 20 Feb 2009 5:08 Modified: 20 Jun 2010 17:33
Reporter: Vadim TKACHENKO Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB Plugin storage engine Severity:S1 (Critical)
Version:1.0.2 OS:Any
Assigned to: Marko Mäkelä CPU Architecture:Any

[20 Feb 2009 5:08] Vadim TKACHENKO
Description:
We have stable crash on next set operations (slave workload)

INSERT INTO t1 (ID, LA, BL) VALUES ('{$ID}', '{$Time}', '{$text}') ON DUPLICATE KEY UPDATE
BL = '{$text}', LA = '{$Time}';

SELECT * FROM t1 WHERE ID = '{$ID}';

DELETE FROM t1 WHERE ID = '{$ID}'
SELECT ID FROM t1 WHERE LA < {$Time1} LIMIT 100000;

against table

CREATE TABLE `t1` (
  `ID` varchar(32) NOT NULL DEFAULT '',
  `LA` int(11) NOT NULL DEFAULT '0',
  `BL` text NOT NULL,
  PRIMARY KEY (`ID`),
  KEY `LA` (`LA`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Message in error.log

InnoDB: Failing assertion: fil_page_get_type(page) == FIL_PAGE_TYPE_BLOB
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/forcing-recovery.html
InnoDB: about forcing recovery.

I can't get trace of crashed thread, but thread with SQL is:

Thread 10 (process 7914):
#0 0x00000036e700d7a8 in pread64 () from /lib64/libpthread.so.0
#1 0x0000000000770f4e in os_file_pread (file=39, buf=0x2aaadced0000, n=16384, offset=<value optimized out>, offset_high=0)
    at os/os0file.c:1998
#2 0x00000000007710a4 in os_file_read (file=39, buf=0x2aaadced0000, offset=1238843392, offset_high=0, n=16384)
    at os/os0file.c:2232
#3 0x0000000000741069 in fil_io (type=<value optimized out>, sync=<value optimized out>, space_id=381, zip_size=0,
---Type <return> to continue, or q <return> to quit---
    block_offset=1238843392, byte_offset=0, len=16384, buf=0x2aaadced0000, message=0x2aaad965d960) at fil/fil0fil.c:4447
#4 0x000000000072b870 in buf_read_page_low (err=0x41b40a20, sync=1, mode=132, space=381, zip_size=0, unzip=<value optimized out>,
    tablespace_version=102, offset=75613) at buf/buf0rea.c:143
#5 0x000000000072c2f7 in buf_read_page (space=381, zip_size=0, offset=75613) at buf/buf0rea.c:353
#6 0x0000000000721abe in buf_page_get_gen (space=381, zip_size=0, offset=75613, rw_latch=1, guess=0x0, mode=10,
    file=0x9257b2 "btr/btr0cur.c", line=<value optimized out>, mtr=0x41b41420) at buf/buf0buf.c:1933
#7 0x000000000070cfa3 in btr_cur_search_to_nth_level (index=0x2aaad910dcb8, level=0, tuple=0x2aaad9387e70, mode=2,
    latch_mode=<value optimized out>, cursor=0x2aaad93890b8, has_search_latch=0, mtr=0x41b41420) at btr/btr0cur.c:465
#8 0x00000000007a0812 in row_search_for_mysql (buf=0xccde510 "", mode=2, prebuilt=0x2aaad9387cb8, match_mode=1, direction=0)
    at ../../storage/innobase/include/btr0pcur.ic:544
#9 0x00000000007028f2 in ha_innobase::index_read (this=0xccde320, buf=0xccde510 "", key_ptr=<value optimized out>, key_len=34,
    find_flag=HA_READ_KEY_EXACT) at handler/ha_innodb.cc:4933
#10 0x0000000000664dff in handler::read_range_first (this=0xccde320, start_key=<value optimized out>,
    end_key=<value optimized out>, eq_range_arg=<value optimized out>, sorted=<value optimized out>) at handler.cc:4155
#11 0x0000000000666c83 in handler::read_multi_range_first (this=0xccde320, found_range_p=0x41b419c0, ranges=0x2aaaac047620,
    range_count=1, sorted=<value optimized out>, buffer=<value optimized out>) at handler.cc:4029
#12 0x00000000006519fd in QUICK_RANGE_SELECT::get_next (this=0x2aaaac00c550) at opt_range.cc:8446
#13 0x00000000006606ed in rr_quick (info=0x41b41a90) at records.cc:313
#14 0x000000000061b453 in mysql_delete (thd=0xcd31520, table_list=0xcd37e98, conds=0xcd383d0, order=<value optimized out>,
    limit=18446744073709551615, options=0, reset_auto_increment=false) at sql_delete.cc:278
#15 0x000000000059a6f7 in mysql_execute_command (thd=0xcd31520) at sql_parse.cc:3216
#16 0x000000000059d0fc in mysql_parse (thd=0xcd31520,
    inBuf=0x2aaaac0078cc "DELETE FROM t1 WHERE ID = 'e2e8cbe26c457db940359cfc7b5c873f'", length=73,
    found_semicolon=0x41b42f78) at sql_parse.cc:5809
#17 0x00000000006455db in Query_log_event::do_apply_event (this=0x2aaaac047540, rli=0xcb75920,
    query_arg=0x2aaaac0078cc "DELETE FROM t1 WHERE ID = 'e2e8cbe26c457db940359cfc7b5c873f'", q_len_arg=73)
    at log_event.cc:3011
#18 0x00000000006a3d63 in apply_event_and_update_pos (ev=0x2aaaac047540, thd=0x0, rli=0xcb75920, skip=true) at log_event.h:1034
#19 0x00000000006a892d in handle_slave_sql (arg=0x1) at slave.cc:2130
#20 0x00000036e70062f7 in start_thread () from /lib64/libpthread.so.0
#21 0x00000036e64d1e3d in ?? ()
#22 0x0000000000000000 in ?? ()

How to repeat:
I have no repeatable test yet, but it crashes constantly on slave box.
[20 Feb 2009 5:14] Vadim TKACHENKO
Crashed thread should be similar to this one
http://forums.innodb.com/read.php?3,114,114#msg-114

#3 0x00002b65709f4d47 in raise () from /lib64/tls/libc.so.6
#4 0x00002b65709f5ff8 in abort () from /lib64/tls/libc.so.6
#5 0x000000000070d5ac in btr_free_externally_stored_field (index=0x26e5f970, field_ref=0x2aaab015d702 "", rec=0x0, offsets=0x0, page_zip=0x0, i=0,
do_not_free_inherited=0, local_mtr=0x44006bb0) at btr/btr0cur.c:4217
#6 0x00000000007940b9 in row_purge_step (thr=0x1088b68) at row/row0purge.c:457
#7 0x000000000078072e in que_run_threads (thr=0x1088b68) at que/que0que.c:1257
#8 0x00000000007ad06e in trx_purge () at trx/trx0purge.c:1125
#9 0x00000000007a54b8 in srv_master_thread (arg=<value optimized out>winking smiley at srv/srv0srv.c:2539
#10 0x00002b65700eaf9f in start_thread () from /lib64/tls/libpthread.so.0
[20 Feb 2009 11:15] Sveta Smirnova
Thank you for the report.

I can not repeat described behavior.

Please provide configuration file you used.

How did  you compile InnoDB Plugin? Which version of MySQL do you use?
[20 Feb 2009 14:11] Mikhail Izioumtchenko
Vadim, could you provide us with the following information:

1. complete error log, if possible
2. thread stack from the failing thread would still be nice. Even better,
stacks from all threads, like what gdb shows with 'thr a a bt'. 
3. 'stable crash', does it mean 'happens every time' or 'heppens most of the time'? Presumably you'd have a testcase then, maybe you could mangle the data, if sensitive, and upload it.
4. Are you completely sure it's 100% Plugin 1.0.2 without any patches?
[20 Feb 2009 14:18] Mikhail Izioumtchenko
and, regarding the resemblance to the forum/Google-reported bug,
how did you create the dataset? was it with 1.0.2, or was an upgrade
involved? Are the versions of mysqld at master and slave different?
[20 Feb 2009 16:29] Vadim TKACHENKO
Michael,

I have limit access to system with crash, let me try to answer what I know, and I will work on repeatable test-case to provide more information.

The stable crash means that slave crashes after some period of time when we put regular load on it.
Crash happens with 
- InnoDB plugin 1.0.2
- With InnoDB plugin 1.0.2 with Percona patches

Crash does not happen with 5.1 with standard InnoDB

Data was created in MySQL 5.0 and then for using with MySQL 5.1 / InnoDB plugin was applied mysql_upgrade script.
[20 Feb 2009 17:57] Mikhail Izioumtchenko
Thanks, Vadim. I'm trying to reproduce something here based on the SQL
pattern in your report, no luck so far. A few things that could be
of interest:
- values of innodb parameters especially innodb_buffer_pool_size, innodb_file_format, innodb_file_per table
- average length of data in the blob column in question
- how many rows are there in the table
- how much time does it take on the average for the system to crash
- SHOW TABLE STATUS for the offending table. My prime interest is the 
row format, dynamic or redundant.
[20 Feb 2009 19:28] Marko Mäkelä
Vadim, can you please get the exact stack trace of the crash?  There were a number of BLOB related bugs in the InnoDB Plugin 1.0.0 and 1.0.1 that I fixed in the 1.0.2 release.

Also, can you please tell if the data files were ever accessed with InnoDB Plugin versions 1.0.0 or 1.0.1. If yes, the files may have been corrupted by the buggy BLOB handling code in InnoDB Plugin 1.0.0 or 1.0.1.

Can you repeat this after starting from the scratch, that is, taking pristine data files from MySQL 5.0 or 5.1 without plugin, and installing InnoDB Plugin 1.0.2 or later?
[23 Feb 2009 1:49] Vadim TKACHENKO
We were able to create repeatable case.

It's very simple, but key thing is - table should be created in 5.0 and INSERT should be executed in 5.1.

So repeatable steps:
Start 5.0 ( I used 5.0.77 without any patches )
execute next statement:

DROP TABLE IF EXISTS `t1`;
CREATE TABLE `t1` (
  `ID` varchar(32) NOT NULL default '',
  `LA` int(11) NOT NULL default '0',
  `TX` text NOT NULL,
  PRIMARY KEY  (`ID`),
  KEY `LA` (`LA`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT;

INSERT INTO `t1` VALUES ('12345678800233445566577686798796',1235177491,repeat('a',15889));

shutdown mysql 5.0 and start mysql 5.1 with InnoDB plugin 1.0.2 on the same datadir.

Execute next query:

INSERT INTO t1 (ID, LA, TX) VALUES ('12345678800233445566577686798796', '1235177510', repeat('b',15889)) ON DUPLICATE KEY UPDATE LA='1235177510', TX=repeat('b',15889);

Query executed without problem, but wait 5 - 60 secs and there is crash in purge thread.

Backtrace in crashed thread:

(gdb) bt
#0  0x0000003acc430155 in raise () from /lib64/libc.so.6
#1  0x0000003acc431bf0 in abort () from /lib64/libc.so.6
#2  0x00000000006d4b57 in btr_free_externally_stored_field (index=0x2aaad3ba4cb8, field_ref=0x2aaad41b8555 "", rec=0x0, 
    offsets=0x0, page_zip=0x0, i=0, rb_ctx=RB_NONE, local_mtr=0x46deeb00) at btr/btr0cur.c:4227
#3  0x000000000076176d in row_purge_step (thr=0x2aaad3ba21b0) at row/row0purge.c:457
#4  0x000000000074d6c6 in que_run_threads (thr=0x2aaad3ba21b0) at que/que0que.c:1257
#5  0x000000000077baca in trx_purge () at trx/trx0purge.c:1125
#6  0x0000000000772fe0 in srv_master_thread (arg=<value optimized out>) at srv/srv0srv.c:2375
#7  0x0000003accc062f7 in start_thread () from /lib64/libpthread.so.0
#8  0x0000003acc4d1e3d in clone () from /lib64/libc.so.6

Backtrace of all threads:

(gdb) thread a a bt

Thread 10 (Thread 1099131200 (LWP 25469)):
#0  0x0000003accc0a496 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000000000573237 in one_thread_per_connection_end (thd=<value optimized out>, put_in_cache=true) at mysqld.cc:1834
#2  0x000000000057ad4f in handle_one_connection (arg=0xd316240) at sql_connect.cc:1122
#3  0x0000003accc062f7 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003acc4d1e3d in clone () from /lib64/libc.so.6

Thread 9 (Thread 1115588928 (LWP 25450)):
#0  0x0000003accc0a496 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000073a625 in os_event_wait_low (event=0xd0b4900, reset_sig_count=<value optimized out>) at os/os0sync.c:422
#2  0x00000000007392e7 in os_aio_simulated_handle (global_segment=0, message1=0x427e80f0, message2=0x427e80e8, type=0x427e80e0)
    at os/os0file.c:3933
#3  0x0000000000708173 in fil_aio_wait (segment=0) at fil/fil0fil.c:4500
#4  0x0000000000775348 in io_handler_thread (arg=<value optimized out>) at srv/srv0start.c:433
#5  0x0000003accc062f7 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003acc4d1e3d in clone () from /lib64/libc.so.6

Thread 8 (Thread 1126078784 (LWP 25451)):
#0  0x0000003accc0a496 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000073a625 in os_event_wait_low (event=0xd0b49a0, reset_sig_count=<value optimized out>) at os/os0sync.c:422
#2  0x00000000007392e7 in os_aio_simulated_handle (global_segment=1, message1=0x431e90f0, message2=0x431e90e8, type=0x431e90e0)
    at os/os0file.c:3933
#3  0x0000000000708173 in fil_aio_wait (segment=1) at fil/fil0fil.c:4500
#4  0x0000000000775348 in io_handler_thread (arg=<value optimized out>) at srv/srv0start.c:433
#5  0x0000003accc062f7 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003acc4d1e3d in clone () from /lib64/libc.so.6

Thread 7 (Thread 1136568640 (LWP 25452)):
#0  0x0000003accc0a496 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000073a625 in os_event_wait_low (event=0xd0b4a40, reset_sig_count=<value optimized out>) at os/os0sync.c:422
#2  0x00000000007392e7 in os_aio_simulated_handle (global_segment=2, message1=0x43bea0f0, message2=0x43bea0e8, type=0x43bea0e0)
    at os/os0file.c:3933
#3  0x0000000000708173 in fil_aio_wait (segment=2) at fil/fil0fil.c:4500
#4  0x0000000000775348 in io_handler_thread (arg=<value optimized out>) at srv/srv0start.c:433
#5  0x0000003accc062f7 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003acc4d1e3d in clone () from /lib64/libc.so.6

Thread 6 (Thread 1147058496 (LWP 25453)):
#0  0x0000003accc0a496 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000073a625 in os_event_wait_low (event=0xd0b4ae0, reset_sig_count=<value optimized out>) at os/os0sync.c:422
#2  0x00000000007392e7 in os_aio_simulated_handle (global_segment=3, message1=0x445eb0f0, message2=0x445eb0e8, type=0x445eb0e0)
    at os/os0file.c:3933
#3  0x0000000000708173 in fil_aio_wait (segment=3) at fil/fil0fil.c:4500
#4  0x0000000000775348 in io_handler_thread (arg=<value optimized out>) at srv/srv0start.c:433
#5  0x0000003accc062f7 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003acc4d1e3d in clone () from /lib64/libc.so.6

Thread 5 (Thread 1168038208 (LWP 25455)):
#0  0x0000003acc4cb372 in select () from /lib64/libc.so.6
#1  0x000000000073a97f in os_thread_sleep (tm=<value optimized out>) at os/os0thread.c:272
#2  0x0000000000774275 in srv_lock_timeout_and_monitor_thread (arg=<value optimized out>) at srv/srv0srv.c:1890
#3  0x0000003accc062f7 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003acc4d1e3d in clone () from /lib64/libc.so.6

Thread 4 (Thread 1178528064 (LWP 25456)):
#0  0x0000003acc4cb372 in select () from /lib64/libc.so.6
---Type <return> to continue, or q <return> to quit---
#1  0x000000000073a97f in os_thread_sleep (tm=<value optimized out>) at os/os0thread.c:272
#2  0x0000000000774895 in srv_error_monitor_thread (arg=<value optimized out>) at srv/srv0srv.c:2123
#3  0x0000003accc062f7 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003acc4d1e3d in clone () from /lib64/libc.so.6

Thread 3 (Thread 1189017920 (LWP 25457)):
#0  0x0000003acc430155 in raise () from /lib64/libc.so.6
#1  0x0000003acc431bf0 in abort () from /lib64/libc.so.6
#2  0x00000000006d4b57 in btr_free_externally_stored_field (index=0x2aaad3ba4cb8, field_ref=0x2aaad41b8555 "", rec=0x0, 
    offsets=0x0, page_zip=0x0, i=0, rb_ctx=RB_NONE, local_mtr=0x46deeb00) at btr/btr0cur.c:4227
#3  0x000000000076176d in row_purge_step (thr=0x2aaad3ba21b0) at row/row0purge.c:457
#4  0x000000000074d6c6 in que_run_threads (thr=0x2aaad3ba21b0) at que/que0que.c:1257
#5  0x000000000077baca in trx_purge () at trx/trx0purge.c:1125
#6  0x0000000000772fe0 in srv_master_thread (arg=<value optimized out>) at srv/srv0srv.c:2375
#7  0x0000003accc062f7 in start_thread () from /lib64/libpthread.so.0
#8  0x0000003acc4d1e3d in clone () from /lib64/libc.so.6

Thread 2 (Thread 1079974208 (LWP 25458)):
#0  0x0000003accc0dba8 in do_sigwait () from /lib64/libpthread.so.0
#1  0x0000003accc0dc4d in sigwait () from /lib64/libpthread.so.0
#2  0x0000000000572f8b in signal_hand (arg=<value optimized out>) at mysqld.cc:2710
#3  0x0000003accc062f7 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003acc4d1e3d in clone () from /lib64/libc.so.6

Thread 1 (Thread 47479261127856 (LWP 25448)):
#0  0x0000003acc4cb372 in select () from /lib64/libc.so.6
#1  0x00000000005738b9 in handle_connections_sockets (arg=<value optimized out>) at mysqld.cc:4880
#2  0x0000000000577837 in main (argc=<value optimized out>, argv=<value optimized out>) at mysqld.cc:4379
#0  0x0000003acc430155 in raise () from /lib64/libc.so.6
[23 Feb 2009 11:22] Marko Mäkelä
Thank you, Vadim, for the repeatable test case. I can repeat it, and I am investigating the bug now.
[23 Feb 2009 11:55] Marko Mäkelä
Vadim, the problem is a too strict check in the InnoDB Plugin.

Before MySQL 5.1.x (the source code says 5.1.7, but I haven't checked when the code was actually merged to the MySQL tree), the FIL_PAGE_TYPE field of BLOB pages was garbage.

The simple fix is to comment out the failing assertion and possibly other similar assertions:
ut_a(fil_page_get_type(page) == FIL_PAGE_TYPE_BLOB);

I'll try to get this fixed in InnoDB Plugin 1.0.3. I will see if it is feasible to relax the assertion only for the Antelope format. In the Barracuda format, the file page type should always be correct.
[23 Feb 2009 17:24] Vadim TKACHENKO
Marko,

Great, thank you! We also figured out this is problem related to 5.0->5.1-plugin BLOB compatibility, I hope you will find a good solution. Btw, any ETA for InnoDB-1.0.3-plugin ?
[23 Feb 2009 17:37] Ken Jacobs
We expect to release InnoDB Plugin 1.0.3 "soon".  ;-)
[17 Mar 2009 11:00] Marko Mäkelä
InnoDB Plugin 1.0.3 was released on March 11, 2009.
[5 May 2010 15:28] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 17:55] Paul Dubois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[28 May 2010 5:55] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:24] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:52] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[30 May 2010 0:04] Paul Dubois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[15 Jun 2010 8:22] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (merge vers: 5.1.47) (pib:16)
[15 Jun 2010 8:40] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (pib:16)
[17 Jun 2010 12:17] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:04] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:44] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)