| Bug #33524 | mysqld dies with signal 11 on INSERT | ||
|---|---|---|---|
| Submitted: | 27 Dec 2007 5:47 | Modified: | 9 Sep 2008 6:46 |
| Reporter: | Keith Pyle | Email Updates: | |
| Status: | No Feedback | Impact on me: | |
| Category: | MySQL Server: General | Severity: | S2 (Serious) |
| Version: | 5.0.44 | OS: | Linux (Gentoo 2.6.18) |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | crash, segfault | ||
[27 Dec 2007 5:47]
Keith Pyle
[27 Dec 2007 9:12]
Valeriy Kravchuk
Thank you for a problem report. Please, send the results of: EXPLAIN SELECT dd_v_program.chanid, DATE_ADD(starttime, INTERVAL channel. tmoffset MINUTE), person, role FROM dd_productioncrew, dd_v_program, channel, people WHERE ((dd_productioncrew.programid = dd_v_program.programid) AND (dd_productioncrew.fullname = people.name)) AND dd_v_program.chanid = channel.chanid\G if the above is a complete statement. Sent the my.cnf file content also.
[27 Dec 2007 16:41]
Valeriy Kravchuk
Please, send the results when you'll be able to get them.
[29 Dec 2007 0:35]
Keith Pyle
Following another crash, I was able to collect the temporary files, identify which ones were the tables being referenced by the INSERT command, rename and insert them to be recognized within this database. Here is the result of the EXPLAIN command you requested: mysql> explain SELECT dd_v_program.chanid, DATE_ADD(starttime, INTERVAL channel.tmoffset MINUTE), person, role FROM dd_productioncrew, dd_v_program, channel, people WHERE ((dd_productioncrew.programid = dd_v_program.programid) AND (dd_productioncrew.fullname = people.name)) AND dd_v_program.chanid = channel.chanid; +----+-------------+-------------------+--------+-----------------+---------+---------+-----------------------------------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------------------+--------+-----------------+---------+---------+-----------------------------------------+------+-------------+ | 1 | SIMPLE | dd_productioncrew | ALL | progidx,nameidx | NULL | NULL | NULL | 3321 | | | 1 | SIMPLE | dd_v_program | ref | progidx | progidx | 41 | mythconverg.dd_productioncrew.programid | 1 | Using where | | 1 | SIMPLE | channel | eq_ref | PRIMARY | PRIMARY | 4 | mythconverg.dd_v_program.chanid | 1 | | | 1 | SIMPLE | people | eq_ref | name | name | 41 | mythconverg.dd_productioncrew.fullname | 1 | Using where | +----+-------------+-------------------+--------+-----------------+---------+---------+-----------------------------------------+------+-------------+ 4 rows in set (0.00 sec) Also, I was able to collect a core file from mysqld. However, this binary does not have symbols and the usefulness is limited. If there is something you want me to examine in it, let me know.
[6 Jan 2008 18:53]
Keith Pyle
We have attempted to gather additional debugging information, but were unsuccessful. We rebuilt mysqld with symbols. Following the debugging instructions web page, I modified the mysqld start-up to include the following options:
--gdb --skip-stack-trace --thread_cache_size=5 --log
and attached gdb to the process. gdb is version 6.7.1. We were able to catch a SEGFAULT with gdb, but the stack trace was useless:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb0bcfb90 (LWP 4838)]
0x081903f5 in Field_string::type ()
(gdb) bt
#0 0x081903f5 in Field_string::type ()
#1 0x08208329 in ?? ()
#2 0x08c85e10 in ?? ()
#3 0x00000004 in ?? ()
#4 0x00000060 in ?? ()
#5 0xb0bcdda4 in ?? ()
#6 0xb0bcdda0 in ?? ()
#7 0xb0bcdd9c in ?? ()
#8 0xb0bcdd98 in ?? ()
#9 0x0100003b in ?? ()
#10 0xb0bcdf9c in ?? ()
#11 0xb0bcdfa8 in ?? ()
#12 0x08c34460 in ?? ()
#13 0x01000000 in ?? ()
#14 0x00000001 in ?? ()
#15 0x00000001 in ?? ()
#16 0x00000000 in ?? ()
The last item in the mysqld.log was the same query as in previous crashes:
459 Query INSERT IGNORE INTO credits (chanid, starttime, person, role) SELECT dd_v_program.chanid, DATE_ADD(starttime, INTERVAL channel.tmoffset MINUTE), person, role FROM dd_productioncrew, dd_v_program, channel, people WHERE ((dd_productioncrew.programid = dd_v_program.programid) AND (dd_productioncrew.fullname = people.name)) AND dd_v_program.chanid = channel.chanid
I'm open to suggestions for further debugging. However, the continued 5.0.44 instability is a problem. We're giving serious consideration to reverting to 4.1 where this problem did not occur in over two years of operation with the same applications.
Would there be any benefit to:
1) running with --one-thread?
2) turning off the --with-debug=full build option?
Do you have any suggestions to modify the build, command line options, or gdb use to get a useable stack trace? We are *not* passing -fomit-frame-pointer from the Gentoo build system.
[10 Jan 2008 0:34]
Keith Pyle
A successful stack trace of the crash with symbols!
#0 0x081903f5 in Field_string::type (this=0x8c3e9e0) at field.h:1076
#1 0x08208329 in create_ref_for_key (join=0x8c0fc18, j=0x8c117b4, org_keyuse=0x
8c582f0,
used_tables=4611686018427387919) at sql_select.h:505
#2 0x0820aebe in make_join_statistics (join=0x8c0fc18, tables=0x0, conds=0x8cfd
cd8,
keyuse_array=0x8c1091c) at sql_select.cc:5135
#3 0x0820bfa4 in JOIN::optimize (this=0x8c0fc18) at sql_select.cc:897
#4 0x08218a01 in mysql_select (thd=0x8c5c7c0, rref_pointer_array=0x8c5d858,
tables=0x8cfcd60, wild_num=0, fields=@0x8c5d7c8, conds=0x8cfdcd8, og_num=0,
order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=147183704,
result=0x8cfdda8, unit=0x8c5d504, select_lex=0x8c5d734) at sql_select.cc:223
4
#5 0x08219496 in handle_select (thd=0x8c5c7c0, lex=0x8c5d4ac, result=0x8cfdda8,
setup_tables_done_option=1073741824) at sql_select.cc:255
#6 0x081c6b76 in mysql_execute_command (thd=0x8c5c7c0) at sql_parse.cc:3571
#7 0x081c7506 in mysql_parse (thd=0x8c5c7c0,
inBuf=0x8cfc1d0 "INSERT IGNORE INTO credits (chanid, starttime, person, role
) SELECT dd_v_program.chanid, DATE_ADD(starttime, INTERVAL channel.tmoffset MINU
TE), person, role FROM dd_productioncrew, dd_v_program, chann"..., length=361, f
ound_semicolon=0xb0ca4f74)
at sql_parse.cc:6068
#8 0x081c7ad9 in dispatch_command (command=COM_QUERY, thd=0x8c5c7c0,
packet=0x8c48649 "INSERT IGNORE INTO credits (chanid, starttime, person, rol
e) SELECT dd_v_program.chanid, DATE_ADD(starttime, INTERVAL channel.tmoffset MIN
UTE), person, role FROM dd_productioncrew, dd_v_program, chann"..., packet_lengt
h=363) at sql_parse.cc:1812
#9 0x081c916e in do_command (thd=0x8c5c7c0) at sql_parse.cc:1586
#10 0x081c9f30 in handle_one_connection (arg=0x8c0fbd0) at sql_parse.cc:1197
#11 0xb7d862c1 in start_thread (arg=0xb0ca5b90) at pthread_create.c:296
#12 0xb7bdff9e in clone () from /lib/libc.so.6
We have saved the core file from this crash and can answer further questions if desired.
[11 Jan 2008 4:09]
Keith Pyle
We examined the core file state in the function that crashed. Here are the contents of *this:
(gdb) p *this
$2 = {<Field_longstr> = {<Field_str> = {<Field> = {_vptr.Field = 0x84b9608,
ptr = 0x8bd98cc "Zoe McLellan", ' ' <repeats 116 times>, '¥' <repeats 72
times>..., null_ptr = 0x0, table = 0x8cf9a68, orig_table = 0xb0bdfb10, table_nam
e = 0xb0bdfb74,
field_name = 0x8bd9a10 "name", comment = {str = 0x8515f3c "", length = 0
},
query_id = 0, key_start = {map = 0}, part_of_key = {map = 0}, part_of_so
rtkey = {
map = 0}, unireg_check = NONE, field_length = 41, field_index = 1, fla
gs = 1,
null_bit = 0 '\0'}, field_charset = 0x866a200,
field_derivation = DERIVATION_IMPLICIT}, <No data fields>},
can_alter_field_type = true}
The original expression dereferences orig_table. Here are the first few rows:
(gdb) p *this->orig_table
$3 = {s = 0xffffffff, file = 0xffffffff, next = 0xb0bdfb2c, prev = 0x800, in_use
= 0x0,
field = 0xb0be1b68, record = {0x15ec <Address 0x15ec out of bounds>, 0x0},
insert_values = 0x2c00 <Address 0x2c00 out of bounds>, quick_keys = {map = 147
538240},
used_keys = {map = 5367660544}, keys_in_use_for_query = {map = 633363640136761
344},
merge_keys = {map = 630681209868955712}, key_info = 0x55,
next_number_field = 0xb7c43ff4, found_next_number_field = 0xb7c45120,
rowid_field = 0x370, timestamp_field = 0x0, triggers = 0x0,
pos_in_table_list = 0xd8837e, group = 0xffffffff,
alias = 0x7fefffff <Address 0x7fefffff out of bounds>,
null_flags = 0xb0bdfbc8 "\210ü½°à(Ê\b\224ÿ½°", query_id = 57995081851903,
Note that "s" is not a valid pointer in orig_table. There are other values that appear to be invalid (e.g., prev = 0x800, record = 0x15ec). We wonder if this could result from orig_table being a reference to previously freed memory.
We attempted to run this operation on mysqld under valgrind. However, the mysql memory manager allocated one large block and therefore valgrind could not inspect individual allocations.
[30 Jan 2008 0:26]
Keith Pyle
On suggestions from the mythtv developer forum, we tried dropping the index for the credits table. The update crashed on its first run. We recreated the index and got another crash on the update run. Both crashes had the same stack trace. We next tried converting the credits table from MyISAM to InnoDB. The update run crashed on the first run. It also had the same stack trace. This would seem to reduce the liklihood that the crash was either due to an index problem in the credit table or linked to the DB engine being used for the table.
[9 Mar 2008 23:32]
Keith Pyle
We tried a number of other 5.0.x versions and finally upgraded to 5.0.54. We continued to see the issue with all versions. We then converted the tables in this DB from MyISAM to InnoDB. We have *not* seen the issue for over two weeks. In summary, this problem appears with multiple 5.0.x versions and MyISAM tables, but not with InnoDB tables at 5.0.54.
[9 May 2008 17:02]
Ben P.
I recently started getting this bug. Upgrading/downgrading mythtv and mysql does not seem to help. Currently using mysql-5.0.56 from the Gentoo repositories on AMD64. It happens almost daily for me. I may try changing to InnoDB, however that would require changing by current backup scheme that uses mysqlhotcopy.
[9 Aug 2008 6:46]
Valeriy Kravchuk
Please, try to repeat with a newer version, 5.0.67, and inform about the results.
[9 Sep 2008 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
