Bug #33524 mysqld dies with signal 11 on INSERT
Submitted: 27 Dec 2007 5:47 Modified: 9 Sep 2008 6:46
Reporter: Keith Pyle Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: General Severity:S2 (Serious)
Version:5.0.44 OS:Linux (Gentoo 2.6.18)
Assigned to: CPU Architecture:Any
Tags: crash, segfault

[27 Dec 2007 5:47] Keith Pyle
Description:
mysqld dies with a signal 11 on the INSERT command shown below.  The command is generated by mythtv's mythfilldatabase command during a data update and is operating on a set of temporary tables.

INSERT IGNORE INTO credits (chanid, starttime, person, role) SELECT dd_v_program.chanid, DATE_ADD(starttime, INTERVAL channel. tmoffset MINUTE), person, role FROM dd_productioncrew, dd_v_program, channel, people WHERE ((dd_productioncrew.programid = dd_v_program.programid) AND        (dd_productioncrew.fullname = people.name)) AND dd_v_program.chanid = channel.  chanid

mysqld.err contains the following:
071226 22:02:36 - mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=258048
max_used_connections=10
max_connections=100
threads_connected=9
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 92780 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x8be6808
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xb0c62918, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x819411f
0x817a4b5
0x81f23e9
0x81f4f7e
0x81f6064
0x8202ac1
0x8203556
0x81b0c36
0x81b15c6
0x81b1b99
0x81b322e
0x81b3ff0
0xb7d762c1
0xb7bcff9e
New value of fp=(nil) failed sanity check, terminating stack trace!
...
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x8c88130 = INSERT IGNORE INTO credits (chanid, starttime,        person, role) SELECT dd_v_program.chanid, DATE_ADD(starttime, INTERVAL channel. tmoffset MINUTE), person, role FROM dd_productioncrew, dd_v_program, channel,   people WHERE ((dd_productioncrew.programid = dd_v_program.programid) AND        (dd_productioncrew.fullname = people.name)) AND dd_v_program.chanid = channel.  chanid
thd->thread_id=261

The stack trace resolves to:
0x819411f handle_segfault + 773
0x817a4b5 Field_string::type() const + 21
0x81f23e9 st_select_lex::print(THD*, String*) + 4125
0x81f4f7e st_select_lex::print(THD*, String*) + 15282
0x81f6064 JOIN::optimize() + 1366
0x8202ac1 mysql_select(THD*, Item***, st_table_list*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_ord + 161
0x8203556 handle_select(THD*, st_lex*, select_result*, unsigned long) + 468
0x81b0c36 mysql_execute_command(THD*) + 30350
0x81b15c6 mysql_parse(THD*, char const*, unsigned int, char const**) + 592
0x81b1b99 dispatch_command(enum_server_command, THD*, char*, unsigned int) + 1405
0x81b322e do_command(THD*) + 278
0x81b3ff0 handle_one_connection + 3082
0xb7d762c1 _end + -1351624687
0xb7bcff9e _end + -1353354002

How to repeat:
The crash does not occur on every instance of the mythfilldatabase command, but roughly once per week where mythfilldatabase is run automatically once per day.  However, the crash is always on the above INSERT command on each occurrence.
[27 Dec 2007 9:12] Valeriy Kravchuk
Thank you for a problem report. Please, send the results of:

EXPLAIN SELECT dd_v_program.chanid,
DATE_ADD(starttime, INTERVAL channel. tmoffset MINUTE), person, role FROM
dd_productioncrew, dd_v_program, channel, people WHERE ((dd_productioncrew.programid =
dd_v_program.programid) AND        (dd_productioncrew.fullname = people.name)) AND
dd_v_program.chanid = channel.chanid\G

if the above is a complete statement. Sent the my.cnf file content also.
[27 Dec 2007 16:41] Valeriy Kravchuk
Please, send the results when you'll be able to get them.
[29 Dec 2007 0:35] Keith Pyle
Following another crash, I was able to collect the temporary files, identify which ones were the tables being referenced by the INSERT command, rename and insert them to be recognized within this database.  Here is the result of the EXPLAIN command you requested:

mysql> explain SELECT dd_v_program.chanid, DATE_ADD(starttime, INTERVAL channel.tmoffset MINUTE), person, role FROM dd_productioncrew, dd_v_program, channel, people WHERE ((dd_productioncrew.programid = dd_v_program.programid) AND (dd_productioncrew.fullname = people.name)) AND dd_v_program.chanid = channel.chanid;
+----+-------------+-------------------+--------+-----------------+---------+---------+-----------------------------------------+------+-------------+
| id | select_type | table             | type   | possible_keys   | key     | key_len | ref                                     | rows | Extra       |
+----+-------------+-------------------+--------+-----------------+---------+---------+-----------------------------------------+------+-------------+
|  1 | SIMPLE      | dd_productioncrew | ALL    | progidx,nameidx | NULL    | NULL    | NULL                                    | 3321 |             | 
|  1 | SIMPLE      | dd_v_program      | ref    | progidx         | progidx | 41      | mythconverg.dd_productioncrew.programid |    1 | Using where | 
|  1 | SIMPLE      | channel           | eq_ref | PRIMARY         | PRIMARY | 4       | mythconverg.dd_v_program.chanid         |    1 |             | 
|  1 | SIMPLE      | people            | eq_ref | name            | name    | 41      | mythconverg.dd_productioncrew.fullname  |    1 | Using where | 
+----+-------------+-------------------+--------+-----------------+---------+---------+-----------------------------------------+------+-------------+
4 rows in set (0.00 sec)

Also, I was able to collect a core file from mysqld.  However, this binary does not have symbols and the usefulness is limited.  If there is something you want me to examine in it, let me know.
[6 Jan 2008 18:53] Keith Pyle
We have attempted to gather additional debugging information, but were unsuccessful.  We rebuilt mysqld with symbols.  Following the debugging instructions web page, I modified the mysqld start-up to include the following options:

--gdb --skip-stack-trace --thread_cache_size=5 --log

and attached gdb to the process.  gdb is version 6.7.1.  We were able to catch a SEGFAULT with gdb, but the stack trace was useless:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb0bcfb90 (LWP 4838)]
0x081903f5 in Field_string::type ()
(gdb) bt
#0  0x081903f5 in Field_string::type ()
#1  0x08208329 in ?? ()
#2  0x08c85e10 in ?? ()
#3  0x00000004 in ?? ()
#4  0x00000060 in ?? ()
#5  0xb0bcdda4 in ?? ()
#6  0xb0bcdda0 in ?? ()
#7  0xb0bcdd9c in ?? ()
#8  0xb0bcdd98 in ?? ()
#9  0x0100003b in ?? ()
#10 0xb0bcdf9c in ?? ()
#11 0xb0bcdfa8 in ?? ()
#12 0x08c34460 in ?? ()
#13 0x01000000 in ?? ()
#14 0x00000001 in ?? ()
#15 0x00000001 in ?? ()
#16 0x00000000 in ?? ()

The last item in the mysqld.log was the same query as in previous crashes:
                    459 Query       INSERT IGNORE INTO credits (chanid, starttime, person, role) SELECT dd_v_program.chanid, DATE_ADD(starttime, INTERVAL channel.tmoffset MINUTE), person, role FROM dd_productioncrew, dd_v_program, channel, people WHERE ((dd_productioncrew.programid = dd_v_program.programid) AND (dd_productioncrew.fullname = people.name)) AND dd_v_program.chanid = channel.chanid

I'm open to suggestions for further debugging.  However, the continued 5.0.44 instability is a problem.  We're giving serious consideration to reverting to 4.1 where this problem did not occur in over two years of operation with the same applications.

Would there be any benefit to:
1) running with --one-thread?
2) turning off the --with-debug=full build option?

Do you have any suggestions to modify the build, command line options, or gdb use to get a useable stack trace?  We are *not* passing -fomit-frame-pointer from the Gentoo build system.
[10 Jan 2008 0:34] Keith Pyle
A successful stack trace of the crash with symbols!

#0  0x081903f5 in Field_string::type (this=0x8c3e9e0) at field.h:1076
#1  0x08208329 in create_ref_for_key (join=0x8c0fc18, j=0x8c117b4, org_keyuse=0x
8c582f0,
    used_tables=4611686018427387919) at sql_select.h:505
#2  0x0820aebe in make_join_statistics (join=0x8c0fc18, tables=0x0, conds=0x8cfd
cd8,
    keyuse_array=0x8c1091c) at sql_select.cc:5135
#3  0x0820bfa4 in JOIN::optimize (this=0x8c0fc18) at sql_select.cc:897
#4  0x08218a01 in mysql_select (thd=0x8c5c7c0, rref_pointer_array=0x8c5d858,
    tables=0x8cfcd60, wild_num=0, fields=@0x8c5d7c8, conds=0x8cfdcd8, og_num=0,
    order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=147183704,
    result=0x8cfdda8, unit=0x8c5d504, select_lex=0x8c5d734) at sql_select.cc:223
4
#5  0x08219496 in handle_select (thd=0x8c5c7c0, lex=0x8c5d4ac, result=0x8cfdda8,
    setup_tables_done_option=1073741824) at sql_select.cc:255
#6  0x081c6b76 in mysql_execute_command (thd=0x8c5c7c0) at sql_parse.cc:3571
#7  0x081c7506 in mysql_parse (thd=0x8c5c7c0,
    inBuf=0x8cfc1d0 "INSERT IGNORE INTO credits (chanid, starttime, person, role
) SELECT dd_v_program.chanid, DATE_ADD(starttime, INTERVAL channel.tmoffset MINU
TE), person, role FROM dd_productioncrew, dd_v_program, chann"..., length=361, f
ound_semicolon=0xb0ca4f74)
    at sql_parse.cc:6068
#8  0x081c7ad9 in dispatch_command (command=COM_QUERY, thd=0x8c5c7c0,
    packet=0x8c48649 "INSERT IGNORE INTO credits (chanid, starttime, person, rol
e) SELECT dd_v_program.chanid, DATE_ADD(starttime, INTERVAL channel.tmoffset MIN
UTE), person, role FROM dd_productioncrew, dd_v_program, chann"..., packet_lengt
h=363) at sql_parse.cc:1812
#9  0x081c916e in do_command (thd=0x8c5c7c0) at sql_parse.cc:1586
#10 0x081c9f30 in handle_one_connection (arg=0x8c0fbd0) at sql_parse.cc:1197
#11 0xb7d862c1 in start_thread (arg=0xb0ca5b90) at pthread_create.c:296
#12 0xb7bdff9e in clone () from /lib/libc.so.6

We have saved the core file from this crash and can answer further questions if desired.
[11 Jan 2008 4:09] Keith Pyle
We examined the core file state in the function that crashed.  Here are the contents of *this:

(gdb) p *this
$2 = {<Field_longstr> = {<Field_str> = {<Field> = {_vptr.Field = 0x84b9608,
        ptr = 0x8bd98cc "Zoe McLellan", ' ' <repeats 116 times>, '¥' <repeats 72 
times>..., null_ptr = 0x0, table = 0x8cf9a68, orig_table = 0xb0bdfb10, table_nam
e = 0xb0bdfb74,
        field_name = 0x8bd9a10 "name", comment = {str = 0x8515f3c "", length = 0
},
        query_id = 0, key_start = {map = 0}, part_of_key = {map = 0}, part_of_so
rtkey = {
          map = 0}, unireg_check = NONE, field_length = 41, field_index = 1, fla
gs = 1,
        null_bit = 0 '\0'}, field_charset = 0x866a200,
      field_derivation = DERIVATION_IMPLICIT}, <No data fields>},
  can_alter_field_type = true}

The original expression dereferences orig_table.  Here are the first few rows:

(gdb) p *this->orig_table
$3 = {s = 0xffffffff, file = 0xffffffff, next = 0xb0bdfb2c, prev = 0x800, in_use
 = 0x0,
  field = 0xb0be1b68, record = {0x15ec <Address 0x15ec out of bounds>, 0x0},
  insert_values = 0x2c00 <Address 0x2c00 out of bounds>, quick_keys = {map = 147
538240},
  used_keys = {map = 5367660544}, keys_in_use_for_query = {map = 633363640136761
344},
  merge_keys = {map = 630681209868955712}, key_info = 0x55,
  next_number_field = 0xb7c43ff4, found_next_number_field = 0xb7c45120,
  rowid_field = 0x370, timestamp_field = 0x0, triggers = 0x0,
  pos_in_table_list = 0xd8837e, group = 0xffffffff,
  alias = 0x7fefffff <Address 0x7fefffff out of bounds>,
  null_flags = 0xb0bdfbc8 "\210ü½°à(Ê\b\224ÿ½°", query_id = 57995081851903,

Note that "s" is not a valid pointer in orig_table.  There are other values that appear to be invalid (e.g., prev = 0x800, record = 0x15ec).  We wonder if this could result from orig_table being a reference to previously freed memory.

We attempted to run this operation on mysqld under valgrind.  However, the mysql memory manager allocated one large block and therefore valgrind could not inspect individual allocations.
[30 Jan 2008 0:26] Keith Pyle
On suggestions from the mythtv developer forum, we tried dropping the index for the credits table.  The update crashed on its first run.  We recreated the index and got another crash on the update run.  Both crashes had the same stack trace.

We next tried converting the credits table from MyISAM to InnoDB.  The update run crashed on the first run.  It also had the same stack trace.  This would seem to reduce the liklihood that the crash was either due to an index problem in the credit table or linked to the DB engine being used for the table.
[9 Mar 2008 23:32] Keith Pyle
We tried a number of other 5.0.x versions and finally upgraded to 5.0.54.  We continued to see the issue with all versions.  We then converted the tables in this DB from MyISAM to InnoDB.  We have *not* seen the issue for over two weeks.

In summary, this problem appears with multiple 5.0.x versions and MyISAM tables, but not with InnoDB tables at 5.0.54.
[9 May 2008 17:02] Ben P.
I recently started getting this bug.  Upgrading/downgrading mythtv and mysql does not seem to help.  Currently using mysql-5.0.56 from the Gentoo repositories on AMD64.  It happens almost daily for me.  I may try changing to InnoDB, however that would require changing by current backup scheme that uses mysqlhotcopy.
[9 Aug 2008 6:46] Valeriy Kravchuk
Please, try to repeat with a newer version, 5.0.67, and inform about the results.
[9 Sep 2008 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".