Bug #31569 Server crash with signal 11 after 23 hours of running system test
Submitted: 12 Oct 2007 16:03 Modified: 8 Dec 2008 16:26
Reporter: Omer Barnir (OCA)
Status: Can't repeat
Category: MySQL Server: Optimizer Severity: S1 (Critical)
Version: 5.0.50 OS: Linux
Assigned to: Gleb Shchepa CPU Architecture: Any
Tags: simpler_testcase_needed

[12 Oct 2007 16:03] Omer Barnir
Description:
The MySQL server crashed after a 23-hour run of the system test. The crash cannot be traced to a specific query or action. The same insert/update/release scenarios run continuously during the test.

The master.err log file did not include a stack trace but had the following:

Version: '5.0.50-enterprise-log'  socket: '/home/qauser/bin64_5050/mysql-test/var/tmp/master.sock'  port: 9306  MySQL Enterprise Server (Commercial)
071012  1:06:12 - mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=61
max_connections=100
threads_connected=52
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 39423 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x2aaac003a610
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x446c9068, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
(nil)
Stack trace seems successful - bottom reached
Please read http://dev.mysql.com/doc/mysql/en/using-stack-trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x1d02c130 = UPDATE systest1.tb1_eng1 target
SET i1 = (SELECT new_i1 FROM systest1.t1_tmp source WHERE source.i1 = target.i1),
f1 = @connection_id,
f2 = @operation,
f3 = ROUND(i1/@max_val,3),
f4 = @my_now
WHERE i1 = @next_val + 20 - @no_of_tries DIV 2
thd->thread_id=481807

Running gdb on the mysqld binary using the core file generated:

#0  0x00000033c040b002 in pthread_kill () from /lib64/libpthread.so.0
#1  0x000000000056f176 in handle_segfault ()
#2  <signal handler called>
#3  0x0000000000000000 in ?? ()
#4  0x00000000005ce3ca in test_if_ref ()
#5  0x00000000005c6619 in make_cond_for_table ()
#6  0x00000000005c2354 in make_join_select ()
#7  0x00000000005b724f in JOIN::optimize ()
#8  0x0000000000544fe5 in subselect_single_select_engine::exec ()
#9  0x0000000000540638 in Item_subselect::exec ()
#10 0x0000000000540f0d in Item_singlerow_subselect::val_int ()
#11 0x00000000004f1c19 in Item::save_in_field ()
#12 0x00000000005ae57f in fill_record ()
#13 0x00000000005aceb6 in fill_record_n_invoke_before_triggers ()
#14 0x00000000005e26f6 in mysql_update ()
#15 0x0000000000585a78 in mysql_execute_command ()
#16 0x000000000058b10c in mysql_parse ()
#17 0x0000000000583734 in dispatch_command ()
#18 0x000000000058327f in do_command ()
#19 0x0000000000582945 in handle_one_connection ()
#20 0x00000033c04061b5 in start_thread () from /lib64/libpthread.so.0
#21 0x00000033bf8cd39d in clone () from /lib64/libc.so.6
#22 0x0000000000000000 in ?? ()

CPU and memory usage during the crash were normal and no spikes were observed.

How to repeat:
Run the QA 'Systems' test (iuds2_ddl scenario).

Currently re-running the test to see whether the crash is consistent.

Suggested fix:
No crash
[14 Dec 2007 22:44] Philip Stoev
It appears that this bug remains in 5.0.50SP1.

Here is the stack trace:
0x818b595 handle_segfault + 417
0x81e52de _Z11test_if_refP10Item_fieldP4Item + 78
0x81dd852 _Z19make_cond_for_tableP4Itemyy + 258
0x81dda79 _Z19make_cond_for_tableP4Itemyy + 809
0x81d95ae _Z16make_join_selectP4JOINP10SQL_SELECTP4Item + 362
0x81ce930 _ZN4JOIN8optimizeEv + 1692
0x8164e11 _ZN30subselect_single_select_engine4execEv + 713
0x8160a90 _ZN14Item_subselect4execEv + 44
0x8161b8f _ZN17Item_in_subselect8val_boolEv + 23
0x812245f _ZN4Item15val_bool_resultEv + 15
0x813e0c7 _ZN17Item_in_optimizer7val_intEv + 387
0x81f8975 _Z12mysql_updateP3THDP10TABLE_LISTR4ListI4ItemES6_PS4_jP8st_ordery15enum_duplicatesb + 3525
0x81a136b _Z21mysql_execute_commandP3THD + 3711
0x81a6ffd _Z11mysql_parseP3THDPKcjPS2_ + 241
0x819f0fa _Z16dispatch_command19enum_server_commandP3THDPcj + 1198
0x819ec10 _Z10do_commandP3THD + 144
0x819e27a handle_one_connection + 646

from mysql.err:

thd->query at 0x8b91c60 = UPDATE systest1.tb1_eng1 target
SET f1 = @connection_id,
f2 = @operation,
f3 = ROUND(i1/@max_val,3),
f4 = @my_now
WHERE i1 IN (SELECT i1 FROM systest1.t1_tmp source WHERE source.i1 = target.i1)
thd->thread_id=106433
[12 Mar 2008 20:42] Omer Barnir
Observed a similar crash when testing 5.0.58 binaries. The crash was not reproduced in a subsequent run.
[24 May 2008 10:19] Philip Stoev
Bug still observed in 5.0.62, backtrace is:

#0  0x0000003ba880b132 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000000571509 in handle_segfault ()
#2  <signal handler called>
#3  0x00000000005c9fb9 in test_if_ref ()
#4  0x00000000005ca229 in make_cond_for_table ()
#5  0x00000000005c5cb4 in make_join_select ()
#6  0x00000000005ba95a in JOIN::optimize ()
#7  0x00000000005474f9 in subselect_single_select_engine::exec ()
#8  0x0000000000542a98 in Item_subselect::exec ()
#9  0x000000000054338d in Item_singlerow_subselect::val_int ()
#10 0x00000000004f3619 in Item::save_in_field ()
#11 0x00000000005b1d1f in fill_record ()
#12 0x00000000005b05f6 in fill_record_n_invoke_before_triggers ()
#13 0x00000000005e6576 in mysql_update ()
#14 0x0000000000588327 in mysql_execute_command ()
#15 0x000000000058ddac in mysql_parse ()
#16 0x0000000000585f9e in dispatch_command ()
#17 0x00000000005918af in do_command ()
#18 0x0000000000585285 in handle_one_connection ()
#19 0x0000003ba88062f7 in start_thread () from /lib64/libpthread.so.0
#20 0x0000003ba80ce85d in clone () from /lib64/libc.so.6

The query is:

UPDATE systest1.tb1_eng1 target
SET i1 = (SELECT new_i1 FROM systest1.t1_tmp source WHERE source.i1 = target.i1),
f1 = @connection_id,
f2 = @operation,
f3 = ROUND(i1/@max_val,3),
f4 = @my_now
WHERE i1 = @next_val + 13 - @no_of_tries DIV 2

Crash happened after 3 hours 34 min.
[31 May 2008 12:15] MySQL Verification Team
Can you post the table structures for the tables involved here?
[18 Jun 2008 8:42] Philip Stoev
All the queries that are crashing have in common a correlated subquery in an UPDATE and a temp table inside the subquery.
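
The common shape described above (an UPDATE whose correlated subquery reads from a temporary table) can be sketched as follows. This is only an illustration of the query pattern, not a reproduction of the crash: it runs against SQLite through Python's standard sqlite3 module rather than MySQL, and the column types and sample rows are assumptions, since the real table structures were never posted in this report.

```python
import sqlite3

# In-memory SQLite database; this stands in for the MySQL server purely
# to show the query shape. It cannot trigger the MySQL crash.
conn = sqlite3.connect(":memory:")

# Hypothetical table definitions: the report names tb1_eng1 and t1_tmp
# and the columns i1, new_i1 and f1, but the types here are assumed.
conn.executescript("""
    CREATE TABLE tb1_eng1 (i1 INTEGER, f1 INTEGER);
    CREATE TEMP TABLE t1_tmp (i1 INTEGER, new_i1 INTEGER);
    INSERT INTO tb1_eng1 VALUES (1, 0), (2, 0);
    INSERT INTO t1_tmp VALUES (1, 101), (2, 102);
""")

# UPDATE with a correlated subquery on the temporary table. The subquery
# refers to the row being updated through tb1_eng1.i1.
conn.execute(
    """
    UPDATE tb1_eng1
    SET i1 = (SELECT new_i1 FROM t1_tmp AS source
              WHERE source.i1 = tb1_eng1.i1),
        f1 = ?
    WHERE i1 = ?
    """,
    (7, 1),  # assumed stand-ins for @connection_id and the WHERE value
)

rows = conn.execute("SELECT i1, f1 FROM tb1_eng1 ORDER BY i1").fetchall()
print(rows)
```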

I tried to create a synthetic workload containing random queries matching those requirements, but I could not reproduce the crash.
[13 Aug 2008 13:13] MySQL Verification Team
Philip, if you run your standalone test (the one that didn't crash) alongside 100 'flush tables' statements per second, does it crash?
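
One way to generate the suggested background load is a small paced loop run on a separate connection. The sketch below is a hypothetical helper, not part of the QA system test: `flush_tables_loop` and its parameters are invented for illustration, and `execute` is assumed to be any callable that sends one SQL statement to the server (for example the `execute` method of a DB-API cursor).

```python
import time


def flush_tables_loop(execute, rate_per_sec=100, duration_sec=60.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Issue FLUSH TABLES at roughly rate_per_sec for duration_sec seconds.

    execute: assumed callable that sends one SQL statement to the server.
    sleep, clock: injectable so the pacing can be checked without waiting.
    Returns the number of statements issued.
    """
    interval = 1.0 / rate_per_sec
    start = clock()
    deadline = start + duration_sec
    issued = 0
    while clock() < deadline:
        execute("FLUSH TABLES")
        issued += 1
        # Schedule against the start time so small delays do not accumulate.
        delay = (start + issued * interval) - clock()
        if delay > 0:
            sleep(delay)
    return issued


# Usage (assumes an already-open DB-API cursor named cur):
#   flush_tables_loop(cur.execute, rate_per_sec=100, duration_sec=3600)
```

Running this next to the standalone workload approximates the concurrency being asked about. The actual rate also depends on how long each FLUSH TABLES takes on the server, so 100 per second is an upper bound rather than a guarantee.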
[3 Feb 2009 7:11] MySQL Verification Team
A test case is in bug #42419.