MySQL Bugs: #18137: MySQL V5 crashing regularly with got signal 11

Bug #18137	MySQL V5 crashing regularly with got signal 11
Submitted:	10 Mar 2006 15:37	Modified:	7 Aug 2008 15:16
Reporter:	Dave Pullin (Basic Quality Contributor)
Status:	Closed
Category:	MySQL Server	Severity:	S1 (Critical)
Version:	5.0.45	OS:	Linux (Linux)
Assigned to:		CPU Architecture:	Any

Description:
This is not a good bug report -- I can't give you a reproducible instance. However, other people are backing out of using V5 because of this instability (see the MySQL lists). I have 3 servers crashing several times a day and it's killing me. I want to help you find this problem. I need you to tell me what you need.

I am running MySQL on 6 servers - 3 Linux and 3 Windows. I recently upgraded to V5 on all servers. Now MySQL is crashing regularly (several times per day) with 'got signal 11'.

My 3 Linux servers are very different machines running different software:
1. a uniprocessor Pentium with 512MB running Redhat9 with MySQL 5.0.18-0.i386
2. a new dual XEON with 8GB running Fedora Core 4 with 64bit MySQL 5.0.18-0.glibc23.x86_64
3. an old quad XEON with 4GB running Fedora Core 4 with MySQL 5.0.18-0.i386

The windows machines are not having a problem. All 6 are running essentially the same application.

It seems unlikely to be a hardware problem because it's on 3 machines at once. It looks like a MySQL V5 problem.

The 6 boxes are independent as far as mysql is concerned. 
They are not crashing at once. They are crashing independently but on the same day. It is very unlikely that the three independent servers all developed a real hardware problem at about the time I installed V5.

The servers are typically running 3 to 6 java threads that are using fairly demanding SQL queries and updates (often taking up to an hour to execute). Typically it is operating on tables with 10M-50M rows; sometimes as MERGES into billion+ row tables.

The OS looks stable when MySQL crashes. MySQL restarts and my application recovers except for application level confusion caused by the DB outage.
============================================================
Here's what one log said: (end of one restart to the next one - shows it took only 10 mins to crash)
060307 11:37:10  mysqld restarted
060307 11:37:10 [Warning] Asked for 196608 thread stack, but got 126976
060307 11:37:10 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.18-standard'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Edition - Standard (GPL)
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=134217728
read_buffer_size=126976
max_used_connections=2
max_connections=100
threads_connected=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 348271 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x86358b0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xbff3ed18, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x80a23b7
0x82e9f48
0x830689f
0x8306733
0x82b26b6
0x82b2dac
0x80f10a7
0x80e3186
0x8173851
0x80e35da
0x80dfd41
0x80b4310
0x80bacba
0x80b28a3
0x80b2174
0x80b1694
0x82e76fc
0x831103a
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do 
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x86e7048 = select sourceName, sourcetype, tabletype ,'<font color='||if(coalesce(inflight,0)=1,'green','blue')||'>'||fStatus||'</font>' as fst ,fstatus as $fstatus  , count(*) as count ,sum(if(from_days(to_days( added_at ))=current_date,1,0)) as today$i ,sum(if(from_days(to_days( added_at ))=current_date-1,1,0)) as "day-1$i" ,sum(if(from_days(to_days( added_at ))=current_date-2,1,0)) as "day-2$i" ,sum(file_length) as size$i   from mirror.cl_control_jobs  group by sourceName, sourcetype, tabletype, fStatus, Inflight order by today$i desc,`day-1$i` desc,`day-2$i` desc
thd->thread_id=9
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0
060307 11:47:14  mysqld restarted
060307 11:47:14 [Warning] Asked for 196608 thread stack, but got 126976
060307 11:47:14 [Note] /usr/sbin/mysqld: ready for connections.
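
As a sanity check on the memory estimate printed in the crash report above, the formula can be evaluated by hand. This is a minimal Python sketch, assuming sort_buffer_size=2097144: that value is not printed in the log and is only inferred here because it reproduces the logged total of 348271 K. The function name is illustrative and not part of MySQL.

```python
def estimated_max_memory_kb(key_buffer_size, read_buffer_size,
                            sort_buffer_size, max_connections):
    """Evaluate the formula printed in the crash report, in units of K (1024 bytes)."""
    total_bytes = (key_buffer_size
                   + (read_buffer_size + sort_buffer_size) * max_connections)
    return total_bytes // 1024


# key_buffer_size, read_buffer_size and max_connections come from the log;
# sort_buffer_size=2097144 is an ASSUMPTION (it is not shown in the log).
print(estimated_max_memory_kb(134217728, 126976, 2097144, 100))  # 348271
```

Under that assumption the worst-case estimate is roughly 340 MB.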

(BTW: the query select sourceName, sourcet ....etc is of a table with only about 30K rows.)
Here's the stack decoded:
[root@D1 mysql]# resolve_stack_dump -s /usr/lib/mysql/mysqld.sym -n mysqld.stack
0x80a23b7 handle_segfault + 423
0x82e9f48 pthread_sighandler + 184
0x830689f chunk_free + 303
0x8306733 free + 147
0x82b26b6 my_no_flags_free + 22
0x82b2dac free_root + 124
0x80f10a7 free_tmp_table__FP3THDP8st_table + 263
0x80e3186 destroy__4JOIN + 182
0x8173851 cleanup__13st_select_lex + 33
0x80e35da mysql_select__FP3THDPPP4ItemP13st_table_listUiRt4List1Z4ItemP4ItemUiP8st_orderT7T5T7UlP13select_resultP18st_select_lex_unitP13s + 986
0x80dfd41 handle_select__FP3THDP6st_lexP13select_resultUl + 193
0x80b4310 mysql_execute_command__FP3THD + 1328
0x80bacba mysql_parse__FP3THDPcUi + 282
0x80b28a3 dispatch_command__F19enum_server_commandP3THDPcUi + 1827
0x80b2174 do_command__FP3THD + 196
0x80b1694 handle_one_connection + 772
0x82e76fc pthread_start_thread + 220
0x831103a thread_start + 4
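
Before resolve_stack_dump can be run as in the command above, the raw addresses have to be copied out of the error log into a file such as mysqld.stack. Below is a minimal Python sketch of that extraction step. It assumes the log layout seen in this report (one address per line after a line ending in "backtrace follows:"). The function name is hypothetical, and "(nil)" entries are skipped rather than written out.

```python
import re

ADDRESS = re.compile(r"^0x[0-9a-fA-F]+$")


def extract_backtraces(log_text):
    """Return one list of hex addresses per backtrace found in an error log."""
    traces = []
    current = None  # the trace being collected, or None when outside a trace
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.endswith("backtrace follows:"):
            current = []
            traces.append(current)
        elif current is not None:
            if ADDRESS.match(line):
                current.append(line)
            elif line != "(nil)":
                # Any other text ends the current trace.
                current = None
    return traces


sample = """Stack range sanity check OK, backtrace follows:
0x80a23b7
0x82e9f48
0x830689f
New value of fp=(nil) failed sanity check, terminating stack trace!
"""
for trace in extract_backtraces(sample):
    print("\n".join(trace))
```

Each returned list can be written one address per line to mysqld.stack and then passed to resolve_stack_dump -s /usr/lib/mysql/mysqld.sym -n mysqld.stack.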

================================================
Here's another system -- it's not crashing in the same place:

System:d4
[root@D4 mysql]# resolve_stack_dump -s /usr/lib/mysql/mysqld.sym -n mysqld.stack 
0x80a23b7 handle_segfault + 423
0x82e9f48 pthread_sighandler + 184
0x81630c9 init__17Query_cache_blockUl + 9
0x8165640 split_block__11Query_cacheP17Query_cache_blockUl + 32
0x81633e1 query_cache_end_of_result__FP3THD + 161
0x80bacc3 mysql_parse__FP3THDPcUi + 291
0x80b28a3 dispatch_command__F19enum_server_commandP3THDPcUi + 1827
0x80b2174 do_command__FP3THD + 196
0x80b1694 handle_one_connection + 772
0x82e76fc pthread_start_thread + 220
0x831103a thread_start + 4
=======================================
The third system hasn't given a stack dump on any of its crashes.
(This is the 64-bit server)
Number of processes running now: 0
060308 07:37:34  mysqld restarted
060308  7:37:36 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.18-standard'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Edition - Standard (GPL)
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=6442450944
read_buffer_size=126976
max_used_connections=20
max_connections=100
threads_connected=7
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 6508655 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Number of processes running now: 0
060308 11:28:08  mysqld restarted
060308 11:28:10 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.18-standard'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Edition - Standard (GPL)
===================================

I ran myisamchk on all tables on one of servers. It reported nothing unusual and it made no difference to the frequency of MySQL crashing.

How to repeat:
You can't. I can, unfortunately. Tell me what you need me to do.
I haven't found anything more in 
http://www.mysql.com/doc/en/Crashing.html
that I can do.

The same problem occurs with 5.0.19 -- crashing about every 2 to 3 hours.

I isolated a query that crashed 5.0.18 every time it executed. However, it did not crash 5.0.19. ...  [let me know if you want the query + table (it's 1MB)].

5.0.19 is crashing about every 2 hours. I will upload a query.log extract showing the 25 lines preceding each crash, in case that helps you, but I can't see any pattern.
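
One way to look for a pattern in crash frequency is to measure the time between consecutive "mysqld restarted" lines in the error log. This is a rough Python sketch, assuming the timestamp layout seen in the logs in this report (YYMMDD followed by H:MM:SS). The function names are illustrative.

```python
import re
from datetime import datetime

RESTART = re.compile(r"^(\d{6})\s+(\d{1,2}:\d{2}:\d{2})\s+mysqld restarted")


def restart_times(log_text):
    """Collect the timestamp of every 'mysqld restarted' line."""
    times = []
    for line in log_text.splitlines():
        match = RESTART.match(line)
        if match:
            stamp = f"{match.group(1)} {match.group(2)}"
            times.append(datetime.strptime(stamp, "%y%m%d %H:%M:%S"))
    return times


def uptime_between_restarts(log_text):
    """Return the elapsed time between each pair of consecutive restarts."""
    times = restart_times(log_text)
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Two restart lines taken from the first log in this report.
sample = """060307 11:37:10  mysqld restarted
060307 11:37:10 [Note] /usr/sbin/mysqld: ready for connections.
060307 11:47:14  mysqld restarted
"""
for delta in uptime_between_restarts(sample):
    print(delta)  # 0:10:04
```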

Maybe this is the same Bug here:
060306 10:55:56 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.18-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  Gentoo Linux mysql-5.0.18
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=258048
max_used_connections=4
max_connections=100
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 92783 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0xa9e00480
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xac419dc8, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x816902f
0xb7de5541
0x48080f60
0x8128b01
0x81ce725
0x8268253
0x8268437
0x8265d69
0x8266bf5
0x818175e
0x8268549
0x8268276
0x8268437
0x8265d69
0x8266bf5
0x818175e
0x818480d
0x817a85c
0x817a398
0x8179859
0xb7ddfeea
0xb7c59cea
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x974a9a8 = select count(*)
      from RONSHOP_SEARCH_RESULT rsr
      left outer join RONSHOP_VIEW_RESIS_CACHE_STOCK rrcs
      on rsr.article_id=rrcs.article_id  and rsr.search_type_id=rrcs.search_type_id
      where rsr.search_type_id= NAME_CONST('p_search_type',6)
        and status!='valid'
    into i
    ;
thd->thread_id=45

Stack-Trace:
workplaces-v5 ~ # resolve_stack_dump -s mysqld.sym -n mysqld.stack
0x816902f _Z8map_filePcS_j + 299
0xb7de5541 _end + -1350680751
0x48080f60 _end + 1067972976
0x8128b01 _ZN16Item_func_strcmpD1Ev + 147
0x81ce725 _ZN14delayed_insertD0Ev + 313
0x8268253 _ZN7sp_head19create_result_fieldEjPKcP8st_table + 63
0x8268437 _Z21cmp_splocal_locationsPKP12Item_splocalS2_ + 415
0x8265d69 str_to_datetime + 981
0x8266bf5 number_to_datetime + 681
0x818175e _Z21mysql_execute_commandP3THD + 15318
0x8268549 _Z21cmp_splocal_locationsPKP12Item_splocalS2_ + 689
0x8268276 _ZN7sp_head19create_result_fieldEjPKcP8st_table + 98
0x8268437 _Z21cmp_splocal_locationsPKP12Item_splocalS2_ + 415
0x8265d69 str_to_datetime + 981
0x8266bf5 number_to_datetime + 681
0x818175e _Z21mysql_execute_commandP3THD + 15318
0x818480d _Z21mysql_execute_commandP3THD + 27781
0x817a85c _Z18free_max_user_connv + 12
0x817a398 _Z10check_userP3THD19enum_server_commandPKcjS3_b + 342
0x8179859 _ZN21sys_var_thd_ulonglong10check_typeE13enum_var_type + 9
0xb7ddfeea _end + -1350702854
0xb7c59cea _end + -1352300806

Florian,
This is not the place for this, but as the developers don't seem interested in helping, let me help you with what I have found. Last night I downgraded my 64-bit MySQL to the 'regular' i386 version 5.0.19. That server didn't crash at all overnight (instead of 47 crashes in the previous 4 days). You might want to try it.

I have also isolated a query/table that invariably crashes 5.0.18 and that works ok on 5.0.19.
Dave

Thanks for this hint. We don't have a 64-bit environment running; it is just Gentoo Linux on a 32-bit Pentium 4 machine. I followed your advice and installed a binary version. After the first start, and after the first query, the database got a signal 11, did a crash recovery and restarted. Then I stopped the database and restarted it, and now it seems to work, but I will have to run some stress tests before I can say that it is working.

bad news from here:

060314 15:33:27 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
Version: '5.0.19-max-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Edition - Experimental (GPL)

mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=258048
max_used_connections=2
max_connections=100
threads_connected=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 92783 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x91fb4c8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xb, backtrace may not be correct.
Bogus stack limit or frame pointer, fp=0xb, stack_bottom=0xac550000, thread_stack=196608, aborting backtrace.
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x93263c8 = select sum(price)  from (
  select ifnull(sum(price) * if(min(price)=0.0000,0,1),0.0000) as price
    from RONSHOP_DISPATCH_LIMITS
    where 
    (dispatch_limit_group_id,dispatch_type_id,ifnull(from_value,-1)
     ,ifnull(from_amount,-1),dispatch_country_id,currency_id)
    in
    (
      select rdl.dispatch_limit_group_id,rdl.dispatch_type_id
        ,ifnull(max(from_value),-1),ifnull(max(from_amount),-1),iCountryId,iCurrencyId
      from RONSHOP_DISPATCH_LIMITS rdl
      left outer join RONSHOP_VIEW_ORDER_DISPATCH_LIMIT_GROUP rvodlg
      on rvodlg.dispatch_limit_group_id=rdl.dispatch_limit_group_id
        and rdl.dispatch_type_id=rvodlg.dispatch_id
        and if(from_amount is NULL,from_value<=sum_price_net,from_amount<=sum_amount)
      where rvodlg.dispatch_id is NOT NULL
        and rdl.dispatch_country_id=iCountryId
        and rdl.currency_id=iCurrencyId
        and rvodlg.order_id=p_cartid
        group by rdl.dispatch_limit_group_id,rdl.dispatch_type_id
    ) group by dispatch_type_id) a
    into
thd->thread_id=87
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0
060314 15:38:52  mysqld restarted
060314 15:38:52  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
060314 15:38:52  InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 1 885057074.
InnoDB: Doing recovery: scanned up to log sequence number 1 885116619
060314 15:38:52  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 849467, file name ./workplaces-v5-bin.000216
060314 15:38:53  InnoDB: Started; log sequence number 1 885116619
060314 15:38:53 [Note] Recovering after a crash using workplaces-v5-bin
060314 15:38:53 [Note] Starting crash recovery...
060314 15:38:53 [Note] Crash recovery finished.
060314 15:38:53 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
Version: '5.0.19-max-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Edition - Experimental (GPL)

Dave,

Thank you for the problem report. I am really sorry for the delay in processing it...

Let me summarize the current situation. You have identified a table with 1M of data and some query (or sequence of queries) that crashes 64-bit 5.0.19 each and every time. 32-bit 5.0.19 works OK with it, though. Is that correct?

Please send the uname -a result from your machine and the exact version of the glibc library used. Your my.cnf might also be useful.

Can you also send the SHOW CREATE TABLE and SHOW TABLE STATUS results for the table used in the original report (the one with the resolved stack trace, on 5.0.18)?

Valeriy - Thank you for responding.

Your summary of the current situation is not quite accurate so I'll restate it:

1. I have had frequent crashes (every 95 mins or so) on 5.0.18 on both 32 and 64 bit servers.  

2. I have identified a table and a single query that invariably crashes 5.0.18 on a 32 bit server. It does not crash 5.0.19. (I don't know if the table crashed 5.0.18 on a 64 bit MySQL because I had upgraded it to 5.0.19 by the time I found the table/query.)

3. 64-bit MySQL 5.0.19 on a 64 bit server continues to crash every 95 mins or so, but I have had no luck identifying any particular query or sequence that causes it. I have provided you the query log, in case it helps. This is my big problem.

4. I have found a work-around: using 32 bit MySQL 5.0.19 on my 64 bit server. It appears to be stable based on the last 12 hours without a crash.

5. My 32-bit servers are now running 5.0.19 and crashing less frequently, but I have not found a query/table that is causing it.

I will provide the data you asked for.

Thank you for the additional information. What exact MySQL binaries did you use? What exact glibc version do you have, in case they are not static?

On my 64-bit machine, which continues to have massive instability with
this binary MySQL-server-5.0.19-0.glibc23.x86_64.rpm
(ie statically linked).
>rpm -q glibc
glibc-2.3.5-10
glibc-2.3.5-10
glibc-2.3.5-10.3
glibc-2.3.5-10.3
>getconf GNU_LIBC_VERSION
glibc 2.3.5

On the 32-bit servers, I also use the statically bound glibc. binaries=
MySQL-server-5.0.18-0.i386.rpm
They also have glibc 2.3.5.

I have 26 SMP MySQL servers.
Linux 2.6.16 - libc6 2.3.6-4   (Debian)
I tried compiling MySQL with gcc-3.4 and 4.0.3 - the bug always occurs...
I tried drastically reducing the memory used by MySQL, but it changes nothing (I only get fewer cache hits :( ). I have used the same config from 4.0.x (which never had this bug!) and 4.1.15 (which sometimes had bugs with authentication)... through to 5.0.19 => memory leak?

Note that sometimes mysqld hangs: sometimes mysqld_safe can launch another one, and sometimes I need to kill it...

I use these options to launch the server:
--skip-new --skip-isam --skip-bdb --skip-innodb --skip-symbolic-links --skip-host-cache --skip-external-locking --skip-name-resolve   --default-storage-engine=myisam  --skip-locking

I have over 10 000 databases per computer.

I have some stack traces in a file.
Ask me if you want me to do more...

I can't attach a file...
So:

Machine 1:

mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=524288000
read_buffer_size=8384512
max_used_connections=20
max_connections=1000
threads_connected=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 114776 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x92dccbd0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x8ffdee18, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x815e19c
0xffffe420
(nil)
0x81cfd93
0x81761b5
0x817e124
0x817e762
Stack trace seems successful - bottom reached
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x8d0fd68 = INSERT DELAYED INTO nuke_blocked_pagetracker (last_page ,page_date ,id_tracker) VALUES ('/modules.php?name=Birthday', '1143604670', '12637')
thd->thread_id=18366
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0
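
The estimate printed for Machine 1 (114776 K) is smaller than key_buffer_size alone (524288000 bytes = 512000 K), so the printed figure cannot be the true sum. One possible explanation, offered only as a hypothesis, is that the sum wrapped around in 32-bit unsigned arithmetic on what appears to be a 32-bit build. The Python sketch below assumes sort_buffer_size=8388600, which does not appear in the log and was chosen only because it reproduces the printed number. The function name is illustrative.

```python
def estimate_kb(key_buffer_size, read_buffer_size, sort_buffer_size,
                max_connections, word_bits=None):
    """Evaluate the crash-report formula in K, optionally wrapping the sum
    like an unsigned integer that is word_bits wide."""
    total = (key_buffer_size
             + (read_buffer_size + sort_buffer_size) * max_connections)
    if word_bits is not None:
        total %= 2 ** word_bits
    return total // 1024


# Values from the Machine 1 log; sort_buffer_size=8388600 is an ASSUMPTION.
args = (524288000, 8384512, 8388600, 1000)
print(estimate_kb(*args))                # unwrapped total in K
print(estimate_kb(*args, word_bits=32))  # 114776, matching the log
```

If the hypothesis holds, the real worst case for this configuration would be about 16 GiB, far more than the printed figure suggests.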

machine 12:
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=524288000
read_buffer_size=8384512
max_used_connections=45
max_connections=1000
threads_connected=12
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 114776 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x8b52fdb8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x8aafae18, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x815e19c
0xffffe420
(nil)
0x81cfd93
0x81761b5
0x817e124
0x817e762
0x818037d
0xb7f45e60
0xb7d8688e
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x91d687d0 is invalid pointer
thd->thread_id=38482
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0

key_buffer_size=524288000
read_buffer_size=8384512
max_used_connections=11
max_connections=1000
threads_connected=8
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 114776 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x92681540
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x940eee18, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x815e19c
0xffffe420
(nil)
0x81cfd93
0x81761b5
0x817e124
0x817e762
Stack trace seems successful - bottom reached
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x92780448 is invalid pointer
thd->thread_id=396
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0

Machine 20:

key_buffer_size=524288000
read_buffer_size=2093056
max_used_connections=22
max_connections=400
threads_connected=6
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 2148796 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x97141730
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x960fea48, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x8166358
0xffffe420
0x82d99ac
0x8189804
0xb7edbced
0xb7d1cd7e
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x97802a0 = SELECT
a.filename, a.attachmentType, a.ID_ATTACH, a.ID_MEMBER, m.ID_MSG,
IFNULL(thumb.ID_ATTACH, 0) AS ID_THUMB, thumb.filename AS thumb_filename, thumb_parent.ID_ATTACH AS ID_PARENT
FROM (smf_attachments AS a, smf_messages AS m)
LEFT JOIN smf_attachments AS thumb ON (thumb.ID_ATTACH = a.ID_THUMB)
LEFT JOIN smf_attachments AS thumb_parent ON (a.attachmentType = 3 AND thumb_parent.ID_THUMB = a.ID_ATTACH)
WHERE a.attachmentType = 0 AND m.ID_TOPIC = 21
AND m.ID_MSG = a.ID_MSG
thd->thread_id=120396
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0

All reporters: 

Have you installed a fresh copy of MySQL 5.0.x and loaded data from the dump? Or just used tables created in a 4.x.y version of MySQL? To put it simply, have you read and followed the manual on upgrade procedures, scripts, etc.?

Yohan:

This question is especially important for you. I found the following:

thd->query at 0x8d0fd68 = INSERT DELAYED INTO nuke_blocked_pagetracker
(last_page ,page_date ,id_tracker) VALUES ('/modules.php?name=Birthday',
'1143604670', '12637')

in your quotes from the error log. INSERT DELAYED into tables with VARCHAR fields will almost surely lead to crashes if you just used tables from older versions of MySQL. It is a well-known bug.

I just use the data from the older tables.
I manage something like 250 GB of databases over 26 servers.
I can't dump or load anything...

"INSERT DELAYED into tables with VARCHAR fields almost surely will lead to crashes, if you just used tables from older versions of MySQL. It is a well-known bug"

Why?
Will it be corrected?

The INSERT DELAYED crash should have been fixed as of 5.0.16. See bug #13707 for the details. Check whether the crash is still repeatable as described there, and if so, add a comment to that bug report!

Recommended upgrade procedure is described in http://dev.mysql.com/doc/refman/5.0/en/upgrading-from-4-1.html. 

Read also this page, if you are on 5.0.19 (http://dev.mysql.com/doc/refman/5.0/en/mysqlcheck.html):

"--check-upgrade, -g

Invoke CHECK TABLE with the FOR UPGRADE option to check tables for incompatibilities with the current version of the server. This option was added in MySQL 5.0.19."

This is what you should do if dump and restore is not an option!
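
For servers with many databases, the check can be scripted. This is a minimal Python sketch that builds the mysqlcheck command line. --check-upgrade comes from the manual text quoted above, while --all-databases and --databases are standard mysqlcheck options. It assumes mysqlcheck is on the PATH and that credentials come from an option file such as ~/.my.cnf. The function names are hypothetical.

```python
import subprocess


def check_upgrade_command(databases=None):
    """Build a mysqlcheck command that runs CHECK TABLE ... FOR UPGRADE."""
    cmd = ["mysqlcheck", "--check-upgrade"]
    if databases:
        # Check only the named databases, e.g. one batch at a time.
        cmd += ["--databases", *databases]
    else:
        cmd.append("--all-databases")
    return cmd


def run_check_upgrade(databases=None):
    """Run the command and return the completed process (needs a live server)."""
    return subprocess.run(check_upgrade_command(databases),
                          capture_output=True, text=True)


print(" ".join(check_upgrade_command()))
print(" ".join(check_upgrade_command(["mirror"])))
```

Building the command separately from running it makes it easy to review what will be executed before pointing it at a production server.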

I tried to reproduce the INSERT DELAYED bug and ran mysqlcheck on some tables, but nothing special happened. Everything was OK.

Something else is producing my sig11 / sig6 bug...

I too am experiencing this problem on RHEL3 with kernel version 2.4.21-40. I have duplicated this on another server running 5.0.19, both a custom build and the i386 RPM (well just the server RPM) from mysql.com. The symptoms are the same in regards to the signal 11:

060408 1:03:42 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.19-standard-log' socket: '/home/virtual/FILESYSTEMTEMPLATE/.mysqlsock/mysql.sock' port: 3306 MySQL Community Edition - Standard (GPL)
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388608
read_buffer_size=2093056
max_used_connections=3
max_connections=80
threads_connected=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 335551 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x9165e40
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x86932c, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x815d800
0x572f88
0x8171443
0x817a1e8
0x8171443
0x8170f7d
0x81704c0
0x56cdd8
0xbb5d1a
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x9177fa8 = SELECT a.filename, a.ID_ATTACH, a.ID_MEMBER, m.ID_MSG, IFNULL(thumb.ID_ATTACH, 0) AS ID_THUMB, thumb.filename AS thumb_filename, thumb_parent.ID_ATTACH AS ID_PARENT FROM (sm_attachments AS a, sm_messages AS m) LEFT JOIN sm_attachments AS thumb ON (thumb.ID_ATTACH = a.ID_THUMB) LEFT JOIN sm_attachments AS thumb_parent ON (a.attachmentType = 3 AND thumb_parent.ID_THUMB = a.ID_ATTACH) WHERE a.attachmentType = 0 AND m.ID_TOPIC = 30 AND m.ID_MSG = a.ID_MSG
thd->thread_id=58
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

I have dumped the database and reimported it as well. From some tests, though, I know that if I reduce the number of fields selected to 1, it works fine. Regardless of which fields are selected, if I select more than 1, the server crashes.

I tried 5.0.21 from BitKeeper, still the same errors:

Version: '5.0.21-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  Source distribution
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=258048
max_used_connections=13
max_connections=100
threads_connected=6
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 92783 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0xaa718590
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xac395968, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x816f42f
0xb7e43541
(nil)
0x8129978
0x81d91d2
0x827a9fc
0x827abe3
0x8278131
0x8278749
0x8117625
0x81174cf
0x811c3db
0x80fa0c2
0x8162c70
0x81b86ae
0x81b9e6a
0x81b5f25
0x8183a6c
0x81da898
0x81d935c
0x818317b
0x8181d0d
0x8181082
0xb7e3deea
0xb7cb7cea
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do 
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0xa8dbc7c8  is invalid pointer
thd->thread_id=1109
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

florianen@workplaces-v5 ~/mysql $ resolve_stack_dump -s mysqld.sym -n mysqld.stack
0x816f42f handle_segfault + 703
0xb7e43541 _end + -1350023471
(nil)
0x8129978 _ZN13Item_cond_and20copy_andor_structureEP3THD + 120
0x81d91d2 _Z22reinit_stmt_before_useP3THDP6st_lex + 498
0x827a9fc _ZN13sp_lex_keeper23reset_lex_and_exec_coreEP3THDPjbP8sp_instr + 156
0x827abe3 _ZN13sp_instr_stmt7executeEP3THDPj + 211
0x8278131 _ZN7sp_head7executeEP3THD + 833
0x8278749 _ZN7sp_head16execute_functionEP3THDPP4ItemjP5Field + 633
0x8117625 _ZN12Item_func_sp12execute_implEP3THDP5Field + 133
0x81174cf _ZN12Item_func_sp7executeEPP5Field + 63
0x811c3db _ZN12Item_func_sp8val_realEv + 27
0x80fa0c2 _ZN4Item4sendEP8ProtocolP6String + 370
0x8162c70 _ZN11select_send9send_dataER4ListI4ItemE + 192
0x81b86ae _ZN4JOIN4execEv + 622
0x81b9e6a _Z12mysql_selectP3THDPPP4ItemP13st_table_listjR4ListIS1_ES2_jP8st_orderSB_S2_SB_mP13select_resultP18st_select_lex_unitP13st_sel + 202
0x81b5f25 _Z13handle_selectP3THDP6st_lexP13select_resultm + 309
0x8183a6c _Z21mysql_execute_commandP3THD + 636
0x81da898 _ZN18Prepared_statement7executeEP6Stringb + 616
0x81d935c _Z18mysql_stmt_executeP3THDPcj + 332
0x818317b _Z16dispatch_command19enum_server_commandP3THDPcj + 5147
0x8181d0d _Z10do_commandP3THD + 141
0x8181082 handle_one_connection + 434
0xb7e3deea _end + -1350045574
0xb7cb7cea _end + -1351643526
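
The class and function names in the resolved trace above can be read off the mangled symbols by hand if c++filt is not available. The following is only a minimal Python sketch, assuming the symbols use the simple length-prefixed form seen in this trace (`_Z<len><name>` and `_ZN<len><name>...E`). The helper name `qualified_name` is hypothetical, and templates, substitutions and argument types are deliberately ignored.

```python
import re


def qualified_name(mangled: str) -> str:
    """Return the scope-qualified function name from a simple mangled symbol.

    Handles only `_Z<len><name>` and nested `_ZN<len><name>...E` forms.
    Anything that does not match is returned unchanged.
    """
    if not mangled.startswith("_Z"):
        return mangled  # plain C symbol such as handle_segfault
    pos = 2
    nested = mangled.startswith("N", pos)
    if nested:
        pos += 1
        # Skip cv-qualifiers that may precede the names of member functions.
        while pos < len(mangled) and mangled[pos] in "rVK":
            pos += 1
    parts = []
    while True:
        match = re.match(r"\d+", mangled[pos:])
        if not match:
            break  # reached 'E' or the start of the parameter types
        length = int(match.group())
        start = pos + match.end()
        parts.append(mangled[start:start + length])
        pos = start + length
        if not nested:
            break  # an unscoped function has exactly one name component
    return "::".join(parts) if parts else mangled


# Frames copied from the resolved trace above.
for symbol in (
    "_ZN13Item_cond_and20copy_andor_structureEP3THD",
    "_Z22reinit_stmt_before_useP3THDP6st_lex",
    "_ZN7sp_head7executeEP3THD",
    "handle_segfault",
):
    print(qualified_name(symbol))
```

Read this way, the crash is in Item_cond_and::copy_andor_structure, called from reinit_stmt_before_use while a stored function runs inside a prepared statement. For anything beyond these simple symbols, c++filt is the more reliable tool.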

I made another Bug-Report at:
http://bugs.mysql.com/bug.php?id=18311
Which looks to me like the identical issue.

Dave:

Can you, please, upload a dump of that smallest `crashing` table you described in private comment? There is no data in that comment. 

Have you tried to repeat with a newer version, 5.0.20a?

Other reporters:

Exact dumps of your tables, or any other way to repeat the crash on 5.0.20a (with similar queries), are welcome. If your query or stack trace is very different, please open a new bug report.

Valeriy,
I could not get the smallest table down to the size limit for uploads. The smallest is about 1MB.
(The crash went away if I deleted more rows or if I nulled some larger columns).

Reminder: the only repeatably crashing table crashed on 5.0.18. 

I will re-try with the latest release.

Hi,

I also get the signal 11 error, but on my server it occurs exactly once a day, every day at the same time. I checked everything, and there is no process/script/program that is executed by the system at that time. The server is running Debian Sarge unstable with (at the moment) MySQL 5.0.20a-Debian_2. I went through several versions of MySQL (in hope), but no solution so far.

Some parts of my syslog:

Apr 29 06:25:33 localhost mysqld[17128]: mysqld got signal 11;
Apr 29 06:25:33 localhost mysqld[17128]: This could be because you hit a bug. It is also possible that this binary
Apr 29 06:25:33 localhost mysqld[17128]: or one of the libraries it was linked against is corrupt, improperly built,
Apr 29 06:25:33 localhost mysqld[17128]: or misconfigured. This error can also be caused by malfunctioning hardware.
Apr 29 06:25:33 localhost mysqld[17128]: We will try our best to scrape up some info that will hopefully help diagnose
Apr 29 06:25:33 localhost mysqld[17128]: the problem, but since we have already crashed, something is definitely wrong
Apr 29 06:25:33 localhost mysqld[17128]: and this may fail.
Apr 29 06:25:33 localhost mysqld[17128]: 
Apr 29 06:25:33 localhost mysqld[17128]: key_buffer_size=536870912
Apr 29 06:25:33 localhost mysqld[17128]: read_buffer_size=131072
Apr 29 06:25:33 localhost mysqld[17128]: max_used_connections=101
Apr 29 06:25:33 localhost mysqld[17128]: max_connections=250
Apr 29 06:25:33 localhost mysqld[17128]: threads_connected=5
Apr 29 06:25:33 localhost mysqld[17128]: It is possible that mysqld could use up to 
Apr 29 06:25:33 localhost mysqld[17128]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 1068286 K
Apr 29 06:25:33 localhost mysqld[17128]: bytes of memory
Apr 29 06:25:33 localhost mysqld[17128]: Hope that's ok; if not, decrease some variables in the equation.
Apr 29 06:25:33 localhost mysqld[17128]: 
Apr 29 06:25:34 localhost mysqld_safe[19894]: Number of processes running now: 0
Apr 29 06:25:34 localhost mysqld_safe[19896]: restarted
Apr 29 06:25:34 localhost mysqld[19900]: 060429  6:25:34  InnoDB: Started; log sequence number 0 43685
Apr 29 06:25:34 localhost mysqld[19900]: 060429  6:25:34 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=boincstats-relay-bin' to avoid this problem.
Apr 29 06:25:34 localhost mysqld[19900]: 060429  6:25:34 [Note] /usr/sbin/mysqld: ready for connections.
Apr 29 06:25:34 localhost mysqld[19900]: Version: '5.0.20a-Debian_1'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  Debian Etch distribution
Apr 29 06:25:34 localhost mysqld[19900]: 060429  6:25:34 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.010683' at position 98, relay log './boincstats-relay-bin.000040' position: 235
Apr 29 06:25:34 localhost mysqld[19900]: 060429  6:25:34 [Note] Slave I/O thread: connected to master 'repl@217.67.229.236:3306',  replication started in log 'mysql-bin.010683' at position 98
Apr 30 06:26:18 localhost mysqld[19900]: mysqld got signal 11;
Apr 30 06:26:18 localhost mysqld[19900]: This could be because you hit a bug. It is also possible that this binary
Apr 30 06:26:18 localhost mysqld[19900]: or one of the libraries it was linked against is corrupt, improperly built,
Apr 30 06:26:18 localhost mysqld[19900]: or misconfigured. This error can also be caused by malfunctioning hardware.
Apr 30 06:26:18 localhost mysqld[19900]: We will try our best to scrape up some info that will hopefully help diagnose
Apr 30 06:26:18 localhost mysqld[19900]: the problem, but since we have already crashed, something is definitely wrong
Apr 30 06:26:18 localhost mysqld[19900]: and this may fail.
Apr 30 06:26:18 localhost mysqld[19900]: 
Apr 30 06:26:18 localhost mysqld[19900]: key_buffer_size=536870912
Apr 30 06:26:18 localhost mysqld[19900]: read_buffer_size=131072
Apr 30 06:26:18 localhost mysqld[19900]: max_used_connections=72
Apr 30 06:26:18 localhost mysqld[19900]: max_connections=250
Apr 30 06:26:18 localhost mysqld[19900]: threads_connected=2
Apr 30 06:26:18 localhost mysqld[19900]: It is possible that mysqld could use up to 
Apr 30 06:26:18 localhost mysqld[19900]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 1068286 K
Apr 30 06:26:18 localhost mysqld[19900]: bytes of memory
Apr 30 06:26:18 localhost mysqld[19900]: Hope that's ok; if not, decrease some variables in the equation.
Apr 30 06:26:18 localhost mysqld[19900]: 
Apr 30 06:26:18 localhost mysqld_safe[5518]: Number of processes running now: 0
Apr 30 06:26:18 localhost mysqld_safe[5520]: restarted
Apr 30 06:26:19 localhost mysqld[5526]: 060430  6:26:19  InnoDB: Started; log sequence number 0 43685
Apr 30 06:26:19 localhost mysqld[5526]: 060430  6:26:19 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=boincstats-relay-bin' to avoid this problem.
Apr 30 06:26:19 localhost mysqld[5526]: 060430  6:26:19 [Note] /usr/sbin/mysqld: ready for connections.
Apr 30 06:26:19 localhost mysqld[5526]: Version: '5.0.20a-Debian_1'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  Debian Etch distribution
Apr 30 06:26:19 localhost mysqld[5526]: 060430  6:26:19 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.010684' at position 98, relay log './boincstats-relay-bin.000044' position: 235
Apr 30 06:26:19 localhost mysqld[5526]: 060430  6:26:19 [Note] Slave I/O thread: connected to master 'repl@217.67.229.236:3306',  replication started in log 'mysql-bin.010684' at position 98
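
The memory estimate printed in the crash logs above is plain arithmetic and can be checked by hand. Here is a minimal Python sketch of it. The function name `max_memory_k` is hypothetical. sort_buffer_size is not printed in the logs, so the two values used below are assumptions, chosen only because they reproduce the logged totals. Truncating to whole K is also an assumption about how mysqld prints the figure.

```python
def max_memory_k(key_buffer_size, read_buffer_size, sort_buffer_size, max_connections):
    """Evaluate key_buffer_size + (read_buffer_size + sort_buffer_size) * max_connections.

    Inputs are in bytes; the result is in K (1024 bytes), truncated.
    """
    total_bytes = key_buffer_size + (read_buffer_size + sort_buffer_size) * max_connections
    return total_bytes // 1024


# Values from the syslog excerpt above; sort_buffer_size=2097144 is ASSUMED.
print(max_memory_k(536870912, 131072, 2097144, 250))  # 1068286

# Values from the 5.0.21 crash log; sort_buffer_size=524280 is ASSUMED.
print(max_memory_k(16777216, 258048, 524280, 100))  # 92783
```

In both logs the estimate is dominated by the per-connection buffers multiplied by max_connections, so it is an upper bound for the configured limit and not a measurement of what the server was using when it crashed.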

Dave,

You may upload your large file(s) to ftp://ftp.mysql.com/pub/mysql/upload/, with the bug number (18137) in the filename(s). Please add a comment once you have done so.

Have you tried to repeat on the latest version, 5.0.21?

I have uploaded bug # (18137).zip, which contains the table. I'm not sure you should spend too much time on that table: it was crashing repeatably on 5.0.18 but has not since.

The instability that I need to check on 5.0.21 occurs on the 64-bit SMP system only, and my 64-bit system has hardware problems at the moment, so I haven't been able to try it yet, sorry.
Dave

I have some news:


With 5.0.22-nightly-20060504 (I tried that; fewer crashes).

table:

CREATE TABLE `phpwebgallery_image_category` (
  `image_id` mediumint(8) unsigned NOT NULL default '0',
  `category_id` smallint(5) unsigned NOT NULL default '0',
  PRIMARY KEY  (`image_id`,`category_id`),
  KEY `category_id` (`category_id`),
  KEY `image_id` (`image_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci 

with :

mysql> select * from phpwebgallery_image_category ;
+----------+-------------+
| image_id | category_id |
+----------+-------------+
|        1 |           2 |
|        2 |           2 |
|        3 |           2 |
|        4 |           2 |
|        5 |           2 |
|        6 |           2 |
|        7 |           2 |
+----------+-------------+
7 rows in set (0.02 sec)

This query crashes the MySQL server:

select image_id from toto where category_id not in ( 5, -1  );

It seems the -1 is the trigger.
NOT IN ( -1 ) works, but with some other values it crashes.

Same thing with 5.0.22-nightly-20060512-debug...

Tordjman,

Why do you think your crash is related to the original report? If it is repeatable, please report it as a separate bug.

Signal 11, the same error.
OK, I will open a new bug.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

I have now isolated the problem to precise statements, but I still can't give you a small reproducible example.

If I issue
  select count(*) from db1.t1
  select count(*) from db2.t2

repeatedly in reasonably rapid succession, the server hangs, but the kicker is that t1 and t2 are both large MERGE tables -- 50 to 250 merged tables, with 250,000,000 to 750,000,000 rows in the merge.

"rapidly" means, for example, as a one line statement in the mysql console:

select count(*) from db1.t1; select count(*) from db2.t2;select count(*) from db1.t1; select count(*) from db2.t2;select count(*) from db1.t1; select count(*) from db2.t2;

It also happens at a slower pace, for example when issuing the selects from Java as distinct statements, even from a remote client, so the timing is not especially critical.

"repeatedly" means: between ONCE (it hangs on the second select) and several hundred times. The bug seems somewhat intermittent.
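
To make "rapidly" and "repeatedly" concrete, the one-line statement can be generated for any number of repetitions. This is a minimal Python sketch only: the helper name `build_repro_line` is hypothetical, and db1.t1 and db2.t2 are the placeholder names used above for the two large MERGE tables.

```python
def build_repro_line(tables, repetitions):
    """Build one line that alternates `select count(*)` over the given tables."""
    statements = [
        f"select count(*) from {table};"
        for _ in range(repetitions)
        for table in tables
    ]
    return " ".join(statements)


# Three rounds over the two MERGE tables, as in the one-liner above.
print(build_repro_line(["db1.t1", "db2.t2"], 3))
```

The output can be pasted into the mysql console as a single line. Since the hang appeared anywhere between the second select and several hundred rounds, a larger repetition count may be needed before anything happens.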

I am now using 5.0.45, so I have updated the bug's version. The details of the failing system are:
uname -a
Linux D4 2.6.11-1.1369_FC4smp #1 SMP Thu Jun 2 23:08:39 EDT 2005 i686 i686 i386 GNU/Linux
Distribution
Fedora Core release 4 (Stentz)
GNU C Library development release version 2.3.5, by Roland McGrath et al.
GNU_LIBC_VERSION
glibc 2.3.5
rpm -q glibc
glibc-2.3.5-10
rpm -q MySQL-server
MySQL-server-5.0.45-0

When it hangs, the server appears to be in a tight loop. A single mysql process is using 100% of the CPU (or 100% of 1 CPU on a 4-CPU system).
If I have a mysql console open, it can still do "show processlist;". It shows many processes, all with high values for "Time", including trivial selects such as "SELECT 1". See the example below.

+----+--------+-----------------+-----------+---------+------+-------+------------------------------------------------------------------------------------------------------+
| Id | User   | Host            | db        | Command | Time | State | Info                                                                                                 |
+----+--------+-----------------+-----------+---------+------+-------+------------------------------------------------------------------------------------------------------+
|  1 | tomcat | localhost:33549 | coldlogic | Query   |   61 | NULL  | select count(*) from lcr_titletotals.master                                                          | 
|  2 | tomcat | localhost:33550 | coldlogic | Query   |   53 | NULL  | select value from properties where name='autostart_wait' and system in('','coldlogic@d4') order by i | 
|  3 | root   | localhost       | NULL      | Query   |   17 | NULL  | select 1                                                                                             | 
|  4 | root   | localhost       | NULL      | Query   |    0 | NULL  | show processlist                                                                                     | 
+----+--------+-----------------+-----------+---------+------+-------+------------------------------------------------------------------------------------------------------+

If I attempt to start a mysql console it locks after "Welcome to the MySQL monitor.  Commands end with ; or \g." and before it writes ">".
Show processlist on the working console suggests it is locking on "select @@version_comment limit 1".

If I "kill -11 pid" on the PID of the 100% busy process, I get a stack trace in the .err file.

The traceback is:

0x80a819e handle_segfault + 510
0x8367d08 pthread_sighandler + 184
0x8334c67 movelink + 23
0x8334fa1 my_hash_insert + 673
0x81881d1 insert_table__11Query_cacheUiPcP23Query_cache_block_tableUiUcPFP3THDPcUiPUx_cUx + 193
0x818808c register_tables_from_list__11Query_cacheP13st_table_listUiP23Query_cache_block_table + 492
0x81880fc register_all_tables__11Query_cacheP17Query_cache_blockP13st_table_listUi + 28
0x8186499 store_query__11Query_cacheP3THDP13st_table_list + 649
0x80bb225 mysql_execute_command__FP3THD + 1493
0x80c180d mysql_parse__FP3THDPCcUiPPCc + 253
0x80b9436 dispatch_command__F19enum_server_commandP3THDPcUi + 1894
0x80b8cc3 do_command__FP3THD + 211
0x80b8195 handle_one_connection + 965
0x83654bc pthread_start_thread + 220
0x838f99a thread_start + 4

Here is my.cnf:

[mysqld]
########## server specific - depends on memory and file layout
socket		= /var/lib/mysql/mysql.sock

# key_buffer memory for D4 (4GB of memory, 4 processors)
key_buffer=2048M
max_allowed_packet=4M
table_cache=512
sort_buffer=2M
query_cache_size=10M
thread_cache=8
thread_concurrency=8
myisam_sort_buffer_size=20M
low_priority_updates=1
log=query.log
datadir=/data/mysql
tmpdir=/data/mysql/tmpdir
## added 
lower_case_table_names=1
#
sql-mode=PIPES_AS_CONCAT
group_concat_max_len = 65535
max_heap_table_size=200M
open_files_limit=3000

port=3306
#socket=MySQL
skip-locking
skip-innodb
skip-bdb     

record_buffer=128K
read_rnd_buffer_size=512K

net_buffer_length=8K

Because it is failing on three servers, it is unlikely to be corrupted data, corrupted binaries, or a hardware problem (although I have done lots of CHECK and REPAIR on the tables and have refreshed the binaries, which made no difference).

I am hoping that this additional information will allow you to make some progress to diagnose the issue, since it is crashing all my servers, all the time, and it has taken months of work to isolate it this far.

I have worked around the problem by removing 'select count(*) from .. large MERGE tables' and the regular mysql server crashing/hanging has stopped, confirming that the crashing was caused by those selects.

Please, send the EXPLAIN results for this problematic SELECT.

You asked for the EXPLAIN .. it is:

Id	Select Type	Table	Type	Possible Keys	Key	Key Len	Ref	Rows	Extra
1	SIMPLE								Select tables optimized away

Recall that the SELECT is "SELECT count(*) from TABLE"
where TABLE is a huge MERGE table.

Dave, I might have the same problem as you have.
Unfortunately your workaround (basically by reducing merge table usage)
cannot be applied in general in our case since external customers
do use the merge tables directly (for reporting).

I also have a bug open for this case. It might be the same
underlying problem as you seem to have, so feel free to have a look:

http://bugs.mysql.com/bug.php?id=33362

Bjorn, I did not really reduce the use of MERGE tables: I make huge use of huge merge tables. The only thing I did was replace 'select count(*)' on the merge table. I maintain a control table with a row per table in the merge, and it includes the row count, so I replaced the SELECT count(*) FROM MERGE with SELECT sum(row_count) FROM MERGE_CONTROL_TABLE.
I can do this because in my environment I control all the SQL. My users only get the results!
The really hard part of this bug is isolating the problem.
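
The control-table workaround described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's sqlite3 module as a stand-in, because SQLite has no MERGE tables, and the table name merge_control, the helper names and the part-table names are all hypothetical. Only the idea of one row per underlying table with a row_count column comes from the description above.

```python
import sqlite3


def record_table(conn, table_name, row_count):
    """Insert or update the control row for one underlying table of the merge."""
    conn.execute(
        "INSERT OR REPLACE INTO merge_control (table_name, row_count) VALUES (?, ?)",
        (table_name, row_count),
    )


def total_rows(conn):
    """Row count of the whole merge, read from the control table instead of COUNT(*)."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(row_count), 0) FROM merge_control"
    ).fetchone()
    return total


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE merge_control ("
    " table_name TEXT PRIMARY KEY,"
    " row_count INTEGER NOT NULL)"
)

# One control row per underlying table; the counts here are made up.
record_table(conn, "t1_part_001", 5_000_000)
record_table(conn, "t1_part_002", 7_500_000)
print(total_rows(conn))  # 12500000
```

The trade-off is that every load or delete on an underlying table must also update its control row, which is only practical when, as here, all the SQL is under the application's control.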

Please try to repeat with a newer version, 5.0.51a, and let us know the results.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

We still need to know if the issues are fixed by using our newest version (5.0.51a).

This bug is hard for me to test because it requires huge tables that I only have on a production system, and the bug crashes the production system, which is not something I do lightly. My workaround is keeping the production system going, or hiding the bug if it is still there. I'll test it when a safe window arises.

Dave,

Take your time. I will just set this back to "Need feedback". It will switch automatically to "No feedback" after one month. Then we will ask again and set it back to "Need feedback".

We just need to know whether you have this problem with a newer version or not. For us it doesn't matter whether you test it this month or next month. But when you test it, please use our newest version.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

I can no longer reproduce this problem, not even on 5.0.45. This may be because reproducing it requires production data, which has grown in size. Since I can't reproduce it on 5.0.45, I can't prove anything about whether a later version fixes it. Sorry, this is (or was) a very annoying bug.