Bug #23944 MySQL crashes periodically
Submitted: 3 Nov 2006 14:39
Modified: 27 Jan 2008 8:08
Reporter: Kris Buytaert (Candidate Quality Contributor)
Status: Can't repeat
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.0.27
OS: Linux (RHEL 4.4)
Assigned to: Jonas Oreland
CPU Architecture: Any
Tags: ndbd crash index join

[3 Nov 2006 14:39] Kris Buytaert
Description:
mysqld frequently crashes under higher load. On 4 different nodes with similar, or even identical

Probably when getting data back from ndbd nodes.

mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=805306368
read_buffer_size=258048
max_used_connections=73
max_connections=500
threads_connected=73
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 1040428 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x76ddbc98
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x764cd29c, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x819a960
0x8430d01
0x8430eab
0x825e658
0x825f71a
0x81f1cad
0x81f0ed0
0x81f0a87
0x81e1936
0x81e212e
0x81de100
0x81b0e19
0x81b85d0
0x81af43b
0x81aef6d
0x81ae4cf
0x97c371
0x8d4ffe
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0xaf07af0 = SELECT        link.id AS link_id,link.advertentie_id,link.campagne_id,
                                '0' AS netwerk_id,type,src,content_type
                FROM    link
                INNER JOIN advertentie
                        ON      advertentie.id = link.advertentie_id
                INNER JOIN campagne
                        ON      campagne.id = advertentie.campagne_id
                        AND campagne.status = 'default_campagne'
                        AND     campagne.id = 1
                WHERE   link.bannermaat_id = 1
                LIMIT 1
thd->thread_id=74
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0
Stack trace is identical on multiple nodes.
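
The memory estimate printed in the crash log above can be checked by hand. The sketch below is only illustrative: sort_buffer_size is not printed in the log, so the value used here is an assumption, chosen because it reproduces the reported total of 1040428 K.

```python
# Values printed in the crash log above.
KEY_BUFFER_SIZE = 805306368   # bytes
READ_BUFFER_SIZE = 258048     # bytes
MAX_CONNECTIONS = 500

# ASSUMPTION: sort_buffer_size does not appear in the log. 262136 is a
# hypothetical value picked because it reproduces the reported total.
SORT_BUFFER_SIZE = 262136     # bytes (assumed)


def worst_case_kib(key_buffer, read_buffer, sort_buffer, max_connections):
    """Return key_buffer + (read_buffer + sort_buffer) * max_connections in K."""
    total_bytes = key_buffer + (read_buffer + sort_buffer) * max_connections
    return total_bytes // 1024


print(worst_case_kib(KEY_BUFFER_SIZE, READ_BUFFER_SIZE,
                     SORT_BUFFER_SIZE, MAX_CONNECTIONS), "K")
```

Under that assumption, key_buffer_size alone accounts for 786432 K (768 MB) of the estimate, and the per-connection buffers make up the rest.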

resolve_stack_dump -s 

[root@daisycon-sql03 mysql]# resolve_stack_dump -s mysqld-max.sym /tmp/stack
Stack range sanity check OK, backtrace follows:
0x819a960 handle_segfault + 416
0x8430d01 _ZN14NdbTransaction24getNdbIndexScanOperationEPK12NdbIndexImplPK12NdbTableImpl + 113
0x8430eab _ZN14NdbTransaction24getNdbIndexScanOperationEPKN13NdbDictionary5IndexEPKNS0_5TableE + 59
0x825e658 _ZN13ha_ndbcluster18ordered_index_scanEPK12st_key_rangeS2_bbPc + 120
0x825f71a _ZN13ha_ndbcluster10index_readEPcPKcj16ha_rkey_function + 170
0x81f1cad _Z20join_read_always_keyP13st_join_table + 173
0x81f0ed0 _Z10sub_selectP4JOINP13st_join_tableb + 272
0x81f0a87 _Z9do_selectP4JOINP4ListI4ItemEP8st_tableP9Procedure + 295
0x81e1936 _ZN4JOIN4execEv + 4406
0x81e212e _Z12mysql_selectP3THDPPP4ItemP13st_table_listjR4ListIS1_ES2_jP8st_orderSB_S2_SB_mP13select_resultP18st_select_lex_unitP13st_sel + 286
0x81de100 _Z13handle_selectP3THDP6st_lexP13select_resultm + 352
0x81b0e19 _Z21mysql_execute_commandP3THD + 681
0x81b85d0 _Z11mysql_parseP3THDPcj + 304
0x81af43b _Z16dispatch_command19enum_server_commandP3THDPcj + 1147
0x81aef6d _Z10do_commandP3THD + 141
0x81ae4cf handle_one_connection + 655
0x97c371 (?)
0x8d4ffe (?)
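
The /tmp/stack file passed to resolve_stack_dump above presumably holds the raw frame addresses from the crash log. A minimal sketch of pulling those addresses out of an error log follows; the function name and the sample text are made up for illustration, and the parsing assumes the log layout shown in this report.

```python
import re

# A frame line in the unresolved backtrace is a bare hexadecimal address.
_ADDRESS = re.compile(r"^0x[0-9a-fA-F]+$")


def extract_backtrace(log_text):
    """Return the raw frame addresses that follow 'backtrace follows:'."""
    addresses = []
    in_trace = False
    for line in log_text.splitlines():
        line = line.strip()
        if not in_trace:
            # The address list starts right after this marker line.
            in_trace = line.endswith("backtrace follows:")
            continue
        if _ADDRESS.match(line):
            addresses.append(line)
        else:
            # The first non-address line (e.g. "New value of fp...") ends the trace.
            break
    return addresses


# Shortened sample in the same layout as the log in this report.
SAMPLE = """\
Stack range sanity check OK, backtrace follows:
0x819a960
0x8430d01
New value of fp=(nil) failed sanity check, terminating stack trace!
"""
print("\n".join(extract_backtrace(SAMPLE)))
```

Writing the returned addresses one per line to a file gives something in the shape of the /tmp/stack input used above.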

(1 mgm node, 4 ndb nodes, 4 sql nodes) 

How to repeat:
Run multiple queries like the one above on a MySQL Cluster.
[3 Nov 2006 22:18] Jonas Oreland
Hi,

can you upload your schema and some testdata?

/Jonas
[6 Nov 2006 10:03] Jonas Oreland
select in mysqltest-format

Attachment: pp.sql (text/x-sql), 775 bytes.

[6 Nov 2006 10:06] Jonas Oreland
Hi,

I imported the dump and ran the following:
for i in `seq 20`; do eval "mysqltest test < pp.sql &"; done
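
The shell loop above starts 20 background clients and discards their exit status. A rough Python equivalent that also collects the exit codes might look like the sketch below. The helper name run_concurrently is hypothetical, and the mysqltest invocation in the usage note simply mirrors the command line above.

```python
import subprocess


def run_concurrently(command, input_path, clients=20):
    """Start `clients` copies of `command`, each reading `input_path` on
    stdin, and return their exit codes once all have finished."""
    procs = []
    handles = []
    try:
        for _ in range(clients):
            handle = open(input_path, "rb")
            handles.append(handle)
            procs.append(subprocess.Popen(command, stdin=handle))
        # All clients are running at this point; wait for each one to exit.
        return [proc.wait() for proc in procs]
    finally:
        for handle in handles:
            handle.close()
```

Assuming mysqltest is on the PATH, `run_concurrently(["mysqltest", "test"], "pp.sql")` would correspond to the loop above; a non-zero entry in the returned list shows which client failed.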

I ran this on a 2-node cluster with only 1 mysqld,
but the trace indicates a "local" bug. I also get very high CPU load from this.

However, I get no error...

Can you upload your config.ini + my.cnf,
  so that I can see if I have some setting "wrong"?

How frequently do you get the error?

/Jonas
[6 Nov 2006 10:22] Kris Buytaert
The crash occurs frequently, say once every 2-3 minutes under high load.
MySQL then restarts.
[6 Nov 2006 10:37] Jonas Oreland
thx...still no luck

Something puzzles me a bit:
* ndb_index_stat_update_freq 
  this is not present in 5.0

* ndb_cache_check_time
  This is only used together with the query cache, right?
  Do you have any query cache settings as well?

/Jonas
[6 Nov 2006 10:37] Jonas Oreland
also,

did this problem start to occur suddenly?

What did you change?

/Jonas
[6 Nov 2006 10:45] Kris Buytaert
Correct my.cnf file

Attachment: my1.cnf (application/octet-stream, text), 475 bytes.

[6 Nov 2006 10:46] Kris Buytaert
Buffer configs are in the just-uploaded my.cnf file, which is the one for the mysql nodes.

We didn't change anything, except for adding a couple of indexes before the crashes started occurring.
[13 Nov 2006 9:24] Jonas Oreland
Hi,

1) Could you consider trying a debug build and giving us a core file?
2) You say that you added a few indexes, and then the problem started occurring.
   The problem is likely related to this (i.e. some metadata inconsistency).

   How did you handle your mysqld's when adding the indexes
     (e.g. single user mode, stop/start, flush tables etc.)?

3) Did you also drop any indexes?

4) As a workaround, it might be possible to
   1) stop mysqld
   2) remove all .frm + .ndb files from mysqld's data directory (for ndb tables)
   3) start mysqld, run show tables or similar, which will make it recreate
      the .frm + .ndb files

/Jonas
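
Step 2 of the workaround above can be previewed before anything is removed. The sketch below only lists candidate files and deletes nothing. It assumes a table counts as an ndb table when a <name>.ndb file sits in a database directory under the data directory; the function name ndb_table_files is hypothetical.

```python
from pathlib import Path


def ndb_table_files(datadir):
    """List the .frm and .ndb files of tables that have an .ndb file.

    ASSUMPTION: a table is treated as an ndb table when `<name>.ndb` exists
    in a database directory directly under `datadir`.
    Nothing is deleted; the caller decides what to do with the list.
    """
    found = []
    for ndb_file in sorted(Path(datadir).glob("*/*.ndb")):
        frm_file = ndb_file.with_suffix(".frm")
        if frm_file.exists():
            found.append(frm_file)
        found.append(ndb_file)
    return found
```

Reviewing the returned list with mysqld stopped, and before removing anything, keeps .frm files of non-ndb tables out of the cleanup.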
[13 Nov 2006 11:25] Kris Buytaert
We've already upgraded the platform to 5.1.12. Not sure if I can downgrade it to try to reproduce the crashes.