Bug #29018 | mysqld crash while backing up ndb cluster tables | ||
---|---|---|---|
Submitted: | 11 Jun 2007 14:43 | Modified: | 20 Sep 2007 5:07 |
Reporter: | Steven Cain | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Server: Backup | Severity: | S2 (Serious) |
Version: | 5.0.45 | OS: | Linux (Fedora Core 6) |
Assigned to: | | CPU Architecture: | Any |
[11 Jun 2007 14:43]
Steven Cain
[20 Jun 2007 20:37]
Hartmut Holzgraefe
Can you provide us with the mysqld.sym file from your installation or tell us the exact MySQL-server-5.0.41-0 version you are using (either full platform specs or just the exact download URL) so that we can resolve the stack trace?
[20 Jun 2007 21:08]
Steven Cain
Here's the link to the server's rpm: http://dev.mysql.com/get/Downloads/MySQL-5.0/MySQL-server-5.0.41-0.i386.rpm/from/ftp://www... I downloaded the rpms from the 5.0 community server "Linux x86 generic RPM (statically linked against glibc 2.2.5) downloads"
[20 Jun 2007 21:15]
Steven Cain
Uploaded bug-data-29018.zip to ftp.mysql.com/pub/mysql/upload/ with the mysqld.sym that was requested.
[16 Jul 2007 19:10]
Steven Cain
Has any progress been made? I haven't seen any updates in 3 weeks.
[19 Jul 2007 14:12]
Steven Cain
I have upgraded to 5.0.45 and the problem is still happening. I have uploaded bug-data-29018-2.zip that contains the new mysqld.sym. Here is the new stack trace:

070718 21:00:02 - mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=402653184
read_buffer_size=2093056
max_used_connections=42
max_connections=1000
threads_connected=28
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 290904 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0xa1cba50
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xbf77e278, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x80a819e
0x8367d08
0x82a92d7
0x815cec1
0x817c7d2
0x817c158
0x8181afc
0x80ec4c5
0x80edb40
0x80e9484
0x80bb238
0x80c180d
0x80b9436
0x80b8cc3
0x80b8195
0x83654bc
0x838f99a
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/using-stack-trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0xa4ec5f8 = show table status like 'CounterDefinitions'
thd->thread_id=68484
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0
070718 21:00:02 mysqld restarted
070718 21:00:02 [Warning] Asked for 196608 thread stack, but got 126976
[30 Jul 2007 21:28]
Steven Cain
Since installing the dynamic glibc version I can get the stack dump:

# resolve_stack_dump -s /usr/lib/mysql/mysqld.sym -n temp.txt
0x819aae9 handle_segfault + 521
0x83b07d3 _ZN3Ndb22readAutoIncrementValueEPKN13NdbDictionary5TableERy + 35
0x826412e _ZN13ha_ndbcluster4infoEj + 222
0x82844af _Z24get_schema_tables_recordP3THDP13st_table_listP8st_tablebPKcS6_ + 591
0x8283cc1 _Z14get_all_tablesP3THDP13st_table_listP4Item + 1585
0x82897d1 _Z24get_schema_tables_resultP4JOIN23enum_schema_table_state + 385
0x81e4fbd _ZN4JOIN4execEv + 6621
0x81e530d _Z12mysql_selectP3THDPPP4ItemP13st_table_listjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_sel + 493
0x81e07a9 _Z13handle_selectP3THDP6st_lexP13select_resultm + 377
0x81b1749 _Z21mysql_execute_commandP3THD + 713
0x81b8ae5 _Z11mysql_parseP3THDPKcjPS2_ + 261
0x81afde5 _Z16dispatch_command19enum_server_commandP3THDPcj + 1365
0x81af83e _Z10do_commandP3THD + 158
0x81aec66 handle_one_connection + 726
0x45f3db (?)
0x3a426e (?)
[3 Aug 2007 16:43]
Steven Cain
I added some debugging, built a version of mysqld, and found where the crash is occurring: ndb/src/ndbapi/Ndb.cpp. The 'table' pointer is zero after the call to getImpl, and it is not validated before its use in get_local_table_info.

int Ndb::readAutoIncrementValue(const NdbDictionary::Table * aTable, Uint64 & tupleId)
{
  print_time(stderr);
  fprintf(stderr,"Ndb::readAutoIncrementValue 01\n");
  const NdbTableImpl* table = & NdbTableImpl::getImpl(*aTable);
  const BaseString& internal_tabname = table->m_internalName;
  Ndb_local_table_info *info= theDictionary->get_local_table_info(internal_tabname, false);
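To illustrate the kind of validation that is missing, here is a minimal, self-contained sketch. It does not use the real NDB API: `TableImpl`, `Table`, `Dictionary`, and `readAutoIncrementValueChecked` are hypothetical stand-ins invented for this example, and the idea that the implementation pointer can be null on a stale handle is an assumption. It only shows the pattern of checking each pointer before it is dereferenced, and it would only cover the null case, not a pointer that is non-null but invalid.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-ins for the NDB API types; these are NOT the real classes.
struct TableImpl {
    std::string internalName;
};

struct Table {
    TableImpl* impl;  // assumed to be null when the table handle is stale
};

struct Dictionary {
    std::map<std::string, std::uint64_t> nextAutoInc;

    // Returns a pointer to the cached counter, or nullptr if the name is unknown.
    const std::uint64_t* lookup(const std::string& name) const {
        auto it = nextAutoInc.find(name);
        return it == nextAutoInc.end() ? nullptr : &it->second;
    }
};

// Returns 0 on success and -1 on invalid input, instead of dereferencing blindly.
int readAutoIncrementValueChecked(const Dictionary& dict,
                                  const Table* aTable,
                                  std::uint64_t& tupleId) {
    if (aTable == nullptr || aTable->impl == nullptr) {
        return -1;  // reject a missing table handle before touching it
    }
    const std::uint64_t* value = dict.lookup(aTable->impl->internalName);
    if (value == nullptr) {
        return -1;  // table is not present in the local dictionary cache
    }
    tupleId = *value;
    return 0;
}
```

With this shape, a caller passing a null or stale table handle gets an error code back rather than a segfault, and `tupleId` is left untouched on failure.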
[13 Aug 2007 22:00]
Steven Cain
The crash that I am experiencing with mysqld in readAutoIncrementValue is not always due to a null pointer; sometimes the pointer is not null but still causes the crash. I have seen crashes where the pointer is null, 2, or a seemingly normal value. I built a debug version of mysqld and turned tracing on to find out where readAutoIncrementValue was being called from. It is called from get_schema_tables_record, immediately after open_normal_and_derived_tables. I suspect there is something wrong with the cached tables. I now perform a flush tables before my backup script runs, and I have gone four days without a crash.
[20 Aug 2007 5:07]
Stewart Smith
I think this is BUG#26793, which is fixed in the 5.0-ndb tree. Please retest with 5.0-ndb tree.
[20 Sep 2007 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".