| Bug #11384 | drop database causes mysqld to core | | |
|---|---|---|---|
| Submitted: | 16 Jun 2005 14:11 | Modified: | 22 Jul 2005 12:32 |
| Reporter: | Tomas Ulin | | |
| Status: | Closed | | |
| Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S3 (Non-critical) |
| Version: | 5.1-wl2325 | OS: | Linux (linux) |
| Assigned to: | Stewart Smith | CPU Architecture: | Any |
[16 Jun 2005 19:44]
Tomas Ulin
I also get it if I run mysql-test-run with --skip-slave-binlog. That simplifies the debug printout on the slave.
[21 Jun 2005 2:33]
Stewart Smith
Verified with 5.1-wl2325 bk tree. Slave mysqld is what crashes.
[22 Jun 2005 5:34]
Stewart Smith
There is a bug in SUMA that crashes ndbd (and, seemingly, mysqld) when we try to unsubscribe from an already removed event. I have a patch that prevents both the ndbd (and hence cluster) crash and the mysqld crash. The bank test still doesn't seem to go too well, so there are probably other bugs (there are also still valgrind warnings). Since I am no expert on SUMA, I want to discuss the patch before checking anything in.
[27 Jun 2005 7:29]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/26431
[27 Jun 2005 12:44]
Stewart Smith
Tomas believes that ignoring the error only hides the symptom of a larger problem: if we do get this signal, it is because something is seriously wrong somewhere else (quite possibly corrupted in bad ways), and we really should not continue. Currently looking for the API source of the problem.
[6 Jul 2005 2:26]
Stewart Smith
Can no longer repeat with the latest BK tree plus "bk commit - 5.1 tree (stewart:1.1984) WL#2325", which seemed to fix things up on ppc.
[14 Jul 2005 7:35]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/27048
[22 Jul 2005 12:07]
Stewart Smith
Pushed to 4.0.26, 5.0.11. Can only reproduce with cluster replication though (5.1).
[22 Jul 2005 12:32]
Stewart Smith
Clarification: only the second patch was pushed. First one deemed inadequate.

Description:
/home/tomas/mysql-5.1-wl2325/client/.libs/lt-mysqltest: At line 27: query 'DROP DATABASE BANK' failed: 2013: Lost connection to MySQL server during query

core says:

    (gdb) where
    #0  0x401d8cb1 in kill () from /lib/libc.so.6
    #1  0x4004c639 in pthread_kill () from /lib/libpthread.so.0
    #2  0x0831674e in write_core (sig=11) at stacktrace.c:220
    #3  0x081b9b4f in handle_segfault (sig=11) at mysqld.cc:2005
    #4  0x4004ed69 in __pthread_clock_settime () from /lib/libpthread.so.0
    #5  <signal handler called>
    #6  0x08204833 in remove_table_from_cache(THD*, char const*, char const*, bool) (thd=0x8771048, db=0x87803a4 "BANK", table_name=0x87803a9 "SYSTEM_VALUES", return_if_owned_by_thd=false) at sql_base.cc:4078
    #7  0x081b43e3 in lock_table_name(THD*, st_table_list*) (thd=0x8771048, table_list=0x8780250) at lock.cc:583
    #8  0x081b4610 in lock_table_names(THD*, st_table_list*) (thd=0x8771048, table_list=0x8780250) at lock.cc:658
    #9  0x082cc1c9 in mysql_rm_table_part2(THD*, st_table_list*, bool, bool, bool, bool) (thd=0x8771048, tables=0x8780250, if_exists=true, drop_temporary=false, drop_view=true, dont_log_query=true) at sql_table.cc:219
    #10 0x082cc0a2 in mysql_rm_table_part2_with_lock(THD*, st_table_list*, bool, bool, bool) (thd=0x8771048, tables=0x8780250, if_exists=true, drop_temporary=false, dont_log_query=true) at sql_table.cc:165
    #11 0x082cafab in mysql_rm_known_files (thd=0x8771048, dirp=0x88045d8, db=0x8780248 "BANK", org_path=0x404f5104 "./BANK/", level=0) at sql_db.cc:826
    #12 0x082ca624 in mysql_rm_db(THD*, char*, bool, bool) (thd=0x8771048, db=0x8780248 "BANK", if_exists=false, silent=false) at sql_db.cc:633
    #13 0x081d5bdf in mysql_execute_command(THD*) (thd=0x8771048) at sql_parse.cc:3600
    #14 0x081db039 in mysql_parse(THD*, char*, unsigned) (thd=0x8771048, inBuf=0x8780208 "DROP DATABASE BANK", length=18) at sql_parse.cc:5377
    #15 0x081d093e in dispatch_command(enum_server_command, THD*, char*, unsigned) (command=COM_QUERY, thd=0x8771048, packet=0x877c1d1 "DROP DATABASE BANK", packet_length=19) at sql_parse.cc:1683
    #16 0x081d0156 in do_command(THD*) (thd=0x8771048) at sql_parse.cc:1486
    #17 0x081cf272 in handle_one_connection (arg=0x8771048) at sql_parse.cc:1135
    #18 0x40049dc7 in pthread_detach () from /lib/libpthread.so.0
    #19 0x40280aaa in clone () from /lib/libc.so.6

And the problem is uninitialized data in the in_use variable of the table struct:

    #6  0x08204833 in remove_table_from_cache(THD*, char const*, char const*, bool) (thd=0x8771048, db=0x87803a4 "BANK", table_name=0x87803a9 "SYSTEM_VALUES", return_if_owned_by_thd=false) at sql_base.cc:4078
    4078            if (thd_table->db_stat)     // If table is open
    (gdb) l
    4073          */
    4074          for (TABLE *thd_table= in_use->open_tables;
    4075               thd_table ;
    4076               thd_table= thd_table->next)
    4077          {
    4078            if (thd_table->db_stat)     // If table is open
    4079              mysql_lock_abort_for_thread(thd, thd_table);
    4080          }
    4081        }
    4082        else
    (gdb) p in_use->open_tables
    $1 = (TABLE *) 0x8f8f8f8f

How to repeat:
You have to work in a source tree and with ndb/test compiled.

    killall -9 mysqld ndbd ndb_mgmd; ./mysql-test-run --fast --ndb-extra-test --do-test=rpl_ndb_bank --start-and-exit
    runtest < t/rpl_ndb_bank.test
    runtest < t/rpl_ndb_bank.test

Happens on initialization on the second run.

    $ alias runtest
    alias runtest='MYSQL_DUMP='\''../client/mysqldump --no-defaults -uroot --socket=var/tmp/master.sock'\'' MYSQL_DUMP_SLAVE='\''../client/mysqldump --no-defaults -uroot --socket=var/tmp/slave.sock'\'' NDB_TOOLS_DIR=../storage/ndb/tools NDB_TOOLS_OUTPUT=`pwd`/var/log/ndb_tools.log NDB_BACKUP_DIR=`pwd`/var/ndbcluster-9350 NDBCLUSTER_PORT=9350 NDBCLUSTER_PORT_SLAVE=9358 MASTER_MYPORT=9306 MASTER_MYPORT1=9307 SLAVE_MYPORT=9308 NDB_EXTRA_TEST=1 NDB_STATUS_OK=1 NDB_MGM=../storage/ndb/src/mgmclient/ndb_mgm ../client/mysqltest -D test -u root --socket=var/tmp/master.sock'

NOTE: there are some changes you can do to make the testcase run faster but still get it...
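
For readers unfamiliar with the code path in the Description, here is a rough sketch of the shape of the loop at sql_base.cc:4074-4080 and of why a bad in_use pointer is fatal there. It is not the patch that was pushed and not the server's real code. The Table and Thd structs are hypothetical, stripped-down stand-ins that model only the fields named in the backtrace, and count_open_tables_of_owner and release_entry are names invented for this illustration. The assumption being illustrated is that a cache entry's in_use must either point at a live thread object or be reset before that object goes away.

```cpp
#include <cassert>
#include <cstddef>

struct Thd;  // hypothetical stand-in for the server's THD

// Hypothetical stand-in for TABLE: only the fields seen in the backtrace.
struct Table {
    unsigned db_stat = 0;    // non-zero while the table is open
    Table *next = nullptr;   // next table opened by the same thread
    Thd *in_use = nullptr;   // thread currently using this cache entry
};

struct Thd {
    Table *open_tables = nullptr;  // head of this thread's open-table list
};

// Walks the owner's open-table list the same way the crashing loop does and
// counts the open tables. If in_use pointed at invalid memory, the read of
// in_use->open_tables would be the faulting access; the null check only
// helps when the owner reset the pointer before going away.
std::size_t count_open_tables_of_owner(const Table &entry) {
    if (entry.in_use == nullptr) return 0;  // entry is not owned by any thread
    std::size_t n = 0;
    for (const Table *t = entry.in_use->open_tables; t != nullptr; t = t->next) {
        if (t->db_stat) ++n;  // "If table is open"
    }
    return n;
}

// Assumed invariant for this sketch: a thread detaches itself from every
// cache entry before its Thd object is destroyed.
void release_entry(Table &entry) { entry.in_use = nullptr; }
```

With a live owner that has one open and one closed table, the walk reports one open table; after release_entry the entry is treated as unowned and the list is never touched. A pointer value such as the 0x8f8f8f8f shown by gdb passes any null check, so the invariant has to be maintained by whoever owns the entry, not by the loop.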