Bug #28123 | rpl_ndb_mix_innodb.test causes slave to core on sol10-sparc-a | ||
---|---|---|---|
Submitted: | 26 Apr 2007 14:26 | Modified: | 10 Jul 2007 7:17 |
Reporter: | Jonathan Miller | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Cluster: Replication | Severity: | S1 (Critical) |
Version: | 5.1-telco | OS: | Solaris (sol10-sparc-a) |
Assigned to: | Assigned Account | CPU Architecture: | Any |
[26 Apr 2007 14:26]
Jonathan Miller
[2 Jul 2007 6:48]
Guangbao Ni
    Loaded symbols for /usr/platform/SUNW,Sun-Fire-V210/lib/libc_psr.so.1
    #0 0xff11fc54 in _lwp_kill () from /usr/lib/libc.so.1
    (gdb) bt
    #0 0xff11fc54 in _lwp_kill () from /usr/lib/libc.so.1
    #1 0x00339730 in write_core (sig=11) at stacktrace.c:229
    #2 0x001b69c0 in handle_segfault (sig=11) at mysqld.cc:2239
    #3 0xff355bb4 in __sighndlr () from /usr/lib/libthread.so.1
    #4 0xff34f80c in call_user_handler () from /usr/lib/libthread.so.1
    #5 <signal handler called>
    #6 0xff0b455c in strlen () from /usr/lib/libc.so.1
    #7 0xff107058 in _doprnt () from /usr/lib/libc.so.1
    #8 0xff108ab4 in fprintf () from /usr/lib/libc.so.1
    #9 0x002b2190 in ndbcluster_commit (hton=0x0, thd=0xffffffff, all=false) at ha_ndbcluster.cc:5152
    #10 0x0029bff8 in ha_commit_one_phase(THD*, bool) (thd=0xb969e0, all=false) at handler.cc:789
    #11 0x0029c5a4 in ha_commit_trans(THD*, bool) (thd=0xb969e0, all=false) at handler.cc:755
    #12 0x0029c968 in ha_autocommit_or_rollback(THD*, int) (thd=0xb969e0, error=0) at handler.cc:899
    #13 0x00273594 in Rows_log_event::do_update_pos(st_relay_log_info*) (this=0xc93fc8, rli=0xb1d118) at log_event.cc:6176
    #14 0x0032508c in exec_relay_log_event (thd=0xb969e0, rli=0xb1d118) at log_event.h:828
    #15 0x00325aa4 in handle_slave_sql (arg=0xb1d118) at slave.cc:2407
[3 Jul 2007 7:03]
Guangbao Ni
In the slave.err file, there are the following error messages:

    070629 15:34:18 [ERROR] Slave: Error in Update_rows event: commit of row events failed, Error_code: 1
    070629 15:34:18 [ERROR] Slave: It was not possible to update the positions of the relay log information: the slave may be in an inconsistent state. Stopped in /export/home/ngb/telco_bug28346/mysql-test/var/log/slave-relay-bin.000003 position 194946, Error_code: 1105
    070629 15:34:18 [ERROR] Slave (additional info): Got error 4350 'Transaction already aborted' from NDBCLUSTER Error_code: 1296
    070629 15:34:18 [Warning] Slave: Got error 4350 'Transaction already aborted' from NDB Error_code: 1296
    070629 15:34:18 [Warning] Slave: Got error 4350 'Transaction already aborted' from NDBCLUSTER Error_code: 1296
    070629 15:34:18 [Warning] Slave: Got error 4350 'Transaction already aborted' from NDBCLUSTER Error_code: 1296
    070629 15:34:18 [Warning] Slave: Got error 4350 during COMMIT Error_code: 1180
    070629 15:34:18 [Warning] Slave: Unknown error Error_code: 1105
    070629 15:34:18 [Warning] Slave: Unknown error Error_code: 1105
    070629 15:34:18 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'master-bin.000001' position 194800
    NDB: Found 3 NdbTransaction's that have not been released
    NDB: Found 1 NdbReceiver that has not been released
    NDB: Found 3 NdbTransaction's that have not been released
    NDB: Found 1 NdbReceiver that has not been released
    070629 15:34:49 [Note] Slave: received end packet from server, apparent master shutdown:
    070629 15:34:49 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'master-bin.000001' position 197628
    070629 15:34:49 [ERROR] Slave I/O thread: error reconnecting to master 'root@127.0.0.1:9306': Error: 'Can't connect to MySQL server on '127.0.0.1' (146)' errno: 2003 retry-time: 1 retries: 10
    070629 15:34:49 [Note] /export/home/ngb/telco_bug28346/sql/mysqld: Normal shutdown
    070629 15:34:49 [Note] Event Scheduler: Purging the queue. 0 events
    070629 15:34:49 [Note] Slave I/O thread killed during or after a reconnect done to recover from failed read
    070629 15:34:49 [Note] Slave I/O thread exiting, read up to log 'master-bin.000001', position 197628
    070629 15:34:49 [Note] Stopping Cluster Binlog
    070629 15:34:49 [Warning] Forcing shutdown of 1 plugins
    070629 15:34:49 [Note] Plugin 'ndbcluster' will be forced to shutdown
    070629 15:34:49 [Note] Stopping Cluster Utility thread
    NDB: table share ./mysql/ndb_apply_status with use_count 1 not freed
    NDB: table share ./tpcb/teller with use_count 1 not freed
    NDB: table share ./tpcb/branch with use_count 1 not freed
    NDB: table share ./tpcb/account with use_count 1 not freed
    NDB: table share ./tpcb/history with use_count 1 not freed
    NDB: table share ./mysql/ndb_schema with use_count 1 not freed
    070629 15:34:51 [Note] /export/home/ngb/telco_bug28346/sql/mysqld: Shutdown complete
[9 Jul 2007 1:32]
Guangbao Ni
For rpl_ndb_mix_innodb: when the master and the slave both run on Solaris, the master table is InnoDB, and the slave table is NDB, Field_blob::pack() and Field_blob::unpack() both call Field_blob::get_length():

    uint32 Field_blob::get_length(const char *pos)
    {
      switch (packlength) {
      case 1:
        return (uint32) (uchar) pos[0];
      case 2:
        {
          uint16 tmp;
    #ifdef WORDS_BIGENDIAN
          if (table->s->db_low_byte_first)
            tmp=sint2korr(pos);   // master is INNODB, will run this
          else
    #endif
            shortget(tmp,pos);    // slave is NDB, will run this
          return (uint32) tmp;
        }
      case 3:
        return (uint32) uint3korr(pos);
      case 4:
        {
          uint32 tmp;
    #ifdef WORDS_BIGENDIAN
          if (table->s->db_low_byte_first)
            tmp=uint4korr(pos);
          else
    #endif
            longget(tmp,pos);
          return (uint32) tmp;
        }
      }
      return 0;                   // Impossible
    }

From the debug printout, get_length() returns 36 on the master and 603979776 on the slave. Since 603979776 is 0x24000000, which is 36 (0x24) with its four bytes reversed, the two sides are reading the length bytes in different byte orders.
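The two values can be reproduced outside the server with a minimal, standalone sketch. It is not MySQL code: read_le32(), read_be32(), and show_length_mismatch() are hypothetical helpers written for this illustration. It assumes that the low-byte-first branch reads the four length bytes least significant byte first, and that the native read on a big-endian SPARC host is equivalent to reading them most significant byte first.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: read 4 bytes, least significant byte first
   (assumed to match the low-byte-first branch). */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: read 4 bytes, most significant byte first
   (assumed to match a native read on a big-endian host). */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) |
           (uint32_t)p[3];
}

/* Prints both interpretations of a length of 36 stored low byte first. */
void show_length_mismatch(void)
{
    const unsigned char stored[4] = {0x24, 0x00, 0x00, 0x00};
    uint32_t le = read_le32(stored);
    uint32_t be = read_be32(stored);

    /* For this buffer the big-endian reading is the value shifted up 3 bytes. */
    assert(be == (le << 24));

    printf("low-byte-first read: %lu\n", (unsigned long)le);
    printf("big-endian read:     %lu\n", (unsigned long)be);
}
```

Calling show_length_mismatch() prints 36 for the low-byte-first read and 603979776 for the big-endian read, the same pair of values seen on the master and the slave.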
[9 Jul 2007 9:16]
Guangbao Ni
I simplified the test case as follows:

    --disable_query_log
    --source include/have_ndb.inc
    --source include/have_innodb.inc
    --source include/have_binlog_format_mixed.inc
    --source include/master-slave.inc
    --enable_query_log
    let $off_set = 9;
    let $rpl_format = 'MIX';

    # Create database/tables and stored procedures
    connection master;
    #--source include/tpcb.inc
    CREATE DATABASE tpcb;
    CREATE TABLE tpcb.history (id MEDIUMINT NOT NULL AUTO_INCREMENT, tdate DATETIME, uuidf LONGBLOB, filler CHAR(80),PRIMARY KEY (id));

    # Switch tables on slave to use NDB
    --sync_slave_with_master
    USE tpcb;
    ALTER TABLE history ENGINE NDB;

    --echo # Load DB tpcb and run some transactions
    connection master;
    --disable_query_log
    #CALL tpcb.load();
    INSERT INTO tpcb.history VALUES(1, NOW(), UUID(),'completed trans');
    --sync_slave_with_master
[10 Jul 2007 7:17]
Guangbao Ni
Duplicate of Bug #29549.