| Bug #31646 | rpl_ndb_extraColMaster with row-based replication on sparc: slave crash | ||
|---|---|---|---|
| Submitted: | 16 Oct 2007 18:34 | Modified: | 5 Mar 2008 14:47 |
| Reporter: | Timothy Smith | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Cluster: Replication | Severity: | S2 (Serious) |
| Version: | 5.2 | OS: | Solaris (sol10-sparc-a) |
| Assigned to: | Mats Kindahl | CPU Architecture: | Any |
| Tags: | pbfail | ||
[23 Oct 2007 8:50]
Magnus Blåudd
$>dbx ../sql/mysqld var/slave-data/core (dbx) threads <snip> o t@23 b l@23 handle_slave_sql() signal SIGSEGV in _lwp_kill() (dbx) thread t@23 t@23 (l@23) stopped in _lwp_kill at 0xfeebd5ec 0xfeebd5ec: _lwp_kill+0x0008: bcc,a,pt %icc,_lwp_kill+0x18 ! 0xfeebd5fc (dbx) where current thread: t@23 =>[1] _lwp_kill(0x0, 0xb, 0xfeee8bc0, 0xfed45000, 0xfeee83a8, 0xb), at 0xfeebd5ec [2] write_core(0xb, 0x50, 0xb, 0x6a2c, 0x1a8b, 0x7b1af8), at 0x315204 [3] handle_segfault(0xb, 0x97, 0x5fff8, 0x57c774, 0x7b1af8, 0xd5860), at 0x1f3be4 [4] __sighndlr(0xb, 0x0, 0xfeadf7e0, 0x1f39e0, 0x0, 0x0), at 0xfeebc52c ---- called from signal handler with signal 11 (SIGSEGV) ------ [5] Field_blob::get_length(0xfeadfb60, 0x7203350a, 0x1, 0x7a1400, 0x647da4, 0x4), at 0x1d9bd4 [6] table_def::calc_field_size(0x9664a0, 0x16, 0x7203350a, 0x966478, 0xfeadfb60, 0x4), at 0x3104f8 [7] unpack_row(0xc49218, 0x10c3080, 0x3, 0xc3f664, 0xfd8f4c, 0xfd8fb0), at 0x2a9ec4 [8] Rows_log_event::write_row(0xfd8f10, 0xc991f8, 0x1, 0x1, 0x0, 0xf72c88), at 0x2a738c [9] Write_rows_log_event::do_exec_row(0xfd8f10, 0xc991f8, 0xf69848, 0xf69848, 0xf6a5c8, 0x1), at 0x2a76b0 [10] Rows_log_event::do_apply_event(0xfd8f10, 0xc991f8, 0x5895f7, 0xf69848, 0x10c3080, 0x0), at 0x2a600c [11] exec_relay_log_event(0xf69848, 0xc991f8, 0x0, 0x0, 0x1400, 0x155c), at 0x308e28 [12] handle_slave_sql(0x58f9e2, 0x7b1d08, 0x1, 0xc9a080, 0xc991f8, 0x1400), at 0x309cfc (dbx)
[24 Oct 2007 11:59]
Magnus Blåudd
$> cat rpl_ndb_bug31646.test
-- source include/have_ndb.inc
-- source include/ndb_master-slave.inc
-- source include/have_binlog_format_row.inc
connection master;
CREATE TABLE t1 (f1 INT, f2 INT, f3 INT PRIMARY KEY, f4 CHAR(20),
/* extra columns only available on master*/
f5 FLOAT DEFAULT '2.00',
f6 CHAR(4) DEFAULT 'TEST',
f7 INT DEFAULT '0',
f8 TEXT,
f9 LONGBLOB,
f10 BIT(63),
f11 VARBINARY(64))ENGINE=NDB;
sync_slave_with_master;
# Drop columns f5-f11 from the table on _slave_
alter table t1 drop f5, drop f6, drop f7, drop f8, drop f9, drop f10, drop f11;
connection master;
INSERT INTO t1 VALUES (1, 1, 1, 'second', 2.0, 'kaks', 2,
'got stolen from the paradise',
'very fat blob', b'01010101010101',
0x123456);
sync_slave_with_master; # <= Crash occurs on slave mysqld
$> ./mtr rpl_ndb_bug31646 --mysqld=--binlog-format=row
<snip>
rpl_ndb.rpl_ndb_bug31646 [ fail ]
mysqltest: At line 26: failed in 'select master_pos_wait('master-bin.000001', 850)': 2013: Lost connection to MySQL server during query
analyzing the core gives same backtrace as earlier
[24 Oct 2007 12:00]
Magnus Blåudd
mysqltest test file used to repeat bug
Attachment: rpl_ndb_bug31646.test (application/octet-stream, text), 957 bytes.
[6 Dec 2007 15:28]
Jon Stephens
Please verify this bug's status and set to Documenting or Closed if it really has been fixed and pushed. Thanks!
[5 Mar 2008 14:47]
Jon Stephens
Documented in the 5.1.23 and 6.0.4 changelogs as follows:
Cluster replication with extra VARCHAR columns on the
master caused the slave to fail.

Description: mysqltest: In included file "./extra/rpl_tests/rpl_extraMaster_Col.test": At line 407: failed in 'select master_pos_wait('master-bin.000001', 10449)': 2006: MySQL server has gone away The result from queries just before the failure was: < snip > /* f27 CHAR(255), */ 'three', /* f28 DECIMAL(30,7), */ 3.333, /* f29 DOUBLE, */ 3.333333333333333333333333333, /* f30 ENUM ('a','b','c')*/ 'c', /* f31 FLOAT, */ 3.0, /* f32 NUMERIC(17,9), */ 3.3333, /* f33 SET ('a','b','c'),*/ 'c', /*f34 VARBINARY(1025),*/ '3333 minus 3', /*f35 VARCHAR(257),*/ NULL ); Warnings: Warning 1264 Out of range value for column 'f5' at row 1 Warning 1264 Out of range value for column 'f24' at row 1 Warning 1264 Out of range value for column 'f5' at row 2 Warning 1264 Out of range value for column 'f24' at row 2 Warning 1264 Out of range value for column 'f24' at row 3 ** Sync slave with master ** ** Do selects from tables ** The following appears in the slave.err log file: ... 071016 15:06:05 [Note] NDB Binlog: logging ./test/t3 071016 15:06:07 [Note] NDB Binlog: CREATE TABLE Event: REPL$test/t4 071016 15:06:07 [Note] NDB Binlog: logging ./test/t4 071016 15:06:12 [Note] NDB Binlog: CREATE TABLE Event: REPL$test/t31 071016 15:06:12 [Note] NDB Binlog: logging ./test/t31 071016 15:06:14 [Note] NDB Binlog: RENAME Event: REPL$test/t2 071016 15:06:14 [Note] NDB Binlog: logging ./test/t2 071016 15:06:16 [Note] NDB Binlog: drop table ./test/#sql2-109e-2. 071016 15:06:18 [Note] NDB Binlog: RENAME Event: REPL$test/t3 071016 15:06:18 [Note] NDB Binlog: logging ./test/t3 071016 15:06:20 [Note] NDB Binlog: drop table ./test/#sql2-109e-2. 071016 15:06:21 [Note] NDB Binlog: RENAME Event: REPL$test/t4 071016 15:06:21 [Note] NDB Binlog: logging ./test/t4 071016 15:06:23 [Note] NDB Binlog: drop table ./test/#sql2-109e-2. 071016 15:06:25 [Note] NDB Binlog: RENAME Event: REPL$test/t31 071016 15:06:25 [Note] NDB Binlog: logging ./test/t31 071016 15:06:29 [Note] NDB Binlog: drop table ./test/#sql2-109e-2. 071016 15:06:29 - mysqld got signal 11; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=1048576 read_buffer_size=131072 max_used_connections=2 max_threads=151 threads_connected=2 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 59859 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Writing a core file How to repeat: Pushbuild shows this failure occasionally since mid September 2007, in 5.2 trees on sol10-sparc-a platform only: https://intranet.mysql.com/secure/pushbuild/xref.pl?startdate=&enddate=&dir=&plat=&testtyp...