Bug #20742 Assertion in drop of ndb binlog events after node restart
Submitted: 28 Jun 2006 2:02 Modified: 4 Jul 2006 23:14
Reporter: Nikolay Grishakin
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S1 (Critical)
Version: 5.1 OS: Linux (Linux)
Assigned to: Tomas Ulin CPU Architecture: Any

[28 Jun 2006 2:02] Nikolay Grishakin
Description:
Running the ndb_mgm utility test failed with the following error: "mysqltest: At line 125: query 'DROP LOGFILE GROUP lg ENGINE = NDB' failed: 2013: Lost connection to MySQL server during query". mysqld also left a core dump in the ./var/master-data/ directory.
Here is the backtrace:

[ndbdev@ndb15 mysql-test]$ gdb ../sql/mysqld ./var/master-data/core.18210
GNU gdb Red Hat Linux (6.3.0.0-1.84rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/libthread_db.so.1".

Core was generated by `/home/ndbdev/ngrishakin/mysql-5.1/sql/mysqld --no-defaults --console --basedir='.
Program terminated with signal 6, Aborted.
Reading symbols from /usr/lib64/libz.so.1...done.
Loaded symbols for /usr/lib64/libz.so.1
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libpthread.so.0...done.
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libcrypt.so.1...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /lib64/libnsl.so.1...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...done.
Loaded symbols for /lib64/libnss_files.so.2
#0  0x00000033b8c096a7 in pthread_kill () from /lib64/libpthread.so.0

(gdb) bt
#0  0x00000033b8c096a7 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000000819930 in write_core (sig=6) at stacktrace.c:220
#2  0x00000000006444a4 in handle_segfault (sig=6) at mysqld.cc:2175
#3  <signal handler called>
#4  0x00000033b832f280 in raise () from /lib64/libc.so.6
#5  0x00000033b8330750 in abort () from /lib64/libc.so.6
#6  0x00000033b83282e6 in __assert_fail () from /lib64/libc.so.6
#7  0x0000000000a90868 in NdbEventBuffer::deleteUsedEventOperations (this=0x1d97c10)
    at NdbEventOperationImpl.cpp:1258
#8  0x0000000000a90c66 in NdbEventBuffer::nextEvent (this=0x1d97c10)
    at NdbEventOperationImpl.cpp:1227
#9  0x0000000000a66f3f in Ndb::nextEvent (this=0x1d95160) at Ndb.cpp:1479
#10 0x00000000007b6c45 in ndb_binlog_thread_func (arg=0x0) at ha_ndbcluster_binlog.cc:3571
#11 0x00000033b8c0697c in start_thread () from /lib64/libpthread.so.0
#12 0x00000033b83c992e in clone () from /lib64/libc.so.6
#13 0x0000000000000000 in ?? ()
(gdb)

How to repeat:
-- source include/have_ndb.inc
--disable_warnings
DROP DATABASE IF EXISTS util;
DROP TABLE IF EXISTS util.t1;
--enable_warnings

CREATE DATABASE util;

CREATE LOGFILE GROUP lg
ADD UNDOFILE 'undofile.dat'
INITIAL_SIZE 16M
UNDO_BUFFER_SIZE = 1M
ENGINE=NDB;

CREATE TABLESPACE ts
ADD DATAFILE 'datafile.dat'
USE LOGFILE GROUP lg
INITIAL_SIZE 12M
ENGINE NDB;

CREATE TABLE util.t1
(a1 INT NOT NULL PRIMARY KEY, a2 CHAR(50)) TABLESPACE ts STORAGE DISK ENGINE=NDB;

let $j= 500;
--disable_query_log
while ($j)
{
  eval INSERT INTO util.t1 VALUES ($j, "aaaaaaaaaaaa$j");
  dec $j;
}
--enable_query_log
SELECT COUNT(*) FROM util.t1;

--echo ******** Restart each node *****************
--echo **** Restart node 2 and check status *****
--echo ndb_mgm <id=2> restart
--exec $NDB_MGM localhost:$NDBCLUSTER_PORT --execute="2 restart"
--sleep 20

--echo ndb_mgm ALL STATUS
--exec $NDB_MGM localhost:$NDBCLUSTER_PORT --execute="ALL STATUS"

--echo **** Restart node 1 and check status *****
--echo ndb_mgm <id=1> restart
--exec $NDB_MGM localhost:$NDBCLUSTER_PORT --execute="1 restart"
--sleep 20

--echo ndb_mgm ALL STATUS
--exec $NDB_MGM localhost:$NDBCLUSTER_PORT --execute="ALL STATUS"

DROP TABLE util.t1;

ALTER TABLESPACE ts
DROP DATAFILE 'datafile.dat'
ENGINE = NDB;

DROP TABLESPACE ts
ENGINE = NDB;

DROP LOGFILE GROUP lg
ENGINE = NDB;
[28 Jun 2006 2:13] Nikolay Grishakin
The dump file and cluster logs have been copied to ndbdev@ndbmaster.mysql.com:/bugs/bug20742/.
No ndb_XX_error.log or ndb_XX_trace.log files were found.
[28 Jun 2006 2:17] Nikolay Grishakin
The master.log file contains the following:

mysqld: NdbEventOperationImpl.cpp:1258: void NdbEventBuffer::deleteUsedEventOperations(): Assertion `op->m_ref_count > 0' failed.
mysqld got signal 6;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=2
max_connections=100
threads_connected=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 39420 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
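
For readers unfamiliar with this kind of check: the failed assertion, `op->m_ref_count > 0`, is a reference-count invariant. The sketch below is a hypothetical Python model of such an invariant, not the NDB implementation and not a statement about the root cause of this bug. The class `EventOperation` and its methods are invented for illustration only.

```python
class EventOperation:
    """Hypothetical stand-in for a reference-counted event operation."""

    def __init__(self, name):
        self.name = name
        self.ref_count = 1  # assumption: the creator holds the first reference

    def add_ref(self):
        self.ref_count += 1

    def release(self):
        # Same shape as the failed check: releasing is only valid while
        # at least one reference is still outstanding.
        if self.ref_count <= 0:
            raise AssertionError(f"{self.name}: ref_count > 0 failed")
        self.ref_count -= 1
        return self.ref_count == 0  # True once the object may be deleted


def demo():
    op = EventOperation("t1_events")
    op.add_ref()                  # a second holder takes a reference
    freed_first = op.release()    # one holder lets go; object stays alive
    freed_second = op.release()   # last holder lets go; object may be deleted
    try:
        op.release()              # one release too many trips the check
    except AssertionError:
        over_release_detected = True
    else:
        over_release_detected = False
    return freed_first, freed_second, over_release_detected
```

In this model, an extra release after the count has reached zero is what trips the check. Whether the crash here comes from an extra release or from some other path is not established by the log alone.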
[4 Jul 2006 8:33] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/8681
[4 Jul 2006 13:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/8698
[4 Jul 2006 13:49] Tomas Ulin
The following reproduces the crash easily:

ndb/ndbcluster --initial --small --diskless
./mysql-test-run.pl --debug --mysqld=--ndbcluster --start-and-exit alias

create table t1 (a int) engine=ndb;
create table t2 (a int) engine=ndb;
create table t3 (a int) engine=ndb;
create table t4 (a int) engine=ndb;
create table t5 (a int) engine=ndb;
create table t6 (a int) engine=ndb;
insert into t1 values (1);
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t2 select * from t1;
insert into t3 select * from t1;
insert into t4 select * from t1;
insert into t5 select * from t1;
insert into t6 select * from t1;

- restart one node

drop table t1, t2,t3,t4,t5,t6;
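
As a rough guide to how much data the statements above create, the following Python sketch counts rows. It assumes every statement succeeds and that no other session touches the tables. The function name is hypothetical.

```python
def rows_after_doubling(initial_rows, doublings):
    """Row count after repeated `insert into t1 select * from t1`."""
    rows = initial_rows
    for _ in range(doublings):
        rows += rows  # each statement appends a copy of every existing row
    return rows


# One initial row, then eight self-inserts on t1.
t1_rows = rows_after_doubling(1, 8)

# t2 through t6 each receive a full copy of t1, so six tables hold the same count.
total_rows = t1_rows * 6

print(t1_rows, total_rows)  # prints "256 1536"
```

So the reproduction needs only 256 rows in each of six NDB tables before the node restart and the final drop.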
[4 Jul 2006 14:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/8700
[4 Jul 2006 23:14] Tomas Ulin
Pushed to 5.1.12. No need to document: the bug is not in any released version.