Bug #41183 rpl_ndb_circular, rpl_ndb_circular_simplex need maintenance, crash
Submitted: 2 Dec 2008 19:17
Modified: 11 Feb 2009 16:25
Reporter: Matthias Leich
Status: Closed
Category: Tests: Cluster
Severity: S3 (Non-critical)
Version: 5.1
OS: Any
Assigned to: Andrei Elkin
CPU Architecture: Any

[2 Dec 2008 19:17] Matthias Leich
Description:
rpl_ndb_circular and rpl_ndb_circular_simplex have been
disabled in MySQL 5.1 for some time because of
   Bug#33849 COMMIT event missing in cluster

Bug#33849 was closed by Andrei Elkin.
A comment in that bug states:
    [27 Jun 12:31] Konstantin Osipov
    The bug was fixed by the fix for Bug#36197
    Bug#36197 is fixed.

Both tests were re-enabled in MySQL 6.0,
but t/disabled.def in MySQL 5.1 was not changed.

My attempts to run both tests in 5.1 failed with
a crash of the master server during execution of
a "COMMIT".
./mysql-test-run.pl --mem --mysqld=--binlog-format=row --testcase-timeout=20 --suite=rpl_ndb rpl_ndb_circular
...
TEST                           RESULT         TIME (ms)
-------------------------------------------------------
mysql-test-run: WARNING: Forcing kill of process 12330
mysql-test-run: WARNING: Forcing kill of process 12347
rpl_ndb.rpl_ndb_circular       [ fail ]

mysqltest: At line 34: failed in 'show engine ndb status': 2006 MySQL server has gone away

The result from queries just before the failure was:
stop slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
reset master;
reset slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
start slave;
RESET MASTER;
CHANGE MASTER TO master_host="127.0.0.1",master_port=SLAVE_PORT,master_user="root";
START SLAVE;
CREATE TABLE t1 (a int key, b int) ENGINE=ndb;
SHOW TABLES;
Tables_in_test
t1
INSERT INTO t1 VALUES (1,2);
INSERT INTO t1 VALUES (2,3);

I will attach my master.err.
Snip of its content:
081202 21:55:50 [Note] NDB Binlog: logging ./test/t1
mysqld: handler.cc:1081: int ha_commit_trans(THD*, bool): Assertion `thd->transaction.stmt.ha_list == __null || trans == &thd->transaction.stmt' failed.
081202 21:55:52 - mysqld got signal 6 ;

thd: 0x12ba538
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x40af80f0 thread_stack 0x40000
/work2/5.1/mysql-5.1-bugteam-combined/sql/mysqld(my_print_stacktrace+0x32)[0xaf204e]
/work2/5.1/mysql-5.1-bugteam-combined/sql/mysqld(handle_segfault+0x28a)[0x6b0579]
/lib64/libpthread.so.0[0x7fad6bd84b30]
/lib64/libc.so.6(gsignal+0x35)[0x7fad6af8a5c5]
/lib64/libc.so.6(abort+0x183)[0x7fad6af8bbb3]
/lib64/libc.so.6(__assert_fail+0xe9)[0x7fad6af831e9]
/work2/5.1/mysql-5.1-bugteam-combined/sql/mysqld(_Z15ha_commit_transP3THDb+0x107)[0x7f2161]
/work2/5.1/mysql-5.1-bugteam-combined/sql/mysqld(_Z9end_transP3THD25enum_mysql_completiontype+0x131)[0x6c15ae]
...

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x12cb48d = COMMIT
thd->thread_id=8
thd->killed=NOT_KILLED

My environment:
mysql-5.1-bugteam  last changeset ~ Nov 2008
Linux OpenSuSE 11.0 64 Bit, Intel Core2Duo 64 Bit

How to repeat:
See above

Suggested fix:
1. If this crash happens because required prerequisites
   are missing:
   a) modify the server so that the poor user gets
      a hint in master.err about what is missing
   b) modify the test so that
      - it checks whether the prerequisites exist
        (the usual include/have_...)
      or
      - the prerequisites are created
2. If this is a server bug:
   Please adjust t/disabled.def so that it points
   to this bug.
[2 Dec 2008 20:17] Matthias Leich
master.err

Attachment: master.err (application/octet-stream, text), 10.08 KiB.

[2 Dec 2008 20:21] Matthias Leich
I am unsure whether my Triage values are correct.
Please feel free to correct them.
[17 Dec 2008 16:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/61883

2737 Andrei Elkin	2008-12-17
      Bug #41183  rpl_ndb_circular, rpl_ndb_circular_simplex need maintenance, crash
      
      The bug happened because a STMT_END_F-flagged event was filtered out, so
      the COMMIT operation found traces of an incomplete statement commit.
      
      Fixed by adding a call to cleanup_context() when the COMMIT event is
      executed.
      An uncommitted statement can occur only with NDB, and only for the first
      query of the transaction. Moreover, when events are filtered out by
      SERVER_ID, the whole following group of events is filtered out without
      creating a statement to execute on the slave.
      Hence the commit-time clean-up has to deal with at most one uncommitted
      statement, which is what makes it safe.
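
For illustration only, the failure mode and the commit-time clean-up described in this commit message can be sketched as a toy model. This is not MySQL code: `Event`, `ToyApplier`, `run`, and the `filtered` and `cleanup_on_commit` names are hypothetical, the `stmt_end` field merely stands in for the STMT_END_F flag, and the sketch does not model why the flagged event gets filtered out while earlier events of the same statement are applied.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str                # "rows" or "commit"
    stmt_end: bool = False   # stands in for the STMT_END_F flag
    filtered: bool = False   # True if the slave skips this event


class ToyApplier:
    """Hypothetical stand-in for the slave applier; not the server's code."""

    def __init__(self, cleanup_on_commit: bool):
        self.cleanup_on_commit = cleanup_on_commit
        self.stmt_open = False   # resources of an unfinished statement
        self.committed = 0

    def apply(self, ev: Event) -> None:
        if ev.filtered:
            return                       # skipped events change no state
        if ev.kind == "rows":
            self.stmt_open = True        # a rows event allocates statement state
            if ev.stmt_end:
                self.stmt_open = False   # the flagged event ends the statement
        elif ev.kind == "commit":
            if self.cleanup_on_commit:
                self.stmt_open = False   # clean up leftovers before committing
            if self.stmt_open:
                # loosely mirrors the assertion that fired in ha_commit_trans
                raise AssertionError("statement still open at COMMIT")
            self.committed += 1


def run(events, cleanup_on_commit: bool) -> int:
    """Apply all events and return the number of committed transactions."""
    applier = ToyApplier(cleanup_on_commit)
    for ev in events:
        applier.apply(ev)
    return applier.committed


# One transaction whose statement-ending rows event is filtered out.
group = [
    Event("rows"),
    Event("rows", stmt_end=True, filtered=True),
    Event("commit"),
]
```

In this sketch, `run(group, cleanup_on_commit=False)` raises at the commit step, while `run(group, cleanup_on_commit=True)` commits the transaction because the leftover statement state is discarded first.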
[18 Dec 2008 12:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/61962

2737 Andrei Elkin	2008-12-18
      Bug #41183  rpl_ndb_circular, rpl_ndb_circular_simplex need maintenance, crash
            
      The bug happened because a STMT_END_F-flagged event was filtered out, so
      the transaction COMMIT found traces of an incomplete statement commit.
      This situation is only possible with NDB circular replication. The
      filtered-out rows event is the one that immediately precedes the COMMIT
      query event.
      
      Fixed by performing the rows-log-event statement commit when the
      transaction COMMIT event is executed.
      Resources allocated by events of the last statement other than the
      STMT_END_F-flagged one are cleaned up before the commit logic runs.
[18 Dec 2008 13:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/61975

2737 Andrei Elkin	2008-12-18
      Bug #41183  rpl_ndb_circular, rpl_ndb_circular_simplex need maintenance, crash
            
      The bug happened because a STMT_END_F-flagged event was filtered out, so
      the transaction COMMIT found traces of an incomplete statement commit.
      This situation is only possible with NDB circular replication. The
      filtered-out rows event is the one that immediately precedes the COMMIT
      query event.
      
      Fixed by performing the rows-log-event statement commit when the
      transaction COMMIT event is executed.
      Resources allocated by events of the last statement other than the
      STMT_END_F-flagged one are cleaned up before the commit logic runs.
[4 Feb 2009 10:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65108

2768 Andrei Elkin	2009-02-04
      Bug #41183  rpl_ndb_circular, rpl_ndb_circular_simplex need maintenance, crash
                  
      The bug happened because a STMT_END_F-flagged event was filtered out, so
      the transaction COMMIT found traces of an incomplete statement commit.
      This situation is only possible with NDB circular replication. The
      filtered-out rows event is the one that immediately precedes the COMMIT
      query event.
      
      Fixed by performing the rows-log-event statement commit when the
      transaction COMMIT event is executed.
      Resources allocated by events of the last statement other than the
      STMT_END_F-flagged one are cleaned up before the commit logic runs.
[4 Feb 2009 11:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65121

2769 Andrei Elkin	2009-02-04
      Bug #41183  rpl_ndb_circular, rpl_ndb_circular_simplex need maintenance, crash
      
      Fixing a build issue caused by the previous push.
[4 Feb 2009 12:00] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65132

3011 Andrei Elkin	2009-02-04 [merge]
      Merge 5.1-bt to 6.0-bt local branch for bug#41183
[4 Feb 2009 12:10] Andrei Elkin
Pushed to bt: 5.1,6.0.
[9 Feb 2009 22:33] Bugs System
Pushed into 5.1.32 (revid:davi.arnaut@sun.com-20090209214102-gj3sb3ujpnvpiy4c) (version source revid:davi.arnaut@sun.com-20090209214102-gj3sb3ujpnvpiy4c) (merge vers: 5.1.32) (pib:6)
[11 Feb 2009 16:25] Jon Stephens
Test failure only. No user-facing changes. Closed without further action.
[14 Feb 2009 13:01] Bugs System
Pushed into 6.0.10-alpha (revid:matthias.leich@sun.com-20090212211028-y72faag15q3z3szy) (version source revid:aelkin@mysql.com-20090204115811-of0oca2ezkagri7j) (merge vers: 6.0.10-alpha) (pib:6)
[17 Feb 2009 14:58] Bugs System
Pushed into 5.1.32-ndb-6.3.23 (revid:tomas.ulin@sun.com-20090217131017-6u8qz1edkjfiobef) (version source revid:tomas.ulin@sun.com-20090216083408-rmvyaxjt6mk8sg1y) (merge vers: 5.1.32-ndb-6.3.23) (pib:6)
[17 Feb 2009 16:46] Bugs System
Pushed into 5.1.32-ndb-6.4.3 (revid:tomas.ulin@sun.com-20090217134419-5ha6xg4dpedrbmau) (version source revid:tomas.ulin@sun.com-20090216083646-m8st11oj1hhfuuh5) (merge vers: 5.1.32-ndb-6.4.3) (pib:6)
[17 Feb 2009 18:22] Bugs System
Pushed into 5.1.32-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090217134216-5699eq74ws4oxa0j) (version source revid:tomas.ulin@sun.com-20090211111208-wf0acl7c1vl5653e) (merge vers: 5.1.32-ndb-6.2.17) (pib:6)