Bug #79077 Binlog contains schema events in incorrect order
Submitted: 2 Nov 2015 14:11 Modified: 1 Dec 2015 16:41
Reporter: Ole John Aske Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:7.4.8 OS:Any
Assigned to: CPU Architecture:Any

[2 Nov 2015 14:11] Ole John Aske
Description:
Seen in (But not limited to):

==================
ndb_binlog.ndb_binlog_ddl_multi          w1 [ fail ]
        Test ended at 2015-10-29 16:25:43

CURRENT_TEST: ndb_binlog.ndb_binlog_ddl_multi
--- /export/home2/pb2/test/sb_1-16905794-1446131492.97/mysql-cluster-gpl-7.3.12-linux-glibc2.5-x86_64/mysql-test/suite/ndb_binlog/r/ndb_binlog_ddl_multi.result	2015-10-29 16:46:09.000000000 +0300
+++ /ramdisk/mtr-23582/var-n_mix/1/log/ndb_binlog_ddl_multi.reject	2015-10-29 18:25:43.000000000 +0300
@@ -46,13 +46,13 @@
 create table t1 (a int primary key) engine=ndb;
 show binlog events from <binlog_start>;
 Log_name	Pos	Event_type	Server_id	End_log_pos	Info
+mysqld-bin.000001	#	Query	2	#	drop database mysqltest
 mysqld-bin.000001	#	Query	2	#	BEGIN
 mysqld-bin.000001	#	Table_map	2	#	table_id: # (test.t2)
 mysqld-bin.000001	#	Table_map	2	#	table_id: # (mysql.ndb_apply_status)
 mysqld-bin.000001	#	Write_rows	2	#	table_id: #
 mysqld-bin.000001	#	Write_rows	2	#	table_id: # flags: STMT_END_F
 mysqld-bin.000001	#	Query	2	#	COMMIT
-mysqld-bin.000001	#	Query	2	#	drop database mysqltest
 mysqld-bin.000001	#	Query	2	#	use `test`; create table t1 (a int primary key) engine=ndb
 drop table t2;
 reset master;
=======

The problem here seems to be a couple of assumption done in the binlog injector thread:

Assumption1: pollEvent() Will wake up immediately when an epoch has complteted, and
thus will only contain events from within a single epoch.

Wrong: The thread doing the poll can have been suspended for a longer time before the
poll such that ::trp_deliver() has already received multiple completed epoch before
pollEvent() is called. They will all be made available by the poll.

Assumption2: Another pollEvent() on the schema_ndb (s_ndb) after polling the 'i_ndb'
will not return events from any epochs after the epoch already returned by polling
the 'i_ndb'

Wrong: The thread doing this poll can have been suspended between the two poll
such that an arbitrarily number of epochs can have been completed and delivered
into the schema eventbuffer before the other poll.

The binlog injector applies these events, without further checking of which epoch
they belongs to. All schema events are applied first, then all 'normal' events.

If the polling of schema events included later epochs than the polling of the
'i_ndb', these (late) schema events are inserted *before* the non-schema events
instead of after them.

How to repeat:
Insert some random NdbSleep in the binlog injector loop before or between the 
calls to pollEvents()

Suggested fix:
Apply events one epoch at a time, starting with the lowest epoch first.
[1 Dec 2015 16:41] Jon Stephens
Documented fix in the NDB 7.2.23, 7.3.12, and 7.4.9 changelogs, as follows:

    Schema events were appended to the binary log out of order
    relative to non-schema events. This was caused by the fact that
    the binlog injector did not properly handle the case where
    schema events and non-schema events were from different epochs.

    This fix modifies the handling of events from the two schema and
    non-schema event streams such that events are now always handled
    one epoch at a time, starting with events from the oldest
    available epoch, without regard to the event stream in which
    they occur.

Closed.