Bug #76023 CLUSTER CIRCULAR REPLICATION WITH IGNORE_SERVER_IDS() BROKEN BY ANONYMOUS_GTIDS
Submitted: 24 Feb 2015 11:23 Modified: 9 Jul 2015 9:38
Reporter: Magnus Blåudd Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.7.6 OS:Any
Assigned to: Magnus Blåudd CPU Architecture:Any

[24 Feb 2015 11:23] Magnus Blåudd
Description:
Circular 2 channel cluster replication is broken and it's no longer possible to wait for the second pair of mysqld(s) to synch their replication. This is since the Exec_master_log_pos counter will never reach the value of end position in masters binlog. 

The binlog for this "second pair" is generated by a mysqld which listens to what's going on in the cluster and injects these changes to the binlog. The affected query is a DROP TABLE and is not injected using rpl_injector API.

Suspicion is that the Anonymous_Gtid event is logged with wrong server_id(i.e not using the server_id of the originating mysqld but it's own) but it may also be the case that it's just the last statement in the binlog.

How to repeat:
Circular cluster replication with 2 channels.

mysql> DROP TABLE t1;

   [master]
    server_id: 1
     -> binlog ------------> [slave]               
                              server_id: 2
                                |
                                v
    (cluster1)            (cluster2)
                                |
                                v  
                             [slave1]
                              server_id: 4
                                |
                                v
    [master1] <---------------binlog
    server_id: 3
     ^^
     This mysqld ignores changes from server_id: 1
     ignores the DROP TABLE t1
     STOP SLAVE;
     CHANGE MASTER IGNORE_SERVER_IDS();
     START SLAVE;
     Now it replays the DROP TABLE t1 and SQL thread stops!
     this has not happened before!
     Suspicion is that the Anonyous_Gtid with server_id 4 causes the problem

a) master -> slave

SHOW BINLOG EVENTS;
<snip>
master-bin.000001       22425   Anonymous_Gtid  1       22490   SET @@SESSION.GTID_NEXT= 'ANONYMOUS'
master-bin.000001       22490   Query   1       22610   use `test`; DROP TABLE `t1` /* generated by server */

The slave reads and executes the above two events since it does NOT ignore server_id 1.

SHOW SLAVE STATUS;
<snip>
Read_Master_Log_Pos     22610
Exec_Master_Log_Pos     22610

Also, there is another mysqld(slave1) connected to the same cluster as slave. It generates another binlog which will be read by another mysqld(master1) connected to the first cluster.

b) slave1 -> master1

SHOW BINLOG EVENTS;
<snip>
master-bin.000001       20453   Anonymous_Gtid  4       20518   SET @@SESSION.GTID_NEXT= 'ANONYMOUS'
master-bin.000001       20518   Query   1       20607   use `test`; drop table `test`.`t1`

The master1 mysqld does not read and execute these two last statements.

SHOW SLAVE STATUS;
<snip>
Read_Master_Log_Pos     20607
Exec_Master_Log_Pos     20453

This means that if the master1 mysqld replication is stopped, the IGNORE_SERVER_IDS() removed and then restarted those two last statements will be replayed and replication will stop since no such table exists.

Suggested fix:
Looking at the code in ha_ndbcluster_binlog.cc it surprises me that these DDL statements are binlogged using THD::binlog_query() and not the rpl_injector API. But I assume the latter is strictly for row logging. Since the THD::binlog_query() does not have any parameter for which server_id this statement came from the caller set THD::server_id before logging the query. Could it be the case that this THD::server_id is not honoured when writing the Anonymous_gtid?

Simplified example:
    thd->server_id = loggedServerId;
    thd->reset_db("test");
    int errcode = query_error_code(thd, thd->killed == THD::NOT_KILLED);
    thd->binlog_query(THD::STMT_QUERY_TYPE, "DROP TABLE t1",
                      27, FALSE,
                      TRUE,
                      schema->name[0] == 0 || thd->db().str[0] == 0,
                      errcode);

    // Commit the binlog write
    (void)trans_commit_stmt(thd);

Alternatively this behaviour could be "by design" and in such case I'd like an explanation how it works so it can be evaluated if this behaviour is suitable.
[26 Feb 2015 8:59] Sven Sandberg
Posted by developer:
 
From the output you paste it looks like the following happens:

1. master issues DROP TABLE and writes to its binlog.
2. slave replicates DROP TABLE.
3. slave1 writes DROP TABLE to binlog.
4. master1 receives DROP TABLE but discards it before writing to the relay log, because both Anonymous_log_event and Query_log_event(DROP) use server_id=1, and IGNORE_SERVER_IDS=(1) on master1. master1 does update the position of the receiver thread (aka IO thread), since Read_master_log_pos=20607 and on slave1's binlog the Query_log_event ends at position 20607. The applier thread (aka SQL thread) has not update the position, but this should not be a problem. There is currently no mechanism for the receiver thread to update the position of the receiver thread when events have been ignored by IGNORE_SERVER_IDS.

So far all looks ok. Since master1's receiver thread is past the Query_log_event(DROP), and the Query_log_event is not in the relay log, the next event that master1 should receive is the one after the Query_log_event(DROP).

So the question is what happens next. Why does master1 receive the Query_log_event(DROP), when it was positioned after the event? I guess at some point your test case starts a slave thread that receives the event. What positions does SHOW SLAVE STATUS report just before this START SLAVE command?
[9 Jul 2015 9:38] Jon Stephens
Fixed in the NDB-next branch, currently tagged mysql-5.7.8-ndb-7.5.0.

This issue occurs only when using NDB storage engine with 5.7 Server but is not
reported in any supported release. Since the fix merely restores the expected
behaviour, there is also no user-visible behaviour change to document.

Closed without further action.