Bug #105400 replication_sender_observe_commit_only=ON leads to break semisync replication
Submitted: 1 Nov 2021 15:40 Modified: 3 Nov 2021 7:06
Reporter: Tsubasa Tanaka (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:8.0.27 OS:Debian
Assigned to: CPU Architecture:x86
Tags: MY-010207, MY-011178, MY-013117

[1 Nov 2021 15:40] Tsubasa Tanaka
Description:
Setting replication_sender_observe_commit_only into semisync-Source node, its Replica could not start semisync-replication after restart mysqld.

Because Binlog_sender::before_send_hook doesn't trigger repl_semi_before_send_event  when replication_sender_observe_commit_only=ON, semisync packet header is not in binary log.

Observer_transmission_guard::Observe_transmission_guard 
(opt_replication_sender_observe_commit_only == false then m_observer_transmission == false)

->Binlog_sender::before_send_hook (call repl_semi_before_send_event when m_observer_transmission == true)

-->repl_semi_before_send_event // Didn't be triggered
--->updateSyncHeader (magic number for semisync packet is included here // Didn't be triggered

How to repeat:
### Setup async replication

$ mysqlsh
 MySQL  JS > dba.deploySandboxInstance(3306)
 MySQL  JS > dba.deploySandboxInstance(3307)

 MySQL  JS > \c root@localhost:3306
 MySQL  localhost:3306 ssl  JS > rs = dba.createReplicaSet('myrep')
 MySQL  localhost:3306 ssl  JS > rs.addInstance('root@localhost:3307')
 MySQL  localhost:3306 ssl  JS > \q
Bye!

### Add semisync replication settings, and restart MySQLs to apply them

$ cat << EOF >> ~/mysql-sandboxes/3306/my.cnf
[mysqld]
plugin_load=rpl_semi_sync_source=semisync_source.so
rpl_semi_sync_source_enabled= ON
replication_sender_observe_commit_only
EOF

$ cat << EOF >> ~/mysql-sandboxes/3307/my.cnf
[mysqld]
plugin_load=rpl_semi_sync_replica=semisync_replica.so
rpl_semi_sync_replica_enabled= ON
EOF

$ mysqlsh root@localhost:3306 --sql -e "RESTART"
$ mysqlsh root@localhost:3307 --sql -e "RESTART"

### Restart Replica during INSERTing Source.

$ mysqlslap -h 127.0.0.1 -P 3306 -u root --auto-generate-sql --auto-generate-sql-execute-number=10000

$ mysqlsh root@localhost:3307 --sql -e "RESTART"

### Replica has been broken.

$ mysqlsh root@localhost:3307 --sql
 MySQL  localhost:3307 ssl  SQL > SHOW REPLICA STATUS\G
*************************** 1. row ***************************
             Replica_IO_State:
                  Source_Host: 127.0.0.1
                  Source_User: mysql_innodb_rs_3231262344
                  Source_Port: 3306
                Connect_Retry: 60
              Source_Log_File: pc-001699-1-bin.000003
          Read_Source_Log_Pos: 196
               Relay_Log_File: pc-001699-1-relay-bin.000006
                Relay_Log_Pos: 423
        Relay_Source_Log_File: pc-001699-1-bin.000003
           Replica_IO_Running: No
          Replica_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Source_Log_Pos: 196
              Relay_Log_Space: 856515
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Source_SSL_Allowed: No
           Source_SSL_CA_File:
           Source_SSL_CA_Path:
              Source_SSL_Cert:
            Source_SSL_Cipher:
               Source_SSL_Key:
        Seconds_Behind_Source: NULL
Source_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 13117
                Last_IO_Error: Fatal error: Failed to run 'after_read_event' hook
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Source_Server_Id: 4039881886
                  Source_UUID: e986e20d-3b26-11ec-aa62-00155d7da582
             Source_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
    Replica_SQL_Running_State: Replica has read all relay log; waiting for more updates
           Source_Retry_Count: 86400
                  Source_Bind:
      Last_IO_Error_Timestamp: 211102 00:21:58
     Last_SQL_Error_Timestamp:
               Source_SSL_Crl:
           Source_SSL_Crlpath:
           Retrieved_Gtid_Set: e986e20d-3b26-11ec-aa62-00155d7da582:66-2110
            Executed_Gtid_Set: e986e20d-3b26-11ec-aa62-00155d7da582:1-2110
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Source_TLS_Version:
       Source_public_key_path:
        Get_Source_public_key: 1
            Network_Namespace:
1 row in set (0.0003 sec)

### Error-Log.

2021-11-01T15:21:58.677561Z 0 [System] [MY-010931] [Server] /home/yoku0825/mysql-sandboxes/3307/bin/mysqld: ready for connections. Version: '8.0.27'  socket: 'mysqld.sock'  port: 3307  MySQL Community Server - GPL.
2021-11-01T15:21:58.708971Z 5 [ERROR] [MY-011178] [Repl] Missing magic number for semi-sync packet, packet len: 45.
2021-11-01T15:21:58.708987Z 5 [ERROR] [MY-010207] [Repl] Run function 'after_read_event' in plugin 'rpl_semi_sync_replica' failed
2021-11-01T15:21:58.708997Z 5 [ERROR] [MY-013117] [Repl] Slave I/O for channel '': Fatal error: Failed to run 'after_read_event' hook, Error_code: MY-013117

Suggested fix:
I hope this is unexpected behavior and be fixed.

But if this behavior is designed, at least described "REPLICA WILL BE RAISE ERROR AND CAN'T CONTINUE REPLICATION WHEN THIS OPTION TURNS ON SOURCE" in the Doc.

https://dev.mysql.com/doc/refman/8.0/en/replication-options-replica.html#sysvar_replicatio...

Workaround:

### Start as async-replication
SET GLOBAL rpl_semi_sync_replica_enabled = 0;
START REPLICA;

### Re-enable as semisync-replication
SET GLOBAL rpl_semi_sync_replica_enabled = 1;
STOP REPLICA;
START REPLICA;
[3 Nov 2021 7:06] MySQL Verification Team
Hello Tanaka-San,

Thank you for the report and feedback!

regards,
Umesh