Bug #91744 START SLAVE UNTIL going further than it should.
Submitted: 20 Jul 15:27    Modified: 26 Sep 4:23
Reporter: Jean-François Gagné
Status: Verified
Category: MySQL Server: Replication    Severity: S2 (Serious)
Version: 5.7.22, 5.7.23    OS: Any
Assigned to:    CPU Architecture: Any

[20 Jul 15:27] Jean-François Gagné
Description:
Hi,

I found a case where START SLAVE UNTIL does not stop at the requested position but a little further.  In my case, this is not a big problem (it happens on the primary of a master/master setup where only the primary takes writes), but it is very scary (I do not know what would happen if writes were pointed at both masters).  More details in How to repeat.

Many thanks for looking into that,

JFG

How to repeat:
I have a master/master (cyclic) replication setup like this (details are in the file setup.txt in the comments):

- Writes happen on “master” (M),
- “slave1” (S1) replicates from M,
- M replicates from S1,
- and “slave2” (S2) replicates from M.

I want to add S2 to the cycle, so that M replicates from S2, S1 replicates from M, and S2 replicates from S1.

Note: I am not using GTID or Parallel Replication.  If I were, the commands below would be different.

To achieve the above, I do the following (details are in the file repointing.txt in the comments):

1) On M: STOP SLAVE; SHOW SLAVE STATUS\G
2) On S2:
  2.1) Making sure it is ahead of M: SELECT MASTER_POS_WAIT(<the position of SHOW SLAVE STATUS from M in #1>);
  2.2) STOP SLAVE; SHOW SLAVE STATUS\G SHOW MASTER STATUS\G START SLAVE;
3) Back on M: 
  3.1) START SLAVE UNTIL <the position of SHOW SLAVE STATUS from S2 in #2.2>;
  3.2) SELECT MASTER_POS_WAIT(<the position of START SLAVE UNTIL in #3.1>);
  3.3) STOP SLAVE; CHANGE MASTER TO <the position of SHOW MASTER STATUS from S2 in #2.2>; START SLAVE; 
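
The statements run on M in steps 3.1 to 3.3 can be sketched as a small generator. This is only an illustration of the procedure above, not a tool from this report: the function name `repoint_statements` and its parameters are hypothetical, and it assumes CHANGE MASTER TO also needs the host and port of S2, which the steps above do not spell out. The UNTIL coordinates in the usage example are the ones from this report; the S2 coordinates are placeholders.

```python
def repoint_statements(until_file, until_pos, new_host, new_port, new_file, new_pos):
    # Hypothetical helper: build the statements for steps 3.1 to 3.3 on M.
    # until_file / until_pos: S2's SHOW SLAVE STATUS coordinates from step 2.2.
    # new_file / new_pos:     S2's SHOW MASTER STATUS coordinates from step 2.2.
    until_pos, new_port, new_pos = int(until_pos), int(new_port), int(new_pos)
    return [
        # 3.1) replicate from S1 up to the point where S2 was stopped
        f"START SLAVE UNTIL MASTER_LOG_FILE = '{until_file}', "
        f"MASTER_LOG_POS = {until_pos};",
        # 3.2) wait until that position has been executed
        f"SELECT MASTER_POS_WAIT('{until_file}', {until_pos});",
        # 3.3) repoint M to S2 (host and port are assumptions, see above)
        "STOP SLAVE;",
        f"CHANGE MASTER TO MASTER_HOST = '{new_host}', MASTER_PORT = {new_port}, "
        f"MASTER_LOG_FILE = '{new_file}', MASTER_LOG_POS = {new_pos};",
        "START SLAVE;",
    ]


# UNTIL coordinates from this report; S2 host, port, file and position are placeholders.
statements = repoint_statements(
    "mysql-bin.000002", 143517, "127.0.0.1", 3306, "mysql-bin.000001", 4
)
for statement in statements:
    print(statement)
```

Running SHOW SLAVE STATUS\G between the second and third statement is the extra check that exposed the problem described next.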

But being paranoid, I did a SHOW SLAVE STATUS\G between #3.2 and #3.3 and I did not like what I saw (all details in repointing.txt):

master [localhost] {msandbox} (test) > START SLAVE UNTIL MASTER_LOG_FILE = 'mysql-bin.000002', MASTER_LOG_POS = 143517;
Query OK, 0 rows affected, 1 warning (0.01 sec)

master [localhost] {msandbox} (test) > SELECT MASTER_POS_WAIT('mysql-bin.000002', 143517);
+---------------------------------------------+
| MASTER_POS_WAIT('mysql-bin.000002', 143517) |
+---------------------------------------------+
|                                        NULL |
+---------------------------------------------+
1 row in set (0.00 sec)

master [localhost] {msandbox} (test) > SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: rsandbox
                  Master_Port: 16746
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 233734
               Relay_Log_File: mysql-relay.000007
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
[...]
          Exec_Master_Log_Pos: 175953    <<<<<=====----- This does not match Until_Log_Pos below or the START SLAVE UNTIL above.
              Relay_Log_Space: 689
              Until_Condition: Master
               Until_Log_File: mysql-bin.000002
                Until_Log_Pos: 143517
[...]
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 200
                  Master_UUID: 00016746-2222-2222-2222-222222222222
             Master_Info_File: /home/jgagne/sandboxes/rsandbox_5_7_22/master/data/master.info
[...]
1 row in set (0.00 sec)

See how Exec_Master_Log_Pos (175953) is larger than the position requested in START SLAVE UNTIL (143517).  I would expect it to be exactly the value given to START SLAVE UNTIL.
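
The manual check above could be automated along these lines. This is a minimal sketch, assuming the vertical (\G) output of SHOW SLAVE STATUS is available as text; the helper names `parse_slave_status` and `until_overshoot` are hypothetical. The sample uses only the four relevant fields from the output above.

```python
import re


def parse_slave_status(text):
    # Collect the "Key: value" lines of vertical (\G) SHOW SLAVE STATUS output.
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(\w+):\s*(.*?)\s*$", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def until_overshoot(fields):
    # Bytes by which Exec_Master_Log_Pos went past Until_Log_Pos (0 if it did not).
    # The positions are only comparable within the same binary log file.
    if fields["Relay_Master_Log_File"] != fields["Until_Log_File"]:
        raise ValueError("executed and UNTIL positions are in different log files")
    executed = int(fields["Exec_Master_Log_Pos"])
    until = int(fields["Until_Log_Pos"])
    return max(0, executed - until)


# Values copied from the SHOW SLAVE STATUS output in this report.
SAMPLE = """
        Relay_Master_Log_File: mysql-bin.000002
          Exec_Master_Log_Pos: 175953
               Until_Log_File: mysql-bin.000002
                Until_Log_Pos: 143517
"""

overshoot = until_overshoot(parse_slave_status(SAMPLE))
print(overshoot)  # 32436
```

A non-zero result before step 3.3 means the SQL thread executed events beyond the requested stop position, which is the behaviour reported here.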
[20 Jul 15:27] Jean-François Gagné
Cyclic replication setup.

Attachment: setup.txt (text/plain), 8.77 KiB.

[20 Jul 15:27] Jean-François Gagné
Repointing details.

Attachment: repointing.txt (text/plain), 8.38 KiB.

[20 Jul 15:28] Jean-François Gagné
Updating version to fix bad cut-and-paste.
[26 Sep 4:23] Umesh Shastry
Hello Jean-François,

Thank you for the report.

regards,
Umesh