Bug #103334 Replication starts at wrong position when server-id same is on master and slave
Submitted: 15 Apr 2021 13:52 Modified: 21 Apr 2021 15:34
Reporter: lalit Choudhary Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:8.0,5.7 OS:Any
Assigned to: CPU Architecture:Any

[15 Apr 2021 13:52] lalit Choudhary
Description:
If on in a working master-slave set-up server-id of slave is accidentally set to same as master, and later corrected to be different, SQL thread starts replication at a wrong position.

How to repeat:
Steps to repeat:

1. Use dbdeployer/sandbox to deploy any affected version, master-slave replication.
2. create table test.t1 (id int unsigned primary key auto_increment); on master
Write some data on master: while true; do m -BNe 'insert into test.t1 value(null)' ; done

3. Use delayed replication to have some lag on slave: 
stop slave; change master to master_delay=100; start slave;

4. Execute FLUSH LOGS; on master a few times, so SQL thread is executing different log than IO thread is writing.

5. Stop slave instance, modify config to have same server-id on slave and master, start slave instance.

6. Observe replication fails because of same server-id

7.Correct server-id on slave, and restart slave instance

8. Stop writing on master (ctrl+C the while loop from above), and remove master delay using 
stop slave; change master to master_delay=0; start slave;

9.Wait for replication to catch up, and compare number of rows on master and slave. In my test.t1 table master had 8386 and slave 5684 rows after replication had caught up

Example:
$ make_multiple_sandbox --how_many_nodes=2 mysql-8.0.22-linux-glibc2.12-x86_64.tar.gz 
installing node 1
installing node 2

node1: server-id=101
node2: server-id=102

CHANGE MASTER TO MASTER_HOST='127.0.0.1',MASTER_USER='msandbox',MASTER_PASSWORD='msandbox',MASTER_PORT=17223,MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=156;

node1: server-id=101
node2: server-id=101

Error log when replication fails because of same server id:

2021-04-15T13:11:21.898269Z 5 [System] [MY-010562] [Repl] Slave I/O thread for channel '': connected to master 'msandbox@127.0.0.1:17223',replication started in log 'mysql-bin.000005' at position 285027
2021-04-15T13:11:21.898715Z 5 [ERROR] [MY-013117] [Repl] Slave I/O for channel '': Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids; these ids must be different for replication to work (or the --replicate-same-server-id option must be used on slave but this does not always make sense; please check the manual before using it). Error_code: MY-013117
2021-04-15T13:11:21.900198Z 0 [System] [MY-010931] [Server] /home/lalit/mysql_tar/mysql/8.0.22/bin/mysqld: ready for connections. Version: '8.0.22-13'  socket: '/tmp/mysql_sandbox17224.sock'  port: 17224  Percona Server (GPL), Release 13, Revision 6f7822f.
2021-04-15T13:12:29.538896Z 11 [System] [MY-013172] [Server] Received SHUTDOWN from user msandbox. Shutting down mysqld (Version: 8.0.22-13).
2021-04-15T13:12:30.945148Z 0 [System] [MY-010910] [Server] /home/lalit/mysql_tar/mysql/8.0.22/bin/mysqld: Shutdown complete (mysqld 8.0.22-13)  Percona Server (GPL), Release 13, Revision 6f7822f.
2021-04-15T13:12:31.014461Z mysqld_safe mysqld from pid file /home/lalit/sandboxes/multi_msb_mysql-8_0_22/node2/data/mysql_sandbox17224.pid ended
2021-04-15T13:12:32.688943Z mysqld_safe Logging to '/home/lalit/sandboxes/multi_msb_mysql-8_0_22/node2/data/msandbox.err'.
2021-04-15T13:12:32.709530Z mysqld_safe Starting mysqld daemon with databases from /home/lalit/sandboxes/multi_msb_mysql-8_0_22/node2/data
2021-04-15T13:12:32.984457Z 0 [System] [MY-010116] [Server] /home/lalit/mysql_tar/mysql/8.0.22/bin/mysqld (mysqld 8.0.22-13) starting as process 12873
2021-04-15T13:12:32.995263Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2021-04-15T13:12:33.147621Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2021-04-15T13:12:34.268224Z 0 [ERROR] [MY-011292] [Server] Plugin mysqlx reported: 'Preparation of I/O interfaces failed, X Protocol won't be accessible'
2021-04-15T13:12:34.268615Z 0 [ERROR] [MY-011300] [Server] Plugin mysqlx reported: 'Setup of bind-address: '*' port: 33060 failed, `bind()` failed with error: Address already in use (98). Do you already have another mysqld server running with Mysqlx ?'
2021-04-15T13:12:34.268942Z 0 [ERROR] [MY-013597] [Server] Plugin mysqlx reported: 'Value '*' set to `Mysqlx_bind_address`, X Plugin can't bind to it. Skipping this value.'
2021-04-15T13:12:34.269280Z 0 [ERROR] [MY-011300] [Server] Plugin mysqlx reported: 'Setup of socket: '/tmp/mysqlx.sock' failed, another process with PID 31277 is using UNIX socket file'
2021-04-15T13:12:34.401782Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2021-04-15T13:12:34.403294Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
2021-04-15T13:12:34.470703Z 5 [Warning] [MY-010897] [Repl] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2021-04-15T13:12:34.473527Z 5 [System] [MY-010562] [Repl] Slave I/O thread for channel '': connected to master 'msandbox@127.0.0.1:17223',replication started in log 'mysql-bin.000005' at position 285027
2021-04-15T13:12:34.480215Z 0 [System] [MY-010931] [Server] /home/lalit/mysql_tar/mysql/8.0.22/bin/mysqld: ready for connections. Version: '8.0.22-13'  socket: '/tmp/mysql_sandbox17224.sock'  port: 17224  Percona Server (GPL), Release 13, Revision 6f7822f.
2021-04-15T13:13:35.113439Z 11 [Warning] [MY-010897] [Repl] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2021-04-15T13:13:35.115841Z 11 [System] [MY-010562] [Repl] Slave I/O thread for channel '': connected to master 'msandbox@127.0.0.1:17223',replication started in log 'mysql-bin.000005' at position 125

----------------------------------------------------------
Note how SQL thread at first stops at "in log 'mysql-bin.000005' at position 285027" but after fixing server-id, we get "starting replication in log 'mysql-bin.000005' at position 125". 

It looks like SQL Thread "forgot" its position after error.
----------------------------------------------------------

node2 [localhost] {msandbox} ((none)) > show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: 127.0.0.1
                  Master_User: msandbox
                  Master_Port: 17223
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000005
          Read_Master_Log_Pos: 285027
               Relay_Log_File: mysql-relay.000010
                Relay_Log_Pos: 285211
        Relay_Master_Log_File: mysql-bin.000005
             Slave_IO_Running: No
            Slave_SQL_Running: Yes

                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 284996
              Relay_Log_Space: 285673
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No

        Seconds_Behind_Master: 30
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 13117
                Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids; these ids must be different for replication to work (or the --replicate-same-server-id option must be used on slave but this does not always make sense; please check the manual before using it).
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 101
                  Master_UUID: 00017223-1111-1111-1111-111111111111
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 100
          SQL_Remaining_Delay: 70
      Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 210415 18:41:21
     Last_SQL_Error_Timestamp: 

                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
       Master_public_key_path: 
        Get_master_public_key: 0
            Network_Namespace: 
1 row in set, 1 warning (0.00 sec)

********
Fixed the server id issue:  

node2 [localhost] {msandbox} ((none)) > show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: msandbox
                  Master_Port: 17223
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000005
          Read_Master_Log_Pos: 1473200
               Relay_Log_File: mysql-relay.000002
                Relay_Log_Pos: 1473415
        Relay_Master_Log_File: mysql-bin.000005
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1473200
              Relay_Log_Space: 1473620
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No

  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 101
                  Master_UUID: 00017223-1111-1111-1111-111111111111
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400

1 row in set, 1 warning (0.00 sec)

Replication in sync but master-slave has different row count for t1 table.

node1 [localhost] {msandbox} (test) > select count(*) from t1;
+----------+
| count(*) |
+----------+
|     8386 |
+----------+
1 row in set (0.00 sec)

node2 [localhost] {msandbox} (test) > select count(*) from t1;
+----------+
| count(*) |
+----------+
|     5684 |
+----------+
1 row in set (0.00 sec)

Suggested fix:
Replication should resume with correct co-coordinate after fixing server-id issue.
[15 Apr 2021 14:31] lalit Choudhary
Test  Version: MySQL 8.0.22

1. Use dbdeployer/sandbox to deploy any affected version, master-slave replication.
2. create table test.t1 (id int unsigned primary key auto_increment); on master
Write some data on master: while true; do m -BNe 'insert into test.t1 value(null)' ; done

$ make_multiple_sandbox --how_many_nodes=2 mysql-8.0.22-linux-glibc2.12-x86_64.tar.gz 
installing node 1
installing node 2

node1: server-id=101
node2: server-id=102

node1 [localhost] {msandbox} ((none)) > show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 4773
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 
1 row in set (0.00 sec)

node1 [localhost] {msandbox} ((none)) > \s
--------------
/home/lalit/mysql_tar/mysql/8.0.22/bin/mysql  Ver 8.0.22 for Linux on x86_64 (MySQL Community Server - GPL)

Connection id:		10
Current database:	
Current user:		msandbox@localhost
SSL:			Not in use
Current pager:		stdout
Using outfile:		''
Using delimiter:	;
Server version:		8.0.22 MySQL Community Server - GPL

on slave: 
CHANGE MASTER TO MASTER_HOST='127.0.0.1',MASTER_USER='msandbox',MASTER_PASSWORD='msandbox',MASTER_PORT=17223,MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=4773;
start slave;

on master:
create table test.t1 (id int unsigned primary key auto_increment);

/sandboxes/multi_msb_mysql-8_0_22/node1$ while true; do ./use -BNe 'insert into test.t1 value(null)' ; done

3. Use delayed replication to have some lag on slave: 
stop slave; change master to master_delay=100; start slave;

4. Execute FLUSH LOGS; on master a few times, so SQL thread is executing different log than IO thread is writing.

5. Stop slave instance, modify config to have same server-id on slave and master, start slave instance.

node1: server-id=101
node2: server-id=101

6. Observe replication fails because of same server-id

2021-04-15T14:20:11.542164Z 5 [System] [MY-010562] [Repl] Slave I/O thread for channel '': connected to master 'msandbox@127.0.0.1:17223',replication started in log 'mysql-bin.000007' at position 262969
2021-04-15T14:20:11.542863Z 5 [ERROR] [MY-013117] [Repl] Slave I/O for channel '': Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids; these ids must be different for replication to work (or the --replicate-same-server-id option must be used on slave but this does not always make sense; please check the manual before using it). Error_code: MY-013117
2021-04-15T14:20:11.545402Z 0 [System] [MY-010931] [Server] /home/lalit/mysql_tar/mysql/8.0.22/bin/mysqld: ready for connections. Version: '8.0.22'  socket: '/tmp/mysql_sandbox17224.sock'  port: 17224  MySQL Community Server - GPL.

7.Correct server-id on slave, and restart slave instance
node1: server-id=101
node2: server-id=102

2021-04-15T14:20:59.491830Z 10 [System] [MY-013172] [Server] Received SHUTDOWN from user msandbox. Shutting down mysqld (Version: 8.0.22).
2021-04-15T14:21:00.913001Z 0 [System] [MY-010910] [Server] /home/lalit/mysql_tar/mysql/8.0.22/bin/mysqld: Shutdown complete (mysqld 8.0.22)  MySQL Community Server - GPL.
2021-04-15T14:21:01.004947Z mysqld_safe mysqld from pid file /home/lalit/sandboxes/multi_msb_mysql-8_0_22/node2/data/mysql_sandbox17224.pid ended
2021-04-15T14:21:02.659316Z mysqld_safe Logging to '/home/lalit/sandboxes/multi_msb_mysql-8_0_22/node2/data/msandbox.err'.
2021-04-15T14:21:02.680410Z mysqld_safe Starting mysqld daemon with databases from /home/lalit/sandboxes/multi_msb_mysql-8_0_22/node2/data
2021-04-15T14:21:02.945945Z 0 [System] [MY-010116] [Server] /home/lalit/mysql_tar/mysql/8.0.22/bin/mysqld (mysqld 8.0.22) starting as process 24803
2021-04-15T14:21:02.957155Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2021-04-15T14:21:03.288598Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.

2021-04-15T14:21:04.654852Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2021-04-15T14:21:04.655485Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
2021-04-15T14:21:04.764314Z 5 [Warning] [MY-010897] [Repl] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2021-04-15T14:21:04.767029Z 5 [System] [MY-010562] [Repl] Slave I/O thread for channel '': connected to master 'msandbox@127.0.0.1:17223',replication started in log 'mysql-bin.000007' at position 262969
2021-04-15T14:21:04.774672Z 0 [System] [MY-010931] [Server] /home/lalit/mysql_tar/mysql/8.0.22/bin/mysqld: ready for connections. Version: '8.0.22'  socket: '/tmp/mysql_sandbox17224.sock'  port: 17224  MySQL Community Server - GPL.

8. Stop writing on master (ctrl+C the while loop from above), and remove master delay using 
stop slave; change master to master_delay=0; start slave;

2021-04-15T14:23:23.624850Z 11 [Warning] [MY-010897] [Repl] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2021-04-15T14:23:23.625590Z 11 [System] [MY-010562] [Repl] Slave I/O thread for channel '': connected to master 'msandbox@127.0.0.1:17223',replication started in log 'mysql-bin.000007' at position 689872

9.Wait for replication to catch up, and compare number of rows on master and slave. In my test.t1 table master had 8386 and slave 5684 rows after replication had caught up

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: msandbox
                  Master_Port: 17223
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000007
          Read_Master_Log_Pos: 1582683
               Relay_Log_File: mysql-relay.000002
                Relay_Log_Pos: 893135
        Relay_Master_Log_File: mysql-bin.000007
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1582683
              Relay_Log_Space: 893340
              Until_Condition: None

        Seconds_Behind_Master: 0

node1 [localhost] {msandbox} (test) > select count(*) from t1;
+----------+
| count(*) |
+----------+
|    16686 |
+----------+
1 row in set (0.00 sec)

node2 [localhost] {msandbox} (test) > select count(*) from t1;
+----------+
| count(*) |
+----------+
|    14403 |
+----------+
1 row in set (0.01 sec)
[21 Apr 2021 15:34] MySQL Verification Team
Hello Lalit,

Thank you for the report and feedback!

Thanks,
Umesh