Bug #82499 Seconds_behind_master is sometimes not NULL despite Slave_IO_Running=NO
Submitted: 9 Aug 2016 5:32 Modified: 14 Sep 2020 21:49
Reporter: Alexander Nemirovsky Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.6.31 OS:Linux
Assigned to: CPU Architecture:Any
Tags: Seconds_Behind_Master, show slave status

[9 Aug 2016 5:32] Alexander Nemirovsky
Description:
In a master-master replication environment one of our servers had the following slave status:

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: xxx
                  Master_User: xxx
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: 
          Read_Master_Log_Pos: 4
               Relay_Log_File: xxx-relay-bin.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: 
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 0
              Relay_Log_Space: 151
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: Yes
           Master_SSL_CA_File: /etc/mysql/sslcerts/ca.pem
           Master_SSL_CA_Path: 
              Master_SSL_Cert: /etc/mysql/sslcerts/client-cert.pem
            Master_SSL_Cipher: 
               Master_SSL_Key: /etc/mysql/sslcerts/client-key.pem
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 402
                  Master_UUID: 7293b275-0240-11e6-a18a-06371f3aa573
             Master_Info_File: /data/mysql/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 160809 04:15:49
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 099edaa6-1b6f-11e6-85c0-0283b37476a9:1-1380,
1581d774-c657-11e5-9ade-0213de940343:1-2573,
1f942fdc-c651-11e5-9ab7-02e93de56e4d:1-3973,
224994b6-5c79-ee1a-5994-15e4bc903527:1-13198644:13198646-15451548,
24198008-b330-ee1a-6ab1-41524bb53d9b:1-19337,
2dbb1d70-e1fa-11e5-8f16-0afb93aba71d:1-192,
4944e215-9103-11e5-bf20-06a52b8e9e6b:1-556,
49be0d8e-efa5-11e5-a836-066666e70ad5:1-542,
4c45bd2e-fbb9-11e5-b6f9-062e0ddc433b:1-671,
6f41a971-b3ec-ee1a-6c3a-bc2ef8d71400:1-444728901,
7293b275-0240-11e6-a18a-06371f3aa573:1-2720,
7f511eeb-a389-11e5-b7ec-06fa8452131f:1-2859,
850ddac5-a378-11e5-b77d-06a52b8e9e6b:1:553-559,
905a4dec-4cbc-11e5-81e9-06a52b8e9e6b:1-7:1353-1406:1408-1519,
a1d04ccd-a306-11e5-b496-0e02031c7b97:1-1585:1587-8934,
a5c78a06-e1fb-11e5-8f1f-0a4412565503:1-4010,
b2f70e07-105a-ee1a-7055-403e038ce0e3:1-11753888,
b51b186b-015b-11e6-9bb7-06a2eba86f2d:1-22,
c2222a4e-4cc2-11e5-8212-06b602665ba5:1-7,
c8055c89-e9fd-11e5-8358-066666e70ad5:1-167,
d82c990b-90ed-11e5-be94-06b602665ba5:1-6220,
d8f4b274-3261-11e4-90ba-22000a9bc2b7:1-1788,
da82482d-a386-11e5-b7da-06a52b8e9e6b:1-2887,
dc1c8b79-57a7-11e5-891d-062e0ddc433b:1-3667,
dc877bce-39ae-ee1a-48b9-a8b5119100c4:1-13957485,
e3320492-34be-11e4-a023-22000a94905e:1-319311492,
e7e73333-530b-11e5-ab0e-0e02031c7b97:1-1790
                Auto_Position: 1
1 row in set (0.00 sec)

There is a replication error and Slave_IO_Running=No, but we are still seeing Seconds_behind_master as current and not NULL as expected.

How to repeat:
Was unable to reproduce by stopping IO_THREAD:

mysql> stop slave IO_THREAD;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: xxx
                  Master_User: xxx
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.003776
          Read_Master_Log_Pos: 468314580
               Relay_Log_File: xxx.000003
                Relay_Log_Pos: 362811
        Relay_Master_Log_File: binlog.003776
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 468314580
              Relay_Log_Space: 363200
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: Yes
           Master_SSL_CA_File: /etc/mysql/sslcerts/ca.pem
           Master_SSL_CA_Path: 
              Master_SSL_Cert: /etc/mysql/sslcerts/client-cert.pem
            Master_SSL_Cipher: 
               Master_SSL_Key: /etc/mysql/sslcerts/client-key.pem
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 402
                  Master_UUID: 7293b275-0240-11e6-a18a-06371f3aa573
             Master_Info_File: /data/mysql/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 224994b6-5c79-ee1a-5994-15e4bc903527:15726640-15726664,
6f41a971-b3ec-ee1a-6c3a-bc2ef8d71400:444745110-444745114:444745117:444745119-444745122:444745124-444745133:444745136:444745138-444745141:444745143-444745146:444745148:444745151-444745158:444745160:444745162-444745163:444745166-444745169:444745172-444745175:444745177:444745179-444745182,
b2f70e07-105a-ee1a-7055-403e038ce0e3:12012324-12012348,
dc877bce-39ae-ee1a-48b9-a8b5119100c4:13961862-13961885
            Executed_Gtid_Set: 099edaa6-1b6f-11e6-85c0-0283b37476a9:1-1380,
1581d774-c657-11e5-9ade-0213de940343:1-2573,
1f942fdc-c651-11e5-9ab7-02e93de56e4d:1-3973,
224994b6-5c79-ee1a-5994-15e4bc903527:1-15726664,
24198008-b330-ee1a-6ab1-41524bb53d9b:1-19337,
2dbb1d70-e1fa-11e5-8f16-0afb93aba71d:1-192,
4944e215-9103-11e5-bf20-06a52b8e9e6b:1-556,
49be0d8e-efa5-11e5-a836-066666e70ad5:1-542,
4c45bd2e-fbb9-11e5-b6f9-062e0ddc433b:1-671,
6f41a971-b3ec-ee1a-6c3a-bc2ef8d71400:1-444745209,
7293b275-0240-11e6-a18a-06371f3aa573:1-2790,
7f511eeb-a389-11e5-b7ec-06fa8452131f:1-2859,
850ddac5-a378-11e5-b77d-06a52b8e9e6b:1:553-559,
905a4dec-4cbc-11e5-81e9-06a52b8e9e6b:1-7:1353-1406:1408-1519,
a1d04ccd-a306-11e5-b496-0e02031c7b97:1-1585:1587-8934,
a5c78a06-e1fb-11e5-8f1f-0a4412565503:1-4010,
b2f70e07-105a-ee1a-7055-403e038ce0e3:1-12012348,
b51b186b-015b-11e6-9bb7-06a2eba86f2d:1-22,
c2222a4e-4cc2-11e5-8212-06b602665ba5:1-7,
c8055c89-e9fd-11e5-8358-066666e70ad5:1-167,
d82c990b-90ed-11e5-be94-06b602665ba5:1-6221,
d8f4b274-3261-11e4-90ba-22000a9bc2b7:1-1788,
da82482d-a386-11e5-b7da-06a52b8e9e6b:1-2887,
dc1c8b79-57a7-11e5-891d-062e0ddc433b:1-3667,
dc877bce-39ae-ee1a-48b9-a8b5119100c4:1-13961891,
e3320492-34be-11e4-a023-22000a94905e:1-319311492,
e7e73333-530b-11e5-ab0e-0e02031c7b97:1-1790
                Auto_Position: 1
1 row in set (0.00 sec)
[22 Jun 2018 17:22] MySQL Verification Team
Hi,

I cannot reproduce this! But looks like it might not be a bug after all if you actually did purge the logs. Anyhow no matter how I try I can't get this to happen on a normal master master

all best
Bogdan
[14 Sep 2020 17:22] Derek Downey
I was able to reproduce this in 5.7.30 using gtid replication.

mysql> SELECT @@version;
+-----------+
| @@version |
+-----------+
| 5.7.30    |
+-----------+
1 row in set (0.00 sec)

mysql> pager grep -E 'Seconds_Behind_Master|Slave_IO_Running|Slave_SQL_Running'
PAGER set to 'grep -E 'Seconds_Behind_Master|Slave_IO_Running|Slave_SQL_Running''
mysql> SHOW SLAVE STATUS\G
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
1 row in set (0.00 sec)

mysql> STOP SLAVE;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW SLAVE STATUS\G
             Slave_IO_Running: No
            Slave_SQL_Running: No
        Seconds_Behind_Master: NULL
      Slave_SQL_Running_State:
1 row in set (0.00 sec)

mysql> START SLAVE sql_thread;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW SLAVE STATUS\G
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
1 row in set (0.01 sec)

Steps to reproduce:
 - load docker-compose file (attached later)
 - configure replication
CHANGE MASTER TO
       MASTER_HOST = 'sqlthread_primary',
       MASTER_PORT = 3306,
       MASTER_USER = 'repl',
       MASTER_PASSWORD = 'replPass',
       MASTER_AUTO_POSITION = 1;
 - only start sql_thread:
START SLAVE sql_thread;
[14 Sep 2020 17:23] Derek Downey
Docker-compose file

Attachment: docker-compose.yml (application/x-yaml, text), 566 bytes.

[14 Sep 2020 17:23] Derek Downey
Example my.cnf

Attachment: my.cnf (application/octet-stream, text), 62 bytes.

[14 Sep 2020 21:21] MySQL Verification Team
Hi,

Sorry for missunderstanding, I assumed you are seeing value for SBH to be >0 vs NULL but you consider value of 0 to be a bug? Why do you think this is a bug?

Thanks
Bogdan
[14 Sep 2020 21:40] Derek Downey
Hi Bogdan, yes I do consider showing a value of 0 for Seconds_behind_master when the Slave_IO_Thread is not running to be a bug.

Basically, it should be NULL because it's unknown what the delay is if the I/O thread is not running.

I will say that the 5.7 documentation is misleading, as it says two different things. https://dev.mysql.com/doc/refman/5.7/en/show-slave-status.html

First: 
 "When no event is currently being processed on the replica, this value is 0."

A few paragraphs later:
 "In MySQL 5.7, this field is NULL (undefined or unknown) if the replica SQL thread is not running, or if the SQL thread has consumed all of the relay log and the replica I/O thread is not running."

I consider the second description to be the correct behavior.

Regards,
Derek

From MySQL 5.7 documentation:
[14 Sep 2020 21:49] MySQL Verification Team
Hi,

I agree! Thanks.

Verified as described.

all best
Bogdan
[9 Feb 2023 17:39] Femi Agbabiaka
log of text replicating the bug

Attachment: replication-bug.txt (text/plain), 12.39 KiB.

[9 Feb 2023 17:41] Femi Agbabiaka
Hello, I've just attached a log where I replicate this bug. You don't need to be using master-master replication, master-slave will do. When you set master_host to a non-routable address and start slave, Seconds_Behind_Master is set to 0 and never updated, even when the IO thread encounters connection errors.