Bug #73257 | Slave doesn't detect blocking IPTABLES to master server | ||
---|---|---|---|
Submitted: | 10 Jul 2014 11:08 | Modified: | 27 Sep 2016 13:15 |
Reporter: | Shahriyar Rzayev | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S1 (Critical) |
Version: | 5.6.19 | OS: | Linux (CentOS 6.5) |
Assigned to: | | CPU Architecture: | Any |
[10 Jul 2014 11:08]
Shahriyar Rzayev
[10 Jul 2014 12:07]
MySQL Verification Team
Can you set slave_net_timeout lower and re-test? The default is 1 hour. http://dev.mysql.com/doc/refman/5.6/en/replication-options-slave.html#option_mysqld_slave-...
[10 Jul 2014 12:16]
Shahriyar Rzayev
mysql> set @@global.slave_net_timeout = 3;
Query OK, 0 rows affected, 1 warning (0,00 sec)

mysql> show warnings;
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                                         |
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1704 | The requested value for the heartbeat period exceeds the value of `slave_net_timeout' seconds. A sensible value for the period should be less than the timeout. |
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0,00 sec)

After 3 seconds nothing happened:

mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.99
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 4441
Relay_Log_File: mysql-relay-bin.000004
Relay_Log_Pos: 448
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 4441
Relay_Log_Space: 3963
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: 5112ae7f-081a-11e4-bd07-080027efe012
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 5112ae7f-081a-11e4-bd07-080027efe012:1-28
Executed_Gtid_Set: 5112ae7f-081a-11e4-bd07-080027efe012:1-28
Auto_Position: 1
1 row in set (0,00 sec)
[4 Aug 2014 15:34]
Hartmut Holzgraefe
The base problem is this: as long as a client is only reading from a "dead" socket after the connection was already established (the other side lost power, a cable was cut, a routing problem occurred, or a firewall rule was added, as is the case here), it will just wait for incoming traffic, which obviously won't arrive anymore. It will only fail when it runs into the TCP keepalive timeout (defaults to 2 hours).

The "requested value for the heartbeat period exceeds the value of `slave_net_timeout' seconds." warning probably refers to the MASTER_HEARTBEAT_PERIOD setting in CHANGE MASTER. This defaults to half the slave_net_timeout value at the time of the first CHANGE MASTER operation, and needs to be set to a lower value explicitly when changing slave_net_timeout.

So you may want to retry with e.g.:

CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD=3;
SET GLOBAL slave_net_timeout=6;
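For anyone re-testing, the suggested retry can be written out as a fuller session. This is only a sketch under assumptions, not a verified fix: it assumes MySQL 5.6 on the slave, where CHANGE MASTER TO is expected to require the slave threads to be stopped first, and it uses SET GLOBAL because slave_net_timeout has global scope. The two status variables queried at the end are an assumption about what a 5.6 server exposes. The values 3 and 6 are the ones proposed in this thread.

```sql
-- Run on the slave. Sketch only, assuming MySQL 5.6 behaviour.

-- Assumption: 5.6 requires the slave threads to be stopped for CHANGE MASTER TO.
STOP SLAVE;

-- Ask the master for a heartbeat every 3 seconds when it has no events to send.
CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD = 3;

-- Treat the connection as broken after 6 seconds without data or a heartbeat.
-- Kept larger than the heartbeat period so warning 1704 is not raised.
SET GLOBAL slave_net_timeout = 6;

START SLAVE;

-- Check the settings that are now in effect.
SELECT @@global.slave_net_timeout;

-- Assumption: these status variables are available on 5.6.
SHOW GLOBAL STATUS LIKE 'Slave_heartbeat_period';
SHOW GLOBAL STATUS LIKE 'Slave_received_heartbeats';
```

With these settings, once the iptables rule is added on the master the expectation is that the I/O thread stops receiving heartbeats, gives up after about 6 seconds, and SHOW SLAVE STATUS\G then reports a reconnect attempt instead of "Waiting for master to send event". Slave_received_heartbeats increasing while the link is healthy is a quick way to confirm heartbeats are actually arriving.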
[27 Sep 2016 13:15]
Shahriyar Rzayev
Marking as not a bug.