| Bug #73257 | Slave doesn't detect blocking IPTABLES to master server | ||
|---|---|---|---|
| Submitted: | 10 Jul 2014 11:08 | Modified: | 27 Sep 2016 13:15 |
| Reporter: | Shahriyar Rzayev | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server: Replication | Severity: | S1 (Critical) |
| Version: | 5.6.19 | OS: | Linux (CentOS 6.5) |
| Assigned to: | CPU Architecture: | Any | |
[10 Jul 2014 11:08]
Shahriyar Rzayev
[10 Jul 2014 12:07]
MySQL Verification Team
Can you set the slave_net_timeout lower and re-test? 1 hour is the default. http://dev.mysql.com/doc/refman/5.6/en/replication-options-slave.html#option_mysqld_slave-...
[10 Jul 2014 12:16]
Shahriyar Rzayev
mysql> set @@global.slave_net_timeout = 3;
Query OK, 0 rows affected, 1 warning (0,00 sec)
mysql> show warnings;
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1704 | The requested value for the heartbeat period exceeds the value of `slave_net_timeout' seconds. A sensible value for the period should be less than the timeout. |
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0,00 sec)
After 3 seconds nothing happened:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.99
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 4441
Relay_Log_File: mysql-relay-bin.000004
Relay_Log_Pos: 448
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 4441
Relay_Log_Space: 3963
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: 5112ae7f-081a-11e4-bd07-080027efe012
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 5112ae7f-081a-11e4-bd07-080027efe012:1-28
Executed_Gtid_Set: 5112ae7f-081a-11e4-bd07-080027efe012:1-28
Auto_Position: 1
1 row in set (0,00 sec)
[4 Aug 2014 15:34]
Hartmut Holzgraefe
The base problem is: as long as a client is only reading from a "dead" socket after connection was already established (other side lost power, cable was cut, routing problem, or firewall rule was added as is the case here) it will just wait for incoming traffic (which obviously won't happen anymore). It will only fail when running into the TCP keepalive timeout (defaults to 2 hours). The "requested value for the heartbeat period exceeds the value of `slave_net_timeout' seconds." warning probably refers to the MASTER_HEARTBEAT_PERIOD setting in CHANGE MASTER, this defaults to half the slave_net_timeout value at the time of the first CHANGE MASTER operation and needs to a lower value explicitly when changing slave_net_timeout. So you may want to retry with e.g.: CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD=3; SET slave_net_timeout=6;
[27 Sep 2016 13:15]
Shahriyar Rzayev
Marking as not a bug.
