Description:
We switched to SSL replication for all our master-slave set up recently and noticed that slave IO stalled constantly. Test case is below:
1. Enabled SSL replication, "show slave status" looks like below:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: lva1-db52.corp.linkedin.com
Master_User: repl_user
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.010795
Read_Master_Log_Pos: 376269715
Relay_Log_File: relay-log.000020
Relay_Log_Pos: 267780355
Relay_Master_Log_File: mysql-bin.010795
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 376269472
Relay_Log_Space: 376270208
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: Yes
Master_SSL_CA_File: /export/home/mysql/.certs/ca.pem
Master_SSL_CA_Path: /export/home/mysql/.certs
Master_SSL_Cert: /export/home/mysql/.certs/client-cert.pem
Master_SSL_Cipher:
Master_SSL_Key: /export/home/mysql/.certs/client-key.pem
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 500520
Master_UUID: 27d85e2b-2dc6-11e4-b2ad-f40f1b1ebd2a
Master_Info_File: /mnt/u001/mysql/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)
Read_Master_Log_Pos is progressing, sometime later, we received below error in the log:
"2018-05-17 14:01:12 29286 [ERROR] Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
2018-05-17 14:01:12 29286 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.010795' at position 898212416
2018-05-17 14:01:12 29286 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2018-05-17 14:03:15 29286 [ERROR] Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
2018-05-17 14:03:15 29286 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.010796' at position 490592399
2018-05-17 14:03:15 29286 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2018-05-17 14:44:57 29286 [ERROR] Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
2018-05-17 14:44:57 29286 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.010797' at position 996393951
2018-05-17 14:44:57 29286 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2018-05-17 14:49:10 29286 [ERROR] Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
2018-05-17 14:49:10 29286 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.010798' at position 704009950
2018-05-17 14:49:10 29286 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2018-05-17 14:54:33 29286 [ERROR] Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
2018-05-17 14:54:33 29286 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.010799' at position 426197684
2018-05-17 14:54:33 29286 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2018-05-17 14:56:28 29286 [ERROR] Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
2018-05-17 14:56:28 29286 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.010799' at position 924245333
2018-05-17 14:56:28 29286 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2018-05-17 14:58:50 29286 [ERROR] Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
2018-05-17 14:58:50 29286 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.010800' at position 527064397
2018-05-17 14:58:50 29286 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information."
The replication works fine(no stall) without SSL.
How to repeat:
stop/start replication, the issue happen again after sometime