Bug #87755 mysqlrouter "(HY000/2003): Can't connect to remote MySQL server for client"
Submitted: 14 Sep 2017 10:31 Modified: 6 Dec 2018 7:30
Reporter: Christian Tölle Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Router Severity:S2 (Serious)
Version:2.1.3 OS:Ubuntu
Assigned to: MySQL Verification Team CPU Architecture:Any
Tags: mysqlrouter

[14 Sep 2017 10:31] Christian Tölle
Description:
we had setup mysqlrouter as middelware to mysqlcluster.

sometimes we get the "(HY000/2003): Can't connect to remote MySQL server for client [ip]" error. If it occours, we must restart mysqlrouter to get it to work again.

How to repeat:
dont know how to repeat.
[14 Sep 2017 10:43] Christian Tölle
the config looks like this

[DEFAULT]
logging_folder = /var/log/mysqlrouter/
plugin_folder = /usr/lib/x86_64-linux-gnu/mysqlrouter
runtime_folder = /var/run/mysqlrouter
config_folder = /etc/mysqlrouter

[logger]
level = DEBUG

[routing:basic_failover]
bind_address = 192.168.0.41:3306
mode = read-write
destinations = 192.168.0.60:3306,192.168.0.61:3306
protocol=classic
[15 Sep 2017 2:13] MySQL Verification Team
Hi,

I cannot reproduce this, can you try to give us more details about the problem. Did you notice any rules when it stops working? Server load? RAM usage? Time of the day, anything in logs?

All best
Bogdan
[15 Sep 2017 8:52] Christian Tölle
from the last downtime I cant find anything in the mysqlrouter logs. After that the logfile was emtpy. Only the re(start) of the service was logged.

RAM Usage: round about 10%
CPU Load: round about 1.5 (on a 6 core(12 HT) Maschine)

Uptime of the service was round about 30 days.
We now had running mysqlrouter(before we used mysqlproxy) since 45 days and have two servers which connect (from php) trough mysqlproxy to the mysqld's.
After 29 days (05/09/2017 15:35 ) the first mysqlrouter had that problem, and one day after(06/09/2017 10:59 ) the second. 
Since the restart it had not occoured again.

Now i have changed the error log level to DEGBUG. Maybe, if one of the services crash again, I can come up with more infos from the logs..
[15 Sep 2017 9:02] Christian Tölle
the two mysqlrouter services are each on a seperate server.
[6 Nov 2017 10:21] Ralf Vogel
we hit the same issue - we'll try to get usefull debug is test system ...
[31 Jul 2018 7:05] Christian Tölle
the issue occurred again. Restart had fix it again.
But now we've saved the log file before restart.

2018-02-23 09:06:40 INFO    [7fceaab5d700] [routing:basic_failover] started: listening on 192.168.0.40:3306; read-write
2018-03-08 15:50:11 WARNING [7fce997fa700] Timeout reached trying to connect to MySQL Server 192.168.0.60:3306
2018-07-30 13:06:12 WARNING [7fcea935a700] Timeout reached trying to connect to MySQL Server 192.168.0.61:3306
2018-07-30 13:06:12 WARNING [7fcea935a700] [routing:basic_failover] Can't connect to remote MySQL server for client '192.168.0.40:3306'
2018-07-30 13:06:12 WARNING [7fcea8b59700] [routing:basic_failover] Can't connect to remote MySQL server for client '192.168.0.40:3306'
.
.
.

the last message (routing:basic_failover) fills the file every time a connection attempt was made.
The first warning on 2018-03-08 wasn't a problem, but on 2018-07-30 the database (over mysqlrouter) stopped working.

It seems, that the router  doesn't reconnect to the available servers. So that after the second warning, no available servers are left.
The 192.168.0.60 and 192.168.0.61 wasn't down at any time of the period since mysqlrouter starts.
[31 Jul 2018 7:38] Christian Tölle
i've currently checked the network for outtake or overload for the timeout on 08-30-2018. 

No outtake nor a overload(enough ressouces on both, client and server, available) was the problem for the timeout.

Also the mysql-server errorlog doesn't show a connection error at that time.
[7 Aug 2018 10:40] MySQL Verification Team
Hi,

> 2018-07-30 13:06:12 WARNING [7fcea935a700] Timeout reached trying to connect to MySQL Server 192.168.0.61:3306

this does not give us very much to go on :(

if you try manually to connect to mysql when this happens? you can connect or ?

all best
Bogdan
[14 Aug 2018 18:22] Christian Tölle
Hi,

"if you try manually to connect to mysql when this happens? you can connect or ?"

Yes, it works and directly after restart of mysqlrouter, mysqlrouter can connect too.

all best
Chris
[15 Aug 2018 8:20] Christian Tölle
The time between connection loss of first (2018-03-08 15:50:11) and second (2018-07-30 13:06:12) is big enaught to do a manualy restart. 

now we continious check the logfile for the "Timeout reached trying to connect to MySQL Server" String to know when a restart is needed.

bad, that mysqlrouter doesn't support a gracefull-restart. So we must take care of, that there are no running mysql connections at the moment of restart.
[15 Aug 2018 8:35] Christian Tölle
2018-03-08 15:50:11 WARNING [7fce997fa700] Timeout reached trying to connect to MySQL Server 192.168.0.60:3306
2018-07-30 13:06:12 WARNING [7fcea935a700] Timeout reached trying to connect to MySQL Server 192.168.0.61:3306

The main question seems, why after the first timeout(don't even know why it had occurred) on 2018-03-08 mysqlrouter doesn't try to reconnect to 192.168.0.60. It was available all the time.  
Then few months after, the 192.168.0.61 get also a timeout and that stopped the service completly.
[11 Oct 2018 8:49] Christian Tölle
The new parameter routing_strategy in v8 should fix this problem.
I will give it a try and then also post's an updates, if its working.
[3 Dec 2018 8:10] sane Gao
I also met the same problem.
Is it okay now?
[3 Dec 2018 21:08] MySQL Verification Team
Hi,

The routing_strategy might hide the problem but question is still why in your case mysql router is not reconnecting. Anyhow please try with 8.0.4+ and let us know if you can reproduce the behavior. Note that you can't use routing_strategy with mode!

thanks
Bogdan
[4 Dec 2018 1:10] sane Gao
Ok, I have upgraded to 8.0.13.
Routing_strategy parameters, I configured as first-available.
It has been running about one day, and it has not appeared again.
I will follow the feedback.
Thank you.
[6 Dec 2018 7:30] Christian Tölle
yes, I can say now, that (at least  MySQL Router v8.0.12) definitely does the expected reconnect. 

yesterday, the following timeouts had occured but the service is still working. (without a restart like in older versions)

2018-10-11 11:40:07 routing INFO [7f5570cb4700] [routing:basic_failover] started: listening on 192.168.0.41:3306
2018-12-05 13:20:02 routing WARNING [7f5570cb4700] Timeout reached trying to connect to MySQL Server 192.168.0.60:3306: Connection timed out
2018-12-05 15:13:50 routing WARNING [7f5570cb4700] Timeout reached trying to connect to MySQL Server 192.168.0.61:3306: Connection timed out

But anyhow, it`s still intersting, why the timeouts will occur. No network outtakes/overload and nothing in the mysqld logs.
[6 Dec 2018 7:50] sane Gao
Congratulations.
From 12.3 to the present, everything is running well.