Bug #89740 "Timeout reached trying to connect to MySQL Server" occurs too frequently
Submitted: 21 Feb 2018 3:59 Modified: 26 Feb 2018 22:59
Reporter: tsubasa tanaka (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Router Severity:S2 (Serious)
Version:2.1.5 OS:CentOS (7.2)
Assigned to: CPU Architecture:x86

[21 Feb 2018 3:59] tsubasa tanaka
Description:
# MySQL Router <= 2.1.4 treats `connect_timeout` as second

## Routing module uses `select` with `timeout_val` argument for timeout(struct timeval).

- https://github.com/mysql/mysql-router/blob/release-2.1.4/src/routing/src/routing.cc#L174

## `timeout_val` is set tv_sec = connect_timeout

- https://github.com/mysql/mysql-router/blob/release-2.1.4/src/routing/src/routing.cc#L153-L...

# MySQL Router = 2.1.5 treats `connect_timeout` as **millisecond**

## 2.1.5 uses `connect_non_blocking_wait` new function

- https://github.com/mysql/mysql-router/blob/release-2.1.5/src/routing/src/routing.cc#L203

## `connect_non_blocking_wait` uses `poll` function and its timeout argument is `timeout_ms` (int)

- https://github.com/mysql/mysql-router/blob/release-2.1.5/src/routing/src/routing.cc#L125

## `timeout_ms` is passed `connect_timeout` without any modification

- https://github.com/mysql/mysql-router/blob/release-2.1.5/src/routing/src/routing.cc#L203

### backtrace

```
(gdb) bt
+bt
#0  routing::SocketOperations::connect_non_blocking_wait (
    this=0x7fe2232b37d8 <routing::SocketOperations::instance()::instance_>, sock=6, timeout_ms=1)
    at /usr/src/debug/mysql-router-2.1.5/src/routing/src/routing.cc:120
#1  0x00007fe2230a87c0 in routing::SocketOperations::get_mysql_socket (
    this=0x7fe2232b37d8 <routing::SocketOperations::instance()::instance_>, addr=
      {addr = "mytest.cvmhu33tf2s1.us-east-1.rds.amazonaws.com", port = 3306, ip_family_ = mysqlrouter::TCPAddress::IPV4},
    connect_timeout=connect_timeout@entry=1, log=log@entry=true)
    at /usr/src/debug/mysql-router-2.1.5/src/routing/src/routing.cc:203
#2  0x00007fe2230a3d37 in RouteDestination::get_mysql_socket (this=this@entry=0x7fe21c000dc0, addr=
      @0x7fe213ffec20: {addr = "mytest.cvmhu33tf2s1.us-east-1.rds.amazonaws.com", port = 3306, ip_family_ = mysqlrouter::TCPAddress::IPV4}, connect_timeout=connect_timeout@entry=1, log_errors=log_errors@entry=true)
    at /usr/src/debug/mysql-router-2.1.5/src/routing/src/destination.cc:162
#3  0x00007fe2230a7cd4 in DestFirstAvailable::get_server_socket (this=0x7fe21c000dc0, connect_timeout=1,
    error=0x7fe213ffecc8) at /usr/src/debug/mysql-router-2.1.5/src/routing/src/dest_first_available.cc:43
#4  0x00007fe2230a09df in MySQLRouting::routing_select_thread (this=0x7fe221c13c20, client=5)
    at /usr/src/debug/mysql-router-2.1.5/src/routing/src/mysql_routing.cc:217
#5  0x00007fe2242612b0 in std::(anonymous namespace)::execute_native_thread_routine (__p=<optimized out>)
    at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
#6  0x00007fe2244bbe25 in start_thread (arg=0x7fe213fff700) at pthread_create.c:308
#7  0x00007fe2239c934d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
```

After that, MySQL Router 2.1.5 fails to connect destination server by timeout over the Internet.

How to repeat:
1. Add routing into /etc/mysqlrouter/mysqlrouter.conf, `destinations` is over the Internet.

```
[routing:test]
bind_port= 13306
mode= read-write
destinations= xxxx:3306
```

2. Start MySQL Router and try to connect by mysql command-line client.

```
$ sudo systemctl start mysqlrouter
$ mysql -h127.0.0.1 -P13306
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 0
```

3. See /var/log/mysqlrouter/mysqlrouter.log

```
$ sudo less /var/log/mysqlrouter/mysqlrouter.log
2018-02-21 12:27:25 INFO    [7fe221c14700] [routing:test] started: listening on 127.0.0.1:13306; read-write
2018-02-21 12:27:39 WARNING [7fe213fff700] Timeout reached trying to connect to MySQL Server xxxx:3306: Connection timed out
2018-02-21 12:27:48 INFO    [7fe213fff700] [routing:test] fd=5 Pre-auth socket failure 127.0.0.1: client auth timed out
```

Suggested fix:
Get back `connect_timeout` as second value.
This means connect_non_blocking_wait function should receive 2nd argument(int timeout_ms) as second value(int timeout_sec), and pass `timeout_sec * 1000` to `poll` function.

I think this is **NOT Documentation Bug**.
This is too big imcompatibillity during 1 GA series minor upgrade.
[21 Feb 2018 10:22] tsubasa tanaka
Change synopsis, old one is "MySQL Router 2.1.5 treats connect_timeout as millisecond value"
[21 Feb 2018 10:24] tsubasa tanaka
This change has been introduced by https://github.com/mysql/mysql-router/commit/276abd2aad425e10bdac69b88e78469909417c12
[26 Feb 2018 8:15] Bogdan Kecman
Hi,

I agree this is a bug. Easily "workaroundable", but yes, that is not acceptable change for minor version upgrade. 

thanks for report
Bogdan
[26 Feb 2018 22:59] Philip Olson
Posted by developer:
 
Fixed as of the upcoming MySQL Router 2.1.6 release, and here's the changelog entry:

Router's connect_timeout configuration option was measured in milliseconds
instead of seconds in v2.1.5.

Thank you for the detailed bug report.