Bug #87272 MySQL router doesn't send connection errors to read-write client
Submitted: 1 Aug 2017 13:19 Modified: 4 Aug 2017 9:52
Reporter: Nikita B Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Router Severity:S3 (Non-critical)
Version:2.1.4 OS:CentOS
Assigned to: MySQL Verification Team CPU Architecture:Any

[1 Aug 2017 13:19] Nikita B
Description:
Connected clients doesn't get any errors on primary node change event, hanging indefinitely.

How to repeat:
Create a cluster with 3 nodes, start mysql router, connect to R/W instance via socket or port, run a write query, shutdown R/W node during query execution.

Expected behaviour: 
    Client gets a connection error, reconnection is possible.
Real behaviour: 
    Client hangs, no errors for about 10-15 minutes, new clients can connect to new R/W instance.

Cluster status:
    mysql-js> var c = dba.getCluster('devCluster')
    mysql-js> c.status()
    {
        "clusterName": "devCluster", 
        "defaultReplicaSet": {
            "name": "default", 
            "primary": "192.168.122.12:3306", 
            "status": "OK", 
            "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
            "topology": {
                "192.168.122.11:3306": {
                    "address": "192.168.122.11:3306", 
                    "mode": "R/O", 
                    "readReplicas": {}, 
                    "role": "HA", 
                    "status": "ONLINE"
                }, 
                "192.168.122.12:3306": {
                    "address": "192.168.122.12:3306", 
                    "mode": "R/W", 
                    "readReplicas": {}, 
                    "role": "HA", 
                    "status": "ONLINE"
                }, 
                "192.168.122.13:3306": {
                    "address": "192.168.122.13:3306", 
                    "mode": "R/O", 
                    "readReplicas": {}, 
                    "role": "HA", 
                    "status": "ONLINE"
                }
            }
        }
    }

MySQL router log, address of R/W instance correctly changed (192.168.122.12->192.168.122.13):

    18:45:01 DEBUG    Replicaset 'default' has 3 members in metadata, 2 in status table
    18:45:01 WARNING  Member 192.168.122.12:3306 (33a131ee-7087-11e7-bbbb-5254005ea589) defined in metadata n actual replicaset
    18:45:01 DEBUG    End updating replicaset for 'default'
    18:45:01 INFO     Changes detected in cluster 'devCluster' after metadata refresh
    18:45:01 INFO     Metadata for cluster 'devCluster' has 1 replicasets:
    18:45:01 INFO     'default' (3 members, single-master)
    18:45:01 INFO         192.168.122.11:3306 / 33060 - role=HA mode=RO
    18:45:01 INFO         192.168.122.12:3306 / 33060 - role=HA mode=n/a
    18:45:01 INFO         192.168.122.13:3306 / 33060 - role=HA mode=RW
    18:45:06 INFO     Connected with metadata server running on 192.168.122.11:3306
    18:45:06 DEBUG    Updating metadata information for cluster 'devCluster'
    18:45:06 DEBUG    Updating replicaset status from GR for 'default'
    18:45:06 DEBUG    Replicaset 'default' has 3 members in metadata, 2 in status table

Clients are connecting via MySQLDB python module and mysql client:

    mysql -u root --socket /tmp/myrouter/mysql.sock -p --reconnect

    db = MySQLdb.connect(
        unix_socket="/tmp/myrouter/mysql.sock",
    #   host='127.0.0.1',
    #   port=6446,
        user="root",
        passwd="123",
        db="test",
        connect_timeout = 6)

MySQL router config:

    [DEFAULT]
    logging_folder=/tmp/myrouter/log
    runtime_folder=/tmp/myrouter/run
    data_folder=/tmp/myrouter/data
    keyring_path=/tmp/myrouter/data/keyring
    master_key_path=/tmp/myrouter/mysqlrouter.key
    
    [logger]
    level = DEBUG
    
    [metadata_cache:devCluster]
    router_id=1
    bootstrap_server_addresses=mysql://192.168.122.11:3306,mysql://192.168.122.12:3306,mysql://192.168.122.13:3306
    user=mysql_router1_3bocarc7njs2
    metadata_cluster=devCluster
    ttl=5
    
    [routing:devCluster_default_rw]
    #bind_address=0.0.0.0
    #bind_port=6446
    socket=/tmp/myrouter/mysql.sock
    destinations=metadata-cache://devCluster/default?role=PRIMARY
    mode=read-write
    protocol=classic
    max_connect_errors = 1
    destination_connect_timeout = 2
    client_connect_timeout = 2

    [routing:devCluster_default_ro]
    bind_address=0.0.0.0
    bind_port=6447
    socket=/tmp/myrouter/mysqlro.sock
    destinations=metadata-cache://devCluster/default?role=SECONDARY
    mode=read-only
    protocol=classic
    
    [routing:devCluster_default_x_rw]
    bind_address=0.0.0.0
    bind_port=64460
    socket=/tmp/myrouter/mysqlx.sock
    destinations=metadata-cache://devCluster/default?role=PRIMARY
    mode=read-write
    protocol=x
    
    [routing:devCluster_default_x_ro]
    bind_address=0.0.0.0
    bind_port=64470
    socket=/tmp/myrouter/mysqlxro.sock
    destinations=metadata-cache://devCluster/default?role=SECONDARY
    mode=read-only
    protocol=x

Software versions:

    CentOS Linux release 7.3.1611 (Core) 

    Router:
    mysqlrouter -v
    MySQL Router v2.1.4 on Linux (64-bit) (GPL community edition)

    Cluster:
    mysql -v
    Server version: 5.7.19-log MySQL Community Server (GPL)

Suggested fix:
It seems that similar bug was already referenced in https://bugs.mysql.com/bug.php?id=81548 ( https://github.com/mysql/mysql-router/pull/1 ).
According to commit description ( https://github.com/mysql/mysql-router/commit/dd0e8d2f9b041fd3d5f9cc52753e5aeec0c4410b ), fix was supposed to be applied in commit https://github.com/mysql/mysql-router/commit/cd5dc48b7bc67b14d64b923f43b1b081968fe73c , but it wasn't.
[4 Aug 2017 9:52] MySQL Verification Team
Hi,

Thanks for you report, bug is a duplicate of https://bugs.mysql.com/bug.php?id=81548

All best
Bogdan