MySQL Bugs: #35086: connection is not retried with another backend sometimes

Bug #35086	connection is not retried with another backend sometimes
Submitted:	5 Mar 2008 17:16	Modified:	3 Jun 2009 19:23
Reporter:	Alexey Kachalov	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Proxy: Core	Severity:	S3 (Non-critical)
Version:	0.6.1	OS:	FreeBSD (FreeBSD 6.2 amd64)
Assigned to:		CPU Architecture:	Any
Tags:	Contribution

Description:
It seems to me that mysql-proxy resets the state of an unavailable backend server to UP every 10 seconds. If the next connection to that backend server fails, the backend is marked as down again for the next 10 seconds.

Suppose the backend was just marked as UP after 10 seconds of unavailability.
Then the next connection is attempted.
If the connection failure is detected at network-mysqld-proxy.c:3855 (v0.6.1), the connection is retried with another backend server.
But if it is detected at network-mysqld-proxy.c:3707, the connection is not retried and a failure is reported to the client.

I think mysql-proxy should retry the connection even when the failure is detected on line 3707.
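
To make the expected behaviour concrete, here is a minimal standalone sketch of the retry logic described above. It is not mysql-proxy code: every type and function name in it (struct backend, backend_refresh, backend_try_connect, connect_any) is hypothetical, the `reachable` flag stands in for a real connect() result, and the 10-second interval is simply the value observed in this report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is NOT mysql-proxy's real API. */
enum backend_state { BACKEND_UP, BACKEND_DOWN };
enum connect_result { CONNECT_OK, CONNECT_RETRY, CONNECT_FAILED };

#define RETRY_INTERVAL_SECS 10 /* interval observed in the report (assumption) */

struct backend {
    enum backend_state state;
    long state_since; /* seconds; when the state last changed */
    int reachable;    /* stand-in for the outcome of a real connect() */
};

/* A DOWN backend becomes eligible again once the interval has elapsed. */
static void backend_refresh(struct backend *b, long now)
{
    if (b->state == BACKEND_DOWN &&
        now - b->state_since >= RETRY_INTERVAL_SECS) {
        b->state = BACKEND_UP;
        b->state_since = now;
    }
}

/*
 * Handle one connect attempt. The first failure on a backend that is
 * not yet DOWN marks it DOWN and asks the caller to retry elsewhere,
 * regardless of where in the code the failure was detected.
 */
static enum connect_result backend_try_connect(struct backend *b, long now)
{
    if (b->reachable)
        return CONNECT_OK;

    if (b->state != BACKEND_DOWN) {
        b->state = BACKEND_DOWN;
        b->state_since = now;
        return CONNECT_RETRY;
    }
    return CONNECT_FAILED;
}

/*
 * Try backends in fixed order. Returns the index of the backend that
 * accepted the connection, or -1 if none did.
 */
static int connect_any(struct backend *backends, size_t n, long now)
{
    assert(backends != NULL && n > 0);

    for (size_t i = 0; i < n; i++) {
        backend_refresh(&backends[i], now);
        if (backends[i].state == BACKEND_DOWN)
            continue; /* still inside its down interval */
        if (backend_try_connect(&backends[i], now) == CONNECT_OK)
            return (int)i;
        /* CONNECT_RETRY or CONNECT_FAILED: fall through to the next one */
    }
    return -1;
}
```

With two backends where only the second is reachable, connect_any() should return index 1 on every call, including the call that lands right after the first backend has been flipped back to UP. That is the case where 0.6.1 reports a failure to the client instead.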

How to repeat:
Run mysql-proxy with two backends: one available, and one nonexistent or unavailable.

/usr/local/sbin/mysql-proxy --admin-address=:4046 --proxy-address=:4045 --proxy-backend-addresses=:3307 --proxy-backend-addresses=:3306

Then try to connect to the proxy port :4045 several times in parallel.
Sooner or later you will get a connection failure, even though the second backend at :3306 is available.

This is actually a simplified test case. I ran into this error under different conditions: I wrote a Lua script that tries the backend servers in a fixed order, always starting from the first server (the default round-robin doesn't suit me).

Suggested fix:
--- src/network-mysqld-proxy.c.orig     Fri Feb 29 19:37:57 2008
+++ src/network-mysqld-proxy.c  Sat Dec  8 02:11:10 2007
@@ -3707,6 +3707,20 @@
                        g_critical("%s.%d: connect(%s) failed: %s",
                                        __FILE__, __LINE__,
                                        con->server->addr.str, strerror(so_error));
+
+                       if (st->backend->state != BACKEND_STATE_DOWN) {
+                               g_critical("%s.%d: marking %s as down",
+                                               __FILE__, __LINE__, con->server->addr.str);
+
+                               st->backend->state = BACKEND_STATE_DOWN;
+                               g_get_current_time(&(st->backend->state_since));
+
+                               network_socket_free(con->server);
+                               con->server = NULL;
+
+                               return RET_ERROR_RETRY;
+                       }
+
                        return RET_ERROR;
                }

...synopsis updated

Thank you for the report.

Verified almost as described, except that version 0.6.1 and the latest trunk fail to connect every time, while version 0.6.0 only reports the error sometimes.

I cannot reproduce this using MySQL Proxy 0.7.1 or later.