Bug #46644 ReplicationDriver hangs if master/slaves are unreachable
Submitted: 11 Aug 2009 8:22 Modified: 23 Jan 2014 15:41
Reporter: Cristina Bulgaru
Status: Duplicate
Category:Connector / J Severity:S2 (Serious)
Version:5.1.7 OS:Any (ReplicationDriver hangs if master/slaves are unreachable)
Assigned to: CPU Architecture:Any

[11 Aug 2009 8:22] Cristina Bulgaru
Description:
Using Connector/J 5.1.7:
1. Master: with the following connection configuration:
   "autoReconnect" = "true";
   "roundRobinLoadBalance" = "true"
   If the master is unreachable (or the connection parameters, such as user or password, are set incorrectly), the driver hangs while getting a connection.
If I remove the property "roundRobinLoadBalance" = "true", the problem goes away and I receive an SQLException (unable to connect), so I can handle the master-down situation.

2. Slaves: with the following connection configuration:
   "autoReconnect" = "true";
I removed the roundRobinLoadBalance=true property from the configuration from the start, because Connector/J 5.1.7 hard-codes roundRobinLoadBalance=true for slaves.

   If all slaves are unreachable (or the connection parameters, such as user or password, are set incorrectly), the driver hangs while getting a connection.
 

I believe that the root cause is the load-balancing mechanism, in the createNewIO method of ConnectionImpl.java:

for (; (hostIndex < this.hostListSize) && !connectionGood; hostIndex++) {
 ....
}

If all the slave hosts are unreachable, the algorithm never exits the for loop over the hosts. The exit condition is never met:
- because hostIndex is regenerated randomly on every pass, and
- because the generated index is decremented in the catch block
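
A minimal sketch of the loop behaviour described above. This is a simplified model and not the actual Connector/J source: the class name, the `tryConnect` stand-in, and the `maxAttempts` safety cap are assumptions added for illustration. It only shows why a randomly chosen index that is decremented on failure can never push `hostIndex` up to `hostListSize`.

```java
import java.util.Random;

public class HostLoopSketch {

    // Hypothetical stand-in for a connection attempt: every host is unreachable.
    static boolean tryConnect(int hostIndex) {
        return false;
    }

    // Returns how many attempts were made before the loop ended.
    // maxAttempts is a safety cap added only so that this sketch terminates.
    static int attemptsBeforeExit(int hostListSize, int maxAttempts, long seed) {
        Random random = new Random(seed);
        boolean connectionGood = false;
        int attempts = 0;

        for (int hostIndex = 0; (hostIndex < hostListSize) && !connectionGood; hostIndex++) {
            if (attempts >= maxAttempts) {
                break; // without this cap the loop would spin forever
            }
            // The index is picked at random on every pass, so it is always in [0, hostListSize - 1].
            hostIndex = random.nextInt(hostListSize);
            attempts++;

            if (tryConnect(hostIndex)) {
                connectionGood = true;
            } else {
                // Models the decrement in the catch block: the loop's own
                // hostIndex++ then brings the index back into range.
                hostIndex--;
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        int attempts = attemptsBeforeExit(2, 1000, 42L);
        System.out.println("Attempts before the safety cap stopped the loop: " + attempts);
    }
}
```

In this model the loop ends only because of the cap. After a failure the index is at most hostListSize - 2, and the increment brings it to at most hostListSize - 1, so the condition `hostIndex < hostListSize` stays true for as long as every host fails.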

How to repeat:
1. Add the following connection properties:
   "autoReconnect" = "true";
   "roundRobinLoadBalance" = "true"
   and set the master to an invalid IP address:
   try {
       ReplicationDriver driver = new ReplicationDriver();
       Properties props = new Properties();
       props.put("autoReconnect", "true");
       // We want to load balance between the slaves
       props.put("roundRobinLoadBalance", "true");
       props.put("user", "user");
       props.put("password", "pwd");
       Connection conn = driver.connect("jdbc:mysql://<wrong IP address>,127.0.0.1:3306/smb", props);
       conn.setReadOnly(false);
       // hangs
   } catch (SQLException ex) {
       ex.printStackTrace();
   }

2. Remove the "roundRobinLoadBalance" = "true" setting and set invalid (unreachable) IP addresses for the slaves:
Connection conn = driver.connect("jdbc:mysql://127.0.0.1:3306, <invalid address>:3306/smb",props);
[11 Aug 2009 13:39] Mark Matthews
For this reason, it's suggested that one does not use "autoReconnect=true" with the replication driver. One does not need "roundRobinLoadBalance=true" either; it's implied (in a slightly different form) by ReplicationDriver's built-in scheme.

Can you try omitting those parameters and see what happens?
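
As a rough sketch of that test, the two options could be stripped from an existing Properties object before it is passed to the driver. The class and method names here are hypothetical, and the connect call is shown only as a comment because it needs the Connector/J jar and a reachable server.

```java
import java.util.Properties;

public class ReplicationPropsSketch {

    // Returns a copy of the given properties without the two options in question.
    static Properties withoutReconnectOptions(Properties original) {
        Properties cleaned = new Properties();
        for (String name : original.stringPropertyNames()) {
            cleaned.setProperty(name, original.getProperty(name));
        }
        cleaned.remove("autoReconnect");
        cleaned.remove("roundRobinLoadBalance");
        return cleaned;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("autoReconnect", "true");
        props.setProperty("roundRobinLoadBalance", "true");
        props.setProperty("user", "user");
        props.setProperty("password", "pwd");

        Properties cleaned = withoutReconnectOptions(props);
        System.out.println(cleaned.stringPropertyNames());

        // With Connector/J on the classpath, the cleaned properties would then
        // be passed to the driver exactly as in the report:
        // Connection conn = new ReplicationDriver().connect(url, cleaned);
    }
}
```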
[11 Sep 2009 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[23 Dec 2009 11:30] Luis Figueiredo
I have the same problem: if the master is down, it is impossible to establish a connection.

All slaves are up, but the master is down. Result: it is impossible to establish a connection, and there are no exceptions or timeouts.

Thanks
[23 Dec 2009 14:22] Luis Figueiredo
I have the same problem with version 5.1.10.

If there is a problem with the master (unreachable), no new connections are possible.
[1 Feb 2010 14:33] Tonci Grgin
Continued in Bug#50105.
[23 Jan 2014 15:41] Alexander Soklakov
Please use the solution described in Bug#50105.