Bug #50105 ReplicationDriver hangs if master are unreachable
Submitted: 6 Jan 2010 14:45
Modified: 5 Nov 2013 23:23
Reporter: Luis Figueiredo
Status: Closed
Category: Connector/J
Severity: S1 (Critical)
Version: trunk
OS: Linux
Assigned to: Assigned Account
CPU Architecture: Any
Tags: ReplicationDriver Master

[6 Jan 2010 14:45] Luis Figueiredo
Description:
This problem was already reported in http://bugs.mysql.com/bug.php?id=46644, but it has not been fixed.

In my configuration I have the same problem: if the master is down, it is impossible to open a new connection to the slaves.

No exception is thrown; the connection attempt blocks and never recovers. Existing connections are not affected, but new connections to the slaves cannot be established.

Thanks

How to repeat:
Set up one master and one slave, use the sample code for the ReplicationDriver, and stop the master. It is now impossible to continue with the one slave for read-only operations.
[1 Feb 2010 13:42] Tonci Grgin
Luis, yes, it appears to be the same problem. Let's continue in this report.
[1 Feb 2010 14:30] Luis Figueiredo
OK, I am available to test once the bug is fixed.

Thanks Tonci
[1 Feb 2010 14:32] Tonci Grgin
Luis, checking on Bug#46644 I've noticed that the reporter was using the replication driver, but the connection string did not start with "jdbc:mysql:replication://". Could this be the case here too? Also, are you using socket timeouts? If not, then it's impossible for us to detect a bad master in due time...
[1 Feb 2010 14:35] Luis Figueiredo
Can you send me the correct timeout and the correct connection string? I can run the test again with both.

Thanks
[1 Feb 2010 14:38] Luis Figueiredo
The connection string is correct; I just need the recommended timeout to test.

Thanks,
Luis
[1 Feb 2010 14:49] Tonci Grgin
Luis, no idea... Make it, maybe, up to 30 seconds. On a good network, this should be more than enough.
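
Note that Connector/J reads connectTimeout and socketTimeout in milliseconds, so a 30-second limit has to be written as 30000. A minimal sketch of setting both from a value given in seconds follows; the class and method names (TimeoutProps, withTimeouts) are made up for illustration and are not part of the driver.

```java
import java.util.Properties;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Connector/J.
class TimeoutProps {
    // Returns properties with connectTimeout and socketTimeout set from a
    // value in seconds; Connector/J expects both in milliseconds.
    static Properties withTimeouts(long seconds) {
        Properties props = new Properties();
        String millis = String.valueOf(TimeUnit.SECONDS.toMillis(seconds));
        props.setProperty("connectTimeout", millis);
        props.setProperty("socketTimeout", millis);
        return props;
    }

    public static void main(String[] args) {
        Properties props = withTimeouts(30);
        System.out.println(props.getProperty("connectTimeout")); // prints 30000
    }
}
```

The returned Properties object can be extended with user, password and the other options before it is passed to the driver.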
[1 Feb 2010 14:52] Luis Figueiredo
import java.sql.Connection;
import java.util.Properties;
import com.mysql.jdbc.ReplicationDriver;

        ReplicationDriver driver = new ReplicationDriver();

        Properties props = new Properties();
        props.put("autoReconnect", "true");
        props.put("roundRobinLoadBalance", "true");
        props.put("failOverReadOnly", "false");
        props.put("user", "luis");
        props.put("password", "password");
        // Connector/J reads both timeouts in milliseconds, so "5" means 5 ms
        props.put("connectTimeout", "5");
        props.put("socketTimeout", "5");

        Connection conn = null;
        try {
            // First host is the master, the remaining hosts are slaves
            conn = driver.connect("jdbc:mysql:replication://master,slave1/test", props);
        } catch (NullPointerException ex) {
            System.out.println(driver.toString());
        }

Result: the connection blocks if the master is down.
[1 Feb 2010 14:56] Tonci Grgin
Luis, I'll have to come up with a proper test case for this; there seems to be a genuine problem described here.
[2 Feb 2010 9:10] Tonci Grgin
Environment:
  o JDK 1.5.0_17 on Win2K8SE x64, running against latest c/J 5.1 trunk.
  o Remote MySQL server 5.1.31-log running on OpenSolarisx64 host on port 3306
  o Master: opensol:3308 (nothing's listening there), slave opensol:3306 (up)

Shortened test case:
	ReplicationDriver driver = new ReplicationDriver();

	Properties props = new Properties();
	props.put("autoReconnect", "true");
	props.put("roundRobinLoadBalance", "true");
	props.put("failOverReadOnly", "false");
	props.put("user", "usrName");
	props.put("password", "PWD");
	props.put("connectTimeout","5");
	props.put("socketTimeout","5");
	props.put("maxReconnects", "2");
	props.put("traceProtocol","true");
	//props.put("secondsBeforeRetryMaster","120");
	Connection conn = driver.connect("jdbc:mysql:replication://opensol:3308,opensol:3306/test", props);

Observations:
Due to a java.lang.ArrayIndexOutOfBoundsException: -1 at ConnectionImpl.java, line 2308:
	String newHostPortPair = (String) this.hostList.get(hostIndex);
newHostPortPair is always masterIP:masterPort.

This is related to line 2390:
	// Check next host, it might be up...
	if (getRoundRobinLoadBalance()) {
		hostIndex = getNextRoundRobinHostIndex(getURL(), this.hostList) - 1 /* incremented by for loop next time around */;

which is in accordance with Cristina's findings in Bug#46644.
With roundRobinLoadBalance removed:
  o with props.put("autoReconnect", "false"); the test exits immediately.
  o with props.put("autoReconnect", "true"); the test exits immediately, as it does not try the slave at all.

With
	props.put("autoReconnect", "false");
	props.put("roundRobinLoadBalance", "true");
the slave is connected, but there are problems here too when trying to reconnect to the master, which is still down.
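
The index arithmetic in the line 2390 excerpt can be looked at in isolation. The sketch below is not the driver's code: the argument of adjust stands in for whatever getNextRoundRobinHostIndex returns, and the idea that it can return 0 is an assumption made only for this illustration. It shows that subtracting 1 from an index of 0 gives -1, which fails if it is used for a lookup before the loop increments it, and which points back at host 0 (the master) once the loop does increment it.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only; this is not code from ConnectionImpl.
class RoundRobinIndexSketch {
    // Mirrors the "- 1" adjustment from the excerpt. The argument is a
    // stand-in for the value returned by getNextRoundRobinHostIndex.
    static int adjust(int nextRoundRobinIndex) {
        return nextRoundRobinIndex - 1;
    }

    public static void main(String[] args) {
        List<String> hostList = Arrays.asList("opensol:3308", "opensol:3306");

        int hostIndex = adjust(0); // assumed return value of 0
        System.out.println("adjusted index: " + hostIndex);

        try {
            hostList.get(hostIndex); // lookup before any increment
        } catch (IndexOutOfBoundsException e) {
            System.out.println("lookup failed: " + e);
        }

        // After the loop's increment the index is 0 again: the master.
        System.out.println("after increment: " + hostList.get(hostIndex + 1));
    }
}
```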
[9 Mar 2010 10:13] Luis Figueiredo
Can I know when this bug will be fixed? Is it already scheduled?

Thanks,
Luis Figueiredo
[9 Mar 2010 11:11] Tonci Grgin
Luis, I don't really know, but it's possible that it's already fixed. Did you try c/J from the snapshots page?
[9 Mar 2010 16:59] Luis Figueiredo
Yes, I have tried with the 20100308 snapshot and I have the same problem.

Thanks
[8 Oct 2010 14:59] Mark Matthews
This functionality was entirely overhauled in 5.1.13. Could you see if this issue still exists in that release (or a recent nightly snapshot)? We have tests for unreachable masters for this new functionality, so things should be working better now.
[9 Nov 2010 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[1 Aug 2011 13:13] dev thoughts
The application does not connect to the slave when the master is down. This issue still exists in version 5.1.17; I also tried 5.1.13.

Please advise.
[16 May 2012 5:04] praveen ingole
The problem still exists: when the master MySQL server is not available, an error is thrown even though the slave is still up.
[16 May 2012 5:14] praveen ingole
I am using Hibernate and I am getting the following problem in version 5.1.13.
I added properties with method names in the transaction proxy (get*, load*) for read-only calls. Even then, the calls do not go to the slaves, and the connection is dropped while the master is down.
[5 Nov 2013 22:32] Todd Farmer
As of MySQL Connector/Java 5.1.27, users may specify allowMasterDownConnections=true to allow connections in replication-aware deployments to be established even when no master hosts are available. Such Connection objects will report that they are read-only, and isMasterConnection() will return false. The Connection will test for available master hosts when Connection.setReadOnly(false) is called, throwing a SQLException if it cannot establish a connection to a master, or switching to a master connection if the host is available.

For more information:

http://mysqlblog.fivefarmers.com/2013/11/04/multi-master-support-in-mysql-connectorjava/
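
A minimal sketch of enabling the property, assuming Connector/J 5.1.27 or later is on the classpath and registered with DriverManager. The class name, host names and credentials are placeholders; only the allowMasterDownConnections property comes from the note above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical example class; host names and credentials are placeholders.
class MasterDownExample {
    static final String URL = "jdbc:mysql:replication://master,slave1/test";

    static Properties buildProps(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Lets the connection be established even if no master is reachable.
        props.setProperty("allowMasterDownConnections", "true");
        return props;
    }

    // Needs the Connector/J jar at run time.
    static Connection open(String user, String password) throws SQLException {
        return DriverManager.getConnection(URL, buildProps(user, password));
    }

    public static void main(String[] args) {
        try (Connection conn = open("luis", "password")) {
            // With the master down, the note above says this reports read-only.
            System.out.println("read-only: " + conn.isReadOnly());
        } catch (SQLException e) {
            System.out.println("connect failed: " + e.getMessage());
        }
    }
}
```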
[5 Nov 2013 22:36] Todd Farmer
Posted by developer:
 
Fixed in 5.1.27 with added support for multiple masters.
[5 Nov 2013 23:23] Daniel So
Added the following entry to the Connector/J 5.1.27 changelog:

In a replication-aware deployment, the replication driver hung when the master was not reachable. As part of the new multiple-master support feature, users can now set the property allowMasterDownConnections=true to allow connections to be established even when no master hosts are available.