MySQL Bugs: #63353: Problem using jdbc:mysql:replication driver with multiple slaves

Bug #63353	Problem using jdbc:mysql:replication driver with multiple slaves
Submitted:	21 Nov 2011 11:11	Modified:	22 Nov 2011 11:14
Reporter:	Alessandro none	Email Updates:
Status:	Closed	Impact on me:	None
Category:	Connector / J	Severity:	S1 (Critical)
Version:	5.1.18	OS:	Linux
Assigned to:		CPU Architecture:	Any
Tags:	replication, slaves

Description:
When using jdbc:mysql:replication jdbc driver with one master and multiple slaves to load balance read only queries, everything works as expected until one slave goes down. When that happens, jdbc will issue queries at a really slow rate (like one query per second) that makes access to the db practically impossible.
When the slave comes up again, everything start working again.

How to repeat:
Use the following snippet of code to test read only query balancing between slave nodes.
The code issues a "Select" instruction followed by a "commit" to force jdbc to rebalance connections.
We found this to be the only way to have instructions balanced. If there are others, please advise.

While the code is running, stop one slave. Program will almost hang. Then bring the slave back up again.

public class TestMysql {
    final static int POOLSIZE= 1;
   
    public static void main(String[] args) {
        Properties props = new Properties();
        Connection conn = null;
        Statement stmt = null;
        ResultSet rs = null;
       
         Connection[] connections = new Connection[POOLSIZE];
       
        try {
            // Defining props
            props.put("user", "test");
            props.put("password", "testpwd");
           
            // Open the connection
            String url = "jdbc:mysql:replication://master,slave1,slave2/testdb";           
            Class.forName ("com.mysql.jdbc.ReplicationDriver").newInstance();
            for (int i =0; i<POOLSIZE; i++) {
                Connection con = DriverManager.getConnection (url, props);
                con.setAutoCommit(false);
                con.setReadOnly(true);
                connections[i] = con;
            }
            // Test nodes balancing
            for (int i = 0; i<100000; i++) {
                try {
                     conn = connections[i%POOLSIZE];
                     stmt = conn.createStatement();
                     rs = stmt.executeQuery("select * from tabella");
                     while(rs.next()) {
                        System.out.println(rs.getString("campo"));
                    }
                    conn.commit();
                   
                    try {if(rs != null) {rs.close();} } catch (Exception ex) {}
                    try {if(stmt != null) {stmt.close(); } } catch (Exception ex) {}
                }
                catch(Exception e) {
                    e.printStackTrace();                   
                }
            }
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        finally {
            for(int i=0;i<poolsize;i++) {
                try {
                    connections[i].close();
                }
                catch (SQLException ex) {
                    ex.printStackTrace();
                }
            }
         }
    }
}

The driver doesn't enable the blacklist by default. Could you see if adding "loadBalanceBlacklistTimeout=x" where X is the number of milliseconds a host should be left for dead fixes this for you? We may want to consider picking a default value and using that instead as it would be a more expected mode of operation.

That's it. Setting loadBalanceBlacklistTimeout=5000 fixed the setup for us.

Thanks a lot.

Another reason for your test to seem slow or hanging could be the default value of 'retriesAllDown' parameter. By default it is 120 and the driver could be trying 120 times to reconnect with a 250ms sleep between checks.