Bug #96309 Connector/J:MultiHost in loadbalance may lead to a TPS reduction during a quick
Submitted: 24 Jul 8:46 Modified: 17 Sep 1:47
Reporter: qingshun luo Email Updates:
Status: Open Impact on me:
None 
Category:Connector / J Severity:S5 (Performance)
Version:5-1-31,even 5-1-46 OS:Any
Assigned to: CPU Architecture:Any
Tags: loadbalance

[24 Jul 8:46] qingshun luo
Description:
During use of LoadbalanceProxyConnection with multi-host, a quick switch may cause normal sever still kept in blacklist during loadBalanceBlacklistTimeout.

As to TPS reduction situation, consider a simple example:

Two hosts, located at two seperately machine marked as A and B serve for jdbc loadbalance.
Machine A shutdown, witch is added into blacklist. When machine A recovers, machine B shutdown in no time.
Now machine A can serve normally, but it's still kept in blacklist with long delay until loadBalanceBlacklistTimeout reaches, leading to low TPS.

Test result describe:

```
loadBalanceBlacklistTimeout=60000

Time          Network Action             MySQL Cluster Status        TPS&Client Status               Blacklist   
...           -                          mysql#A=Ok,mysql#B=Ok       TPS=5, Normal                   None
time0         Drop mysql#A               mysql#A=Drop,mysql#B=Ok     TPS=5, Normal                   mysql#A
time0+5s      Recover mysql#A            mysql#A=Ok,mysql#B=Ok       TPS=5, Normal                   mysql#A
time0+10s     Drop mysql#B               mysql#A=Ok,mysql#B=Drop     TPS=0 or 1, Abnormal            mysql#A,mysql#B
...
time0+59s     Drop mysql#B               mysql#A=Ok,mysql#B=Drop     TPS=0 or 1, Abnormal            mysql#A,mysql#B
time0+60s~    Drop mysql#B               mysql#A=Ok,mysql#B=Drop     TPS=5, Normal                   mysql#B
```

From [time0+10s] to  [time0+59s], client status is abnormal, mysql#A kept in blacklist can serve normally indeed.

How to repeat:
1. Test code:
    
    ```
    private static String driver = "com.mysql.jdbc.Driver";
    private static String url = "jdbc:mysql:loadbalance://172.21.34.87:40991,172.21.34.87:40992/schema_test_jdbc?useSSL=false&connectTimeout=3000&socketTimeout=3000&loadBalanceBlacklistTimeout=60000";
    private static String user = "test";
    private static String password = "test";

    private static SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    private static Connection c;

    static void executeSimpleTransaction(String name, Connection c){
        try { 
            c.setAutoCommit(false); 
            Statement s = c.createStatement();
            String row_res;
            ResultSet rs = s.executeQuery("SELECT SLEEP(0.2)"); 
            while(rs.next()){
                row_res = rs.getString("SLEEP(0.2)");
            }
            rs.close();
            c.commit(); 
        } catch (SQLException e) {
            System.out.println(name + " execute Query fail at " + sdf.format(new Date()));
            e.printStackTrace(); 
        } 
    } 

     static Connection getNewConnection() throws SQLException, ClassNotFoundException {
        Class.forName("com.mysql.jdbc.Driver");
        return DriverManager.getConnection(url, user, password);
     }
     
     public static void main(String[] args) throws Exception {
        long tx_count = 0;
        long pre_tx_count = 0;

        long current_time = new Date().getTime();
        long previous_time = current_time;

        int re_connect = -1;
        int i = 0;

        String thd_name="TEST-JDBC ";
        while(true) {
            try {
                current_time = new Date().getTime();
                if (current_time - previous_time >= 3000) {
                    System.out.println(thd_name + sdf.format(new Date()) + " transaction TPS:" + (tx_count-pre_tx_count)/((current_time - previous_time)/1000)); /* Print current TPS */
                    pre_tx_count = tx_count;
                    previous_time = current_time;
                }
                if (re_connect != 0) {
                    c = getNewConnection();
                    i = i + 1;
                    re_connect = 0;
                }
                executeSimpleTransaction(thd_name, c);
                tx_count = tx_count+1;
            } catch(SQLException e) {
                System.out.println(thd_name + " connect fail at " + sdf.format(new Date()));
                e.printStackTrace();
            } catch (Exception e) {
                System.out.println(thd_name + " throw exception at " + sdf.format(new Date()));
                e.printStackTrace();
            }
        }
     }
     ```
     
3. Simulate step:

In test program machine, simulate server machine shutdown with root user:

root@dev3498:~> iptables -F
root@dev3498:~> date; iptables -A OUTPUT -p tcp -d 172.21.34.87 --dport 40991 -j DROP
Wed Jul 24 13:47:48 CST 2019
root@dev3498:~> date; iptables -D OUTPUT -p tcp -d 172.21.34.87 --dport 40991 -j DROP  
Wed Jul 24 13:47:52 CST 2019
root@dev3498:~> date; iptables -A OUTPUT -p tcp -d 172.21.34.87 --dport 40992 -j DROP 
Wed Jul 24 13:47:55 CST 2019

Then a significant response latency and TPS reduction will be observed, sustaining about 60s, which depends on loadBalanceBlacklistTimeout setting.

4. Corresponding log info:

TEST-JDBC 2019-07-24 13:47:45 transaction TPS:5
TEST-JDBC  execute Query fail at 2019-07-24 13:47:52
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during commit(). Transaction resolution unknown.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
        at com.mysql.jdbc.Util.getInstance(Util.java:408)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:918)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:897)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:886)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860)
        at com.mysql.jdbc.ConnectionImpl.commit(ConnectionImpl.java:1559)
        at com.mysql.jdbc.MultiHostMySQLConnection.commit(MultiHostMySQLConnection.java:129)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.mysql.jdbc.LoadBalancedConnectionProxy.invokeMore(LoadBalancedConnectionProxy.java:495)
        at com.mysql.jdbc.MultiHostConnectionProxy.invoke(MultiHostConnectionProxy.java:457)
        at com.sun.proxy.$Proxy0.commit(Unknown Source)
        at com.companyname.bank.App.executeSimpleTransaction(App.java:149)
        at com.companyname.bank.App.main(App.java:121)
TEST-JDBC 2019-07-24 13:47:52 transaction TPS:2
TEST-JDBC 2019-07-24 13:47:55 transaction TPS:5
TEST-JDBC  execute Query fail at 2019-07-24 01:48:01
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during commit(). Transaction resolution unknown.
......
TEST-JDBC 2019-07-24 13:48:01 transaction TPS:0
TEST-JDBC 2019-07-24 13:48:05 transaction TPS:1
TEST-JDBC 2019-07-24 13:48:08 transaction TPS:0
TEST-JDBC 2019-07-24 13:48:13 transaction TPS:2
TEST-JDBC 2019-07-24 13:48:17 transaction TPS:1
TEST-JDBC 2019-07-24 13:48:20 transaction TPS:1
TEST-JDBC 2019-07-24 13:48:23 transaction TPS:0
TEST-JDBC 2019-07-24 13:48:27 transaction TPS:0
TEST-JDBC 2019-07-24 13:48:30 transaction TPS:1
TEST-JDBC 2019-07-24 13:48:34 transaction TPS:1
TEST-JDBC 2019-07-24 13:48:37 transaction TPS:0
TEST-JDBC 2019-07-24 13:48:40 transaction TPS:0
TEST-JDBC 2019-07-24 13:48:44 transaction TPS:0
TEST-JDBC 2019-07-24 13:48:47 transaction TPS:1
TEST-JDBC 2019-07-24 13:48:50 transaction TPS:0
TEST-JDBC 2019-07-24 13:48:53 transaction TPS:5

Suggested fix:
When host serves normally, remove host from Blacklist.

```
diff --git a/src/com/mysql/jdbc/LoadBalancedConnectionProxy.java b/src/com/mysql/jdbc/LoadBalancedConnectionProxy.java
index 80ec7696..84cd130f 100644
--- a/src/com/mysql/jdbc/LoadBalancedConnectionProxy.java
+++ b/src/com/mysql/jdbc/LoadBalancedConnectionProxy.java
@@ -371,6 +371,8 @@ public class LoadBalancedConnectionProxy extends MultiHostConnectionProxy implem
     public synchronized ConnectionImpl createConnectionForHost(String hostPortSpec) throws SQLException {
         ConnectionImpl conn = super.createConnectionForHost(hostPortSpec);
 
+        this.removeFromGlobalBlacklist(hostPortSpec);
+
         this.liveConnections.put(hostPortSpec, conn);
         this.connectionsToHostsMap.put(conn, hostPortSpec);
 
@@ -639,6 +641,20 @@ public class LoadBalancedConnectionProxy extends MultiHostConnectionProxy implem
         addToGlobalBlacklist(host, System.currentTimeMillis() + this.globalBlacklistTimeout);
     }
 
+    /**
+     * Removes a host from the blacklist.
+     * 
+     * @param host
+     *            The host need remove from the blacklist.
+     */
+    public void removeFromGlobalBlacklist(String host) {
+        if (isGlobalBlacklistEnabled()) {
+            synchronized (globalBlacklist) {
+                globalBlacklist.remove(host);
+            }
+        }
+    }
+
     /**
      * Checks if host blacklist management was enabled.
      */
```
[13 Sep 8:03] Alexander Soklakov
Hi qingshun,

Did you try Connector/J 8.0?
[17 Sep 1:46] qingshun luo
Test result with Connector/J 8.0.17

Attachment: test-result.log (application/octet-stream, text), 9.60 KiB.

[17 Sep 1:47] qingshun luo
Thank you for the report.
I try to test with Connector/J 8.0.17, similar result is obtained.