Bug #88232 c/J does not rollback transaction when autoReconnect=true
Submitted: 26 Oct 2017 11:56 Modified: 27 Nov 2017 17:34
Reporter: Sveta Smirnova (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / J Severity:S3 (Non-critical)
Version:5.1.44 OS:Any
Assigned to: Filipe Silva CPU Architecture:Any

[26 Oct 2017 11:56] Sveta Smirnova
Description:
At https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-configuration-propertie... user manual says: "Should the driver try to re-establish stale and/or dead connections? If enabled the driver will throw an exception for a queries issued on a stale or dead connection, which belong to the current transaction, but will attempt reconnect before the next query issued on the connection in a new transaction."

However, if connection interrupted in the middle of transaction rollback does not happen and previously executed changes may get committed.

How to repeat:
1. Start MTR: ./mtr --start innodb &
2. Create test table: create table test(f1 varchar(255)) engine=innodb;
3. Compile and run following Java code:

import java.sql.DriverManager;
//import java.sql.SQLException;
import java.sql.Connection;
import java.sql.Statement;
import java.sql.ResultSet;

class i210021 {
	public static void main(String[] args) {
	try{
		String DB_PARAMS = "autoReconnect=true&cacheServerConfiguration=true&useLocalSessionState=true&elideSetAutoCommits=true&connectTimeout=3000&socketTimeout=60000&useSSL=false";
		Class.forName("com.mysql.jdbc.Driver"); 
		Connection conn = DriverManager.getConnection("jdbc:mysql://127.0.0.1:13000/test?" + DB_PARAMS, "root", ""); 
		Statement stmt = conn.createStatement(); 
		ResultSet rs = null;

		try { 
			stmt.execute("begin"); 
			stmt.execute("insert into test values(concat('1','::',now()))"); 
			rs = stmt.executeQuery("select concat('[',CONNECTION_ID(),'] ',now()), sleep(63)"); 
			if (rs.next()) {System.out.println(rs.getString(1));} 
		} catch (Exception e1) { 
			System.out.println("In first try-catch"); 
			System.out.println(e1); 
		}

		System.out.println("Between try-catch blocks");

		try { 
			stmt.execute("begin"); 
			stmt.execute("insert into test values(concat('2','::',now()))"); 
			stmt.execute("commit"); 
		} catch (Exception e2) { 
			System.out.println("In second try-catch"); 
			System.out.println(e2); 
		}
    } catch (Exception e) {
		System.out.println(e);
	}
	}
}
4. Check content of the table:

mysql> select * from test;
+------------------------+
| f1                     |
+------------------------+
| 1::2017-10-26 14:48:31 |
| 2::2017-10-26 14:49:34 |
+------------------------+
2 rows in set (0.00 sec)

Suggested fix:
Rollback before reconnect
[26 Oct 2017 12:00] Sveta Smirnova
Workaround: add  stmt.execute("rollback"); into the first try-catch block.
[26 Oct 2017 13:26] Alexander Soklakov
Hi Sveta,

The example looks wrong. To ensure that commit really happens only when you issue "commit" you need either:
1) disable autocommits on server
or
2) use START TRANSACTION statement, not the BEGIN one, to disable autocommit mode implicitly for a single series of statements, as described on https://dev.mysql.com/doc/refman/5.7/en/commit.html.

I.e. with c/J:

	conn.setAutoCommit(false);
	stmt.execute("insert into test values(concat('1','::',now()))");
	rs = stmt.executeQuery("select concat('[',CONNECTION_ID(),'] ',now()), sleep(63)");
	...
	conn.commit();

or

	stmt.execute("START TRANSACTION");
	stmt.execute("insert into test values(concat('1','::',now()))");
	rs = stmt.executeQuery("select concat('[',CONNECTION_ID(),'] ',now()), sleep(63)");
	...
	stmt.execute("commit");

The rollback will be done by server in this case when connection is lost.
[26 Oct 2017 18:21] Sveta Smirnova
I changed stmt.execute("begin"); with stmt.execute("START TRANSACTION"); in first try-catch block, but got same result: no rollback happened.
[27 Oct 2017 4:42] Alexander Soklakov
Could you share the stacktraces of exceptions you get in the middle?
[29 Oct 2017 21:49] Sveta Smirnova
Here it is:

sveta@delly:~/issues$ java -classpath /home/sveta/mysql_packages/mysql-connector-java-5.1.44/mysql-connector-java-5.1.44-bin.jar:.:$CLASSPATH i210021
In first try-catch
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 60,071 milliseconds ago.  The last packet sent successfully to the server was 60,070 milliseconds ago.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 60,071 milliseconds ago.  The last packet sent successfully to the server was 60,070 milliseconds ago.
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
	at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:989)
	at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3559)
	at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3459)
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3900)
	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527)
	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2680)
	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2483)
	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2441)
	at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1381)
	at i210021.main(i210021.java:19)
Caused by: java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:171)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
	at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
	at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
	at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3008)
	at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3469)
	... 8 more
Between try-catch blocks
[29 Oct 2017 21:50] Sveta Smirnova
Check also bug #88242 about similar issue.
[30 Oct 2017 11:52] Alexander Soklakov
Socket timeout doesn't necessarily close the connection. That's the situation you face:
1) you get the SocketTimeoutException in a first try-catch.
2) you issue the "begin" statement in a second try-catch; when autoReconnect=true, c/J first issue the PING here and if it's successful (in your case) c/J proceeds with current session; please note that "begin" on the same session also leads to implicit commit.

Test case (your one, I just reduced the timeout and added more output):

    public void testBug88232() throws Exception {
        createTable("testBug88232", "(f1 varchar(255))", "InnoDB");

        Properties props = new Properties();
        props.setProperty("autoReconnect", "true");
        props.setProperty("cacheServerConfiguration", "true");
        props.setProperty("useLocalSessionState", "true");
        props.setProperty("elideSetAutoCommits", "true");
        props.setProperty("connectTimeout", "3000");
        props.setProperty("socketTimeout", "30000");
        props.setProperty("useSSL", "false");

        Connection con = getConnectionWithProps(props);
        Statement st = con.createStatement();
        ResultSet rset = null;

        try {
            st.execute("begin");
            st.execute("insert into testBug88232 values(concat('1','::',now()))");
            rset = st.executeQuery("select concat('[',CONNECTION_ID(),'] ',now()), sleep(33)");
            if (rset.next()) {
                System.out.println(rset.getString(1));
            }
        } catch (Exception e1) {
            System.out.println("In first try-catch");
            e1.printStackTrace();

            // real close of socket connection
            //((ConnectionImpl) con).getIO().mysqlConnection.close(); // disabled to reproduce the bug report case
        }

        System.out.println("\nBetween try-catch blocks:");
        rset = this.stmt.executeQuery("select * from testBug88232"); // from the different connection
        while (rset.next()) {
            System.out.println(rset.getString(1));
        }

        try {
            st.execute("begin");

            System.out.println("\nAfter next begin:");
            rset = this.stmt.executeQuery("select * from testBug88232"); // from the different connection
            while (rset.next()) {
                System.out.println(rset.getString(1));
            }

            st.execute("insert into testBug88232 values(concat('2','::',now()))");
            st.execute("commit");
        } catch (Exception e2) {
            System.out.println("In second try-catch");
            System.out.println(e2);
        }

        System.out.println("\nFinally:");
        rset = this.stmt.executeQuery("select * from testBug88232"); // from the different connection
        while (rset.next()) {
            System.out.println(rset.getString(1));
        }

    }

Server log:
===========
2017-10-30T10:17:52.712931Z	   46 Connect	root@localhost on cjtest_5_1 using TCP/IP
2017-10-30T10:17:52.714080Z	   46 Query	/* @MYSQL_CJ_FULL_PROD_NAME@ ( Revision: @MYSQL_CJ_REVISION@ ) */SELECT  @@session.auto_increment_increment AS auto_increment_increment, @@character_set_client AS character_set_client, @@character_set_connection AS character_set_connection, @@character_set_results AS character_set_results, @@character_set_server AS character_set_server, @@collation_server AS collation_server, @@init_connect AS init_connect, @@interactive_timeout AS interactive_timeout, @@license AS license, @@lower_case_table_names AS lower_case_table_names, @@max_allowed_packet AS max_allowed_packet, @@net_buffer_length AS net_buffer_length, @@net_write_timeout AS net_write_timeout, @@have_query_cache AS have_query_cache, @@sql_mode AS sql_mode, @@system_time_zone AS system_time_zone, @@time_zone AS time_zone, @@transaction_isolation AS transaction_isolation, @@wait_timeout AS wait_timeout
2017-10-30T10:17:52.714430Z	   46 Query	SHOW WARNINGS
2017-10-30T10:17:52.715189Z	   46 Query	SET NAMES latin1
2017-10-30T10:17:52.715327Z	   46 Query	SET character_set_results = NULL
2017-10-30T10:17:52.715524Z	   46 Query	begin
2017-10-30T10:17:52.715648Z	   46 Query	insert into testBug88232 values(concat('1','::',now()))
2017-10-30T10:17:52.716877Z	   46 Query	select concat('[',CONNECTION_ID(),'] ',now()), sleep(33)
2017-10-30T10:18:22.742797Z	   45 Query	select * from testBug88232
2017-10-30T10:18:25.717228Z	   46 Query	begin
2017-10-30T10:18:25.724540Z	   45 Query	select * from testBug88232
2017-10-30T10:18:25.724894Z	   46 Query	insert into testBug88232 values(concat('2','::',now()))
2017-10-30T10:18:25.725119Z	   46 Query	commit
2017-10-30T10:18:25.725989Z	   45 Query	select * from testBug88232

Sysout:
=======
In first try-catch
com.mysql.jdbc.CommunicationsException: Communications link failure

The last packet successfully received from the server was 30,011 milliseconds ago.  The last packet sent successfully to the server was 30,011 milliseconds ago.

Between try-catch blocks:

After next begin:
1::2017-10-30 14:39:06

Finally:
1::2017-10-30 14:39:06
2::2017-10-30 14:39:39

If exclicitly close the socket in a first catch block (uncomment the ((ConnectionImpl) con).getIO().mysqlConnection.close() in the test case), then the real reconnect happens and the first transaction is rolled back:

Server log:
===========
2017-10-30T10:30:45.280990Z	   48 Connect	root@localhost on cjtest_5_1 using TCP/IP
2017-10-30T10:30:45.282126Z	   48 Query	/* @MYSQL_CJ_FULL_PROD_NAME@ ( Revision: @MYSQL_CJ_REVISION@ ) */SELECT  @@session.auto_increment_increment AS auto_increment_increment, @@character_set_client AS character_set_client, @@character_set_connection AS character_set_connection, @@character_set_results AS character_set_results, @@character_set_server AS character_set_server, @@collation_server AS collation_server, @@init_connect AS init_connect, @@interactive_timeout AS interactive_timeout, @@license AS license, @@lower_case_table_names AS lower_case_table_names, @@max_allowed_packet AS max_allowed_packet, @@net_buffer_length AS net_buffer_length, @@net_write_timeout AS net_write_timeout, @@have_query_cache AS have_query_cache, @@sql_mode AS sql_mode, @@system_time_zone AS system_time_zone, @@time_zone AS time_zone, @@transaction_isolation AS transaction_isolation, @@wait_timeout AS wait_timeout
2017-10-30T10:30:45.282447Z	   48 Query	SHOW WARNINGS
2017-10-30T10:30:45.283153Z	   48 Query	SET NAMES latin1
2017-10-30T10:30:45.283290Z	   48 Query	SET character_set_results = NULL
2017-10-30T10:30:45.283513Z	   48 Query	begin
2017-10-30T10:30:45.283636Z	   48 Query	insert into testBug88232 values(concat('1','::',now()))
2017-10-30T10:30:45.284529Z	   48 Query	select concat('[',CONNECTION_ID(),'] ',now()), sleep(33)
2017-10-30T10:31:15.318747Z	   47 Query	select * from testBug88232
2017-10-30T10:31:15.321224Z	   49 Connect	root@localhost on cjtest_5_1 using TCP/IP
2017-10-30T10:31:15.321578Z	   49 Query	SET character_set_results = NULL
2017-10-30T10:31:15.321740Z	   49 Query	begin
2017-10-30T10:31:15.321865Z	   47 Query	select * from testBug88232
2017-10-30T10:31:15.322052Z	   49 Query	insert into testBug88232 values(concat('2','::',now()))
2017-10-30T10:31:15.322262Z	   49 Query	commit
2017-10-30T10:31:15.325177Z	   47 Query	select * from testBug88232

Sysout:
=======
In first try-catch
com.mysql.jdbc.CommunicationsException: Communications link failure

The last packet successfully received from the server was 30,001 milliseconds ago.  The last packet sent successfully to the server was 30,000 milliseconds ago.

Between try-catch blocks:

After next begin:

Finally:
2::2017-10-30 14:33:13
[30 Oct 2017 12:49] Alexander Soklakov
I still think it's not a bug, maybe just a confusing option description. I've checked 15yo sources and it was doing the same since those days, i.e. issue the PING before the next query, re-establish connection in case the ping failed and restore some sesstion states from previous session. The "transaction" in this option description could rather mean an act of interaction between c/J and server than the DB transaction.

And it's up to user application to decide what to do in case of exception, c/J doesn't attempt to guess the user situation here. User should decide what to do with this exception: is it safe to proceed with current connection, would it matter if session was just frozen for some time and the next operation could succeed so that user could retry previous query. Or the session is really lost and the next operation will re-establish it.
[31 Oct 2017 2:36] DONGCHAN SUNG
Hi~

It is a complete bug issue in existing logic that does not make a new connection and makes use of existing sockets. For this reason, there is a risk that the data of another person can be seen even in the case of autocommit = 1. (Check below)
--------------------------------------------
https://bugs.mysql.com/bug.php?id=88232

The most effective solution is to create a new connection again without reusing the disconnected socket that client(jdbc) marked as failure.

I have implemented this solution as shown below. Check it please.
--------------------------------------------
https://github.com/mysql/mysql-connector-j/compare/release/5.1...gywndi:release/5.1

It is a tremendous risk of database reliability that autocommit is true and there is a hole where other people's data can be seen.

Thanks.
Chan.
[31 Oct 2017 7:23] Alexander Soklakov
Hi Dongchan,

Right, the Bug#88242 does really exist. And since both Bug#88232 and Bug#88242 are caused by the same problem with the socket left open after the socket timeout I set this bug verified.

Thanks for the "reconnectForceNew" patch, but we should rather close the socket unconditionally in case of it's timeout.
[31 Oct 2017 7:38] DONGCHAN SUNG
Hi~

"we should rather close the socket unconditionally in case of it's timeout." means.. 
You will change the code to create a new session unconditionally when socket timeout occurs?

The way not to reuse existing sockets is what I really want.  The reason I added the reconnectForceNew option is just I do not have any idea what effects on other existing system(not mine), so I thought I can specify the option so that it behaves differently.

Please make sure it is what I understand.

Thanks.
Chan.
[31 Oct 2017 7:58] Alexander Soklakov
My thought is that we should make the behaviour more predictable for user. Currently user can't know was the session reestablished or not after the socket timeout, thus it isn't clear what session data is preserved. At the same time the things like Bug#88242 could happen if socket is left open after the timeout.

Thus the proper solution could be to close the socket after timeout. With autoReconnect=true the session will be reestablished on the next query, as it was before, but now it's guaranteed that it's always a new one.

But, please, don't take it as a final solution yet. First we need to ensure that such behaviour change doesn't affect other existing functionality.
[31 Oct 2017 8:16] DONGCHAN SUNG
Actually, I was very surprised to see it work like this.
Usually, most people seem to have thought similar to me.

A connection that is forcibly disconnected by WAIT_TIMEOUT on the MySQL server will create a new connection anyway. The problem is that the session is disconnected due to a socket timeout. When I look at the source, it also works with "autocommit = 1". 

Even if someone use transactions with begin..commit syntax like me, the new session will not be a problem because it will rollback at the database level. 

Anyway, let's hope for a good decision, not just the sudden disappearance of the autoReconnect parameter.

Thanks.
Chan.
[2 Nov 2017 1:44] DONGCHAN SUNG
Hi~

Can you tell what direction this decision will go?
In fact, setting "autoReconnect" to false is the best way to do it, but it's impossible to check side effects for too many legacy programs at the moment.

So, as an alternative, we are servicing soketTimeout higher than current(60 -> 120) but I am afraid that it can be reproduced at any time. Because I respect Oracle's source-tree very much, I want to proceed according to your direction.

You may be busy ... but let me know what you think.
Thanks.
Chan.
[8 Nov 2017 14:02] Mark Matthews
Transaction control by executing commit/begin/rollback/start transaction through statements violates the JDBC API. Technically the driver should be throwing exceptions when callers attempt to do this.
[9 Nov 2017 2:39] DONGCHAN SUNG
Hi~ Mark

I'm sorry but I do not agree your opinion as a DBA.
As you know, regardless of the autoCommit parameter, it is the ability to process transactions internally in MySQL.

C ++, PHP, Perl, Python, goLang, etc. all languages only use queries provided by the mysql client. (Of course, I understand your opinion if thinking about the jdbc API side only.)

I do not want to even imagine the effect of throwing exceptions on the specific syntax provided by mysql at the current jdbc level. (maybe terrible)

And the current issue is that connector/j recycles sockets that are broken with socketTimeout, when autoReconnect is true.

I hope there will be a good solution.

Thanks.
Chan.
[27 Nov 2017 17:34] Daniel So
Posted by developer:
 
Added the following entry to the Connector/J 5.1.45 changelog: 

"When a communications exception was thrown by Connector/J after a socket timeout event, because the current transaction did not get rolled back automatically, if autoReconnect=true, the next statement execution by Connector/J might reuse the old server session and continued with the previous transaction. This might confuse the client application on the transaction status of the statements it was handling. This fix corrects the issue by forcibly closing the network resources after a communication or IO exception, causing the server to rollback the transaction."