Bug #88242 autoReconnect and socketTimeout JDBC option makes wrong order of client packet
Submitted: 27 Oct 2017 1:16 Modified: 22 Nov 2017 23:39
Reporter: Seunguck Lee Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / J Severity:S1 (Critical)
Version:5.1.41, 5.1.44 OS:Any
Assigned to: Filipe Silva CPU Architecture:Any

[27 Oct 2017 1:16] Seunguck Lee
Description:
JDBC Driver might return wrong result of query when using autoReconnect=true and socketTimeout options combination.
Let's look at the output of test case program.

$ sh ./run.sh
>> Query-1, key(1) : exception : Communications link failure. The last packet successfully received from the server was 2,013 milliseconds ago.  The last packet sent successfully to the server was 2,001 milliseconds ago.
>> Query-1, key(3) : exception : Communications link failure. The last packet successfully received from the server was 2,000 milliseconds ago.  The last packet sent successfully to the server was 1,999 milliseconds ago.
>> Query-1, key(5) : exception : Communications link failure. The last packet successfully received from the server was 2,001 milliseconds ago.  The last packet sent successfully to the server was 2,000 milliseconds ago.
>> Query-2, key(6) : exception : ResultSet is from UPDATE. No Data.
>> Query-1 : found mismatch : 7 ==> 6

As you can see, the result of the last query(SELECT 7) is 6 (NOT 7).

>>>>>>>> CORRECT RESULTSET CASE
21:05:03.730718(+0000000 us) >> SELECT 1, sleep(3)
                                  --> Exception : Communications link failure.
21:05:05.738508(+2007790 us) >> PING
21:05:06.730882(+0992374 us) << RESULT of SELECT 1
21:05:06.730946(+0000064 us) << RESLUT of PING
21:05:06.731050(+0000104 us) >> SELECT 2, sleep(1)
                                  --> This will get 2 (Correct result)
21:05:07.731565(+1000515 us) << RESULT of SELECT 2

>>>>>>>> WRONG RESULTSET CASE
21:05:11.733350(+0000000 us) >> SELECT 5, sleep(3)
                                  --> Exception : Communications link failure.
21:05:13.734162(+2000812 us) >> PING
21:05:14.733893(+0999731 us) << RESULT of SELECT 5
21:05:14.733997(+0000104 us) << RESULT of PING
21:05:14.734023(+0000026 us) >> SELECT 6, sleep(1)
                                  --> Exception : ResultSet is from UPDATE. No Data.
21:05:14.734337(+0000314 us) >> SELECT 7, sleep(3)
                                  --> This will get 6 (Mismatch)
21:05:15.734592(+1000255 us) << RESULT of SELECT 6
                                  --> Query has to sleep 3 seconds from server, but result is received only after 1 second.

No so much difference between correct and wrong case, but makes big difference because this is timing issue.
And autoReconnect option prevents mysql jdbc driver close and re-open connection after sql exception(Communications link failure).

MySQL JDBC Drvier will clear all pending input stream of socket just before send command(query).
https://github.com/mysql/mysql-connector-j/blob/release/5.1/src/com/mysql/jdbc/MysqlIO.jav...

For correct case, PING result is arrived before send query(SELECT 2), so PING result(in InputStream buffer) is cleared before sending query.
And we get the correct result for "SELECT 2".
But for incorrect case, PING result is arrived just after send query(SELECT 6), so jdbc driver can not clear PING result.
And query(SELECT 6) cause exception(ResultSet is from UPDATE. No Data.). 
After this exception, mysql server will respond query(SELECT 6) to client, and at the same time jdbc driver will send query(SELECT 7).
Finally we get the incorrect result(6) for "SELECT 7".

How to repeat:
Run attached java test-case program

Suggested fix:
I know autoReconnect option is not recommended and will be deprecated.
But manual says that "autoReconnect option may hurt transaction" mostly. And "inconsistent state" is too generic to understand the risk of this option.
And lots of peoples may thought that "inconsistent transaction state" is no problem because we use auto-commit mode.
So first of all, you should mention more detailed trouble case (like this bug case) on manual. 

"inconsistent transaction state" is risky, but this case might be much risky in point of security view(Think other user can see my private information). I think you should be more eager to announce this issue to MySQL users.
[27 Oct 2017 1:17] Seunguck Lee
Test case code

Attachment: ReqResMismatch.java (application/octet-stream, text), 1.65 KiB.

[27 Oct 2017 10:10] MySQL Verification Team
== 5.1.44 with MySQL Server 5.7.20

[umshastr@hod03]~/bugs: java -cp '.:mysql-connector-java-5.1.44/mysql-connector-java-5.1.44-bin.jar' ReqResMismatch
Fri Oct 27 11:23:29 CEST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
>> Query-1, key(1) : exception : Communications link failure

The last packet successfully received from the server was 2,013 milliseconds ago.  The last packet sent successfully to the server was 2,003 milliseconds ago.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

.

>> Query-1, key(3) : exception : Communications link failure

The last packet successfully received from the server was 2,003 milliseconds ago.  The last packet sent successfully to the server was 2,003 milliseconds ago.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

.

>> Query-1, key(5) : exception : Communications link failure

The last packet successfully received from the server was 2,003 milliseconds ago.  The last packet sent successfully to the server was 2,003 milliseconds ago.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
.

>> Query-1, key(7) : exception : Communications link failure

The last packet successfully received from the server was 2,003 milliseconds ago.  The last packet sent successfully to the server was 2,003 milliseconds ago.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
.

>> Query-1, key(9) : exception : Communications link failure

The last packet successfully received from the server was 2,002 milliseconds ago.  The last packet sent successfully to the server was 2,002 milliseconds ago.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
.

>> Query-1, key(11) : exception : Communications link failure

The last packet successfully received from the server was 2,001 milliseconds ago.  The last packet sent successfully to the server was 2,001 milliseconds ago.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
.
>> Query-1, key(13) : exception : Communications link failure

The last packet successfully received from the server was 2,001 milliseconds ago.  The last packet sent successfully to the server was 2,001 milliseconds ago.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

>> Query-2, key(14) : exception : ResultSet is from UPDATE. No Data.
>> Query-1 : found mismatch : 15 ==> 14
[29 Oct 2017 5:04] MySQL Verification Team
Hello Seunguck Lee,

Thank you for the report and test case.
Observed with 5.1.44.

Thanks,
Umesh
[29 Oct 2017 9:20] DONGCHAN SUNG
Hi~

To prevent similar risks, we add new jdbc parameter "reconnectForceNew" and testing. 
>> https://github.com/mysql/mysql-connector-j/compare/release/5.1...gywndi:release/5.1

I do not think there is a reason to use a socket that the client determined was broken.
And I hope that this unintended behavior of JDBC does not cause the MySQL server to lose trust.

Check this and solve this problem please. :-)

Thanks.
Chan.
[29 Oct 2017 21:50] Sveta Smirnova
Check also bug #88232 about similar issue.
[30 Oct 2017 8:16] DONGCHAN SUNG
Hi~
I also checked MariaDB connector/j. But there's no problem.

If there is any problem include 'socketTimeout', they only close the connection socket and process a new connection at the next query. I still can not understand why the connection marked as failure is needed to exectue next query.

Thanks.
Chan.
[3 Nov 2017 2:13] Seunguck Lee
We have tested this case with MAX_EXECUTION_TIME hint and Statement.setQueryTimeout().

With both options, packet mismatch is not found during simple test case (which I have attached).
But real service case (A lot of queries and system load), both options are not helpful.
Both options are work only the query processing time is exceed specific time. But on heavy traffic server, actual query execution time is not so long, even queries are not logged in slow query log file (long_query_time=0.5). But response time on jdbc side of these queries are over 10 seconds.

Are there any other options we should check ?
[22 Nov 2017 23:39] Daniel So
Posted by developer:
 
Added the following entry to the Connector/J 5.1.45 changelog: 

"Normally, when the socketTimeout option has been set and a socket timeout occurs on the client side, the server may continue working and returning query results. At the next query executed after the timeout, Connector/J first clears the socket input stream and then sends a ping request to the server.

However, an error occurred if the autoReconnect option was set to true and, after reconnection, a new query was executed by Connector/J, and the results from the previous queries arrived before Connector/J sent its ping request to the server, in which case the old packages might be mistaken as results for the new query. This fix corrects the issue by forcibly closing the network resources after a communication or IO exception. The next statement execution recreates the IO stream if autoReconnect=true; otherwise, the connection stays closed."