Bug #60427 Deadlock in case exception was thrown during isValid check
Submitted: 11 Mar 2011 3:56 Modified: 25 Mar 2013 13:28
Reporter: Niv Dalal
Status: Can't repeat
Category:Connector / J Severity:S1 (Critical)
Version:5.1.13 OS:Any
Assigned to: Alexander Soklakov CPU Architecture:Any
Tags: isValid deadlock jdbc

[11 Mar 2011 3:56] Niv Dalal
Description:
Calling isValid with a timeout can cause a deadlock.
The stacks of the two stuck threads are shown below.

First thread:
It is blocked entering public synchronized void setTransactionIsolation because the connection object's lock is already held by the other thread.
This thread has also already acquired, in createNewIO, the lock taken by synchronized (this.mutex).

Second thread:
public synchronized boolean isValid has started and acquired the object lock; the thread then waits on synchronized (getMutex()), which is already held by the first thread.

Thread [$SB$834464] (Suspended)	
	JDBC4Connection(ConnectionImpl).setTransactionIsolation(int) line: 5207	
	JDBC4Connection(ConnectionImpl).connectWithRetries(boolean, Properties) line: 2193	
	JDBC4Connection(ConnectionImpl).createNewIO(boolean) line: 2124	
	JDBC4Connection(ConnectionImpl).execSQL(StatementImpl, String, int, Buffer, int, int, boolean, String, Field[], boolean) line: 2547	
	JDBC4Connection(ConnectionImpl).execSQL(StatementImpl, String, int, Buffer, int, int, boolean, String, Field[]) line: 2509	
	StatementImpl.executeQuery(String) line: 1476	

Thread [pool-1-thread-1] (Suspended)	
	JDBC4Connection.isValid(int) line: 102	
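
The lock-order inversion described above can be reproduced in isolation. The following is a minimal sketch, not the Connector/J source: the class LockOrderSketch and its methods reconnect, setIsolation and validate are hypothetical stand-ins for createNewIO, setTransactionIsolation and isValid. A latch forces the unlucky interleaving so that the timing window is hit every time, and the JVM's own monitor deadlock detection confirms the cycle.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the connection class; not the real driver code.
public class LockOrderSketch {
    private final Object mutex = new Object(); // plays the role of this.mutex
    private final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

    // Like createNewIO: takes the mutex first, then needs the object monitor.
    void reconnect() {
        synchronized (mutex) {
            meet();
            setIsolation();
        }
    }

    // Like setTransactionIsolation: a synchronized method on the object.
    synchronized void setIsolation() { }

    // Like isValid: holds the object monitor first, then needs the mutex.
    synchronized void validate() {
        meet();
        synchronized (mutex) { }
    }

    // Wait until both threads hold their first lock (forces the bad timing).
    private void meet() {
        bothHoldFirstLock.countDown();
        try {
            bothHoldFirstLock.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Daemon threads, so the stuck threads do not keep the JVM alive.
    private static void startDaemon(Runnable r) {
        Thread t = new Thread(r);
        t.setDaemon(true);
        t.start();
    }

    // Returns true once the JVM reports a monitor deadlock (polls up to ~5 s).
    static boolean deadlocks() {
        LockOrderSketch c = new LockOrderSketch();
        startDaemon(c::reconnect);
        startDaemon(c::validate);
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        try {
            for (int i = 0; i < 50; i++) {
                if (mx.findMonitorDeadlockedThreads() != null) {
                    return true;
                }
                Thread.sleep(100);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + deadlocks());
    }
}
```

In the real driver no latch exists, so the two threads only block each other when they happen to take their first locks at nearly the same moment, which is why the problem shows up as a timing issue.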

How to repeat:
This looks like a timing issue, possibly triggered by calling isValid frequently during database communication failures.

Suggested fix:
Remove synchronized from the isValid method, as it doesn't protect anything.
[27 Jul 2011 18:43] Sveta Smirnova
Thank you for the report.

I cannot repeat the described behavior using a generic test that calls isValid concurrently. Please provide an example of how you use isValid or, better, a standalone test case that we can repeat in our environment.
[27 Jul 2011 18:55] Niv Dalal
Hi,
The stack trace I provided in the original report is the relevant evidence for this case.
I believe it happened during disconnection from the database.
The bug can be seen by reviewing the code; no real test is needed.
Since this is a timing issue, I'm not sure it will be easy to reproduce, but it is easy to understand why it happens from a code review.
Thank you,
Niv
[25 Mar 2013 13:28] Alexander Soklakov
Hi Niv,

In the latest Connector/J (5.1.24) the synchronization model was significantly refactored, and the parts of the code where you had the deadlock now share a common synchronization mutex.

So please use the latest driver. I am closing the bug report as "Can't repeat". Feel free to reopen it if you find the issue still exists.
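
For readers wondering why a common mutex removes this class of deadlock, here is a minimal sketch. It is an assumption-laden illustration and not the actual 5.1.24 code: the class CommonMutexSketch and the methods reconnect, setIsolation and validate are hypothetical names. Every method synchronizes on the same object, so there is only one lock and no second lock to acquire in the opposite order; nested calls on the same thread simply re-enter it.

```java
// Hypothetical illustration of a single shared lock; not the real driver code.
public class CommonMutexSketch {
    private final Object mutex = new Object(); // the one lock every method uses
    private int isolation;

    // Reconnect path: takes the mutex, then calls another guarded method.
    void reconnect() {
        synchronized (mutex) {
            setIsolation(2); // re-entrant: same thread, same lock
        }
    }

    void setIsolation(int level) {
        synchronized (mutex) {
            isolation = level;
        }
    }

    // Validity check: guarded by the same mutex, never by a second lock.
    boolean validate() {
        synchronized (mutex) {
            return isolation >= 0;
        }
    }

    // Runs both paths concurrently and reports whether both threads finished.
    static boolean completesWithoutDeadlock() {
        CommonMutexSketch c = new CommonMutexSketch();
        Thread a = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) c.reconnect();
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) c.validate();
        });
        a.setDaemon(true);
        b.setDaemon(true);
        a.start();
        b.start();
        try {
            a.join(5_000);
            b.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !a.isAlive() && !b.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + completesWithoutDeadlock());
    }
}
```

The trade-off of a single lock is coarser contention: a validity check has to wait while a reconnect is in progress, but it can no longer deadlock against it.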