Bug #21446 Stale data read by pooled database connection
Submitted: 4 Aug 2006 13:05 Modified: 23 Sep 2006 14:13
Reporter: Phil Haigh Email Updates:
Status: Not a Bug Impact on me:
None 
Category:Connector / J Severity:S2 (Serious)
Version:4.1.13 OS:Linux (Debian Linux)
Assigned to: CPU Architecture:Any

[4 Aug 2006 13:05] Phil Haigh
Description:
I fear there may be a bug or some kind of propagation error in MySQL, possibly around the query cache or the JDBC Driver. I am posting this in the hope that somebody can cast some light on the problem. Because of its nature it has taken about three months of research and investigation to tie down the problem I am experiencing to the database/JDBC driver.

Baseline first: I'm using MySQL 4.1.13 from within Tomcat 5.5 with the MySql Connector version 3.1.12 and connection pooling. The database is a single instance on a remote machine to the Tomcat server. Everything is running on Debian Linux.

The connection pool is configured as follows:

<Resource name="jdbc/database" 
          auth="Container" 
          type="javax.sql.DataSource" 
          maxActive="200" 
          maxIdle="0" 
          maxWait="10000" 
          timeBetweenEvictionRunsMillis="30000" 
          numTestsPerEvictionRun="10" 
          minEvictableIdleTimeMillis="15000" 
          validationQuery="SELECT SERIES_ID FROM SERIES" 
          testOnBorrow="true" 
          testOnReturn="true" 
          testWhileIdle="true" 
          username="xxxxxxxx" 
          password="xxxxxxxx" 
          driverClassName="com.mysql.jdbc.Driver"
          url="jdbc:mysql://(ip-address):3306/(instance-name)" /> 

The scenario is as follows:

Step 1: The user is lookng at some data on a view page. They want to change it so they click on a 'change' link and are taken to an edit page where their data is displayed in a form.
Step 2: The user makes one or more changes and clicks Save - the form is submitted.
Step 3: A struts action is initiated. A database connection is acquired from the pool, various business rules are applied to check that the change is a valid one and the new data is committed to the database.
Step 4: Struts forwards to a second action that re-reads the just-committed data (using the same pooled connection as at step 3)
Step 5: The view page is redisplayed, showing the modified data, and on completion of the request the connection is returned to the pool.

Everything is fine up to this point.

Step 6: They then decide to go back to the edit page, so they click on 'change' again.
Step 7: A new connection is retrieved from the pool and it retrieves the data previously changed, at this point we are effectively resuming at step 2.

In most cases - say approximately 99% - the data displayed in the form second time around is the changed data that was displayed in step 5. However in the remaining cases the data that is displayed is the original data prior to the first change. This is despite the path through the code being the same for every retrieval. THe web application does not perform any caching of its own and this is not a page cache issue because I have the server-side logging to prove that when the problem occurrs (a) a change was saved, (b) that change was then re-read successfully and (c) it then retrieves stale data in a subsequent access using a new connection.

The problem is not predictable or repoducable on demand. On an average day I'll get 2-3 occurrances of this problem.

Now I know that all sounds very high level but over the last month I have instrumented the code sufficiently to ascertain the following:
- If the data is re-read using the connection it was committed with, it always returns the modified data.
- If the data is re-read a few seconds later using a different connection, it occasionally returns stale data.

So it would appear that on occasion the JDBC driver is returning stale data from a read. 
- Under what circumstances can this occurr?
- Are there any known software bugs or environment issues that sound related to this issue?
- Is there a work-around for this problem?

How to repeat:
Unable to repeat on demand - error is sporadic

Suggested fix:
None.
[4 Aug 2006 13:38] Mark Matthews
> So it would appear that on occasion the JDBC driver is returning stale data from
> a read. 

No, because it can't. It doesn't cache row data.

> - Under what circumstances can this occurr?

Are you sure you're not missing a commit() somewhere in your application, maybe due to some alternate code path being taken the 1% of the times this seems to happen? This sounds an awful lot like when developers get confused about the semantics of REPEATABLE_READ which is the default isolation level used by MySQL when using InnoDB tables. (i.e. a connection has started a transaction, and has never committed or rolled-back, and then is used later to try and "see" new data, but is actually working on the "old" snapshot that was created when the transaction was actually started).

> - Are there any known software bugs or environment issues that sound related to
> this issue?

No. Every time this has been brought up before it is due to the reasons stated above.
[4 Aug 2006 15:26] Phil Haigh
I'm positive that a commit occurs. 

I've already instrumented the code heavily using java.util.logging to demonstrate what is happening - including whether a commit takes place. It does.

Further, in addition to the changes on the 'main' table, an audit log is written. I have found during my investigations that the audit log is always written, even when the update fails. The log trace shows that the correct (modified) data is passed to the JDBC update and then successfully re-read from the database using a PreparedStatement on the same connection, it is the later read that fails.

This is the only table in the application that features frequent read/update/read activity, and it is the only area of the application exhibiting this behaviour - and all parts of the application use a common set of persistence management code.

I would expect you to be sceptical but I'm not your typical JDBC or J2EE newbie, I've been developing J2EE applications for six years and worked professionally with databases in C, Cobol, Fortran for over ten years before Java came along.

My application code is pretty bullet-proof as far as connection pool leakage goes but just in case (because I've been investigating this for some time and wanted to remove all doubt) you'll see I'm using eviction to recover leaking connections. If there were leaking connections the entire transaction would roll back, and that is not what I am seeing anyway.
[4 Aug 2006 15:28] Phil Haigh
Having just re-read my last comment I'd like to clarify it. The audit log is another table.
[17 Aug 2006 9:57] Georg Portwich
Hi, 

I have seen a similar behaviour also together with tomcat5.5 and mysql. Under some circumstances the connection will display stale data from the connection

In detail: you read data with connection A. After this you will update this data with connection B. Now you read the data again with connection A. You will not see the changes done with connection B that is still open. This happens independent from the transaction status. If you close connection B, you will see the changes. You can test this if you modify the data with a different application – for example some mysql tools. 

This is just a working theory. If could verify this or have a solution for the problem please let me know. The main problem is – you don’t see the problem with tomcat5.0.  So some already running applications may have problems to update to tomcat5.5
[27 Aug 2006 14:48] Phil Haigh
Further update on my problem.

I have been using an o/R mapping library to manage persistence. Two weeks ago I hand-crafted JDBC in the DAO code for this table and despite completely replacing the original code, the problem remains. 

I have logs that show the right data being passed into the persistence layer and bound to the JDBC update; I have logs showing the data re-read correctly and then later, a minute or so for example, re-read again with some changes missing.

So I am still convinced that the problem is not of my making.

I'm about to change hosts and so I will hold off further postings until I have the site up and running on the new host.
[28 Aug 2006 8:38] Georg Portwich
Hi,

I solved our problem. The persistence layer of the application we use does not support transaction – the commit after all updates were missing. But in the standard configuration transactions are enabled. All connection we get from a pool. If we got a connection with an open transaction we see the data from the current transaction. If we get any other connection we see the data from the database (and not the updated data). If we are running under a normal load only one connection from the pool is used and the application is running perfectly fine. Only in very rare situation we see the strange behaviour. 
Maybe you have a similar problem in your persistence layer.

Georg
[23 Sep 2006 10:29] Valeriy Kravchuk
It is not a bug in a MySQL's code. It is a problem of application(s) used.
[23 Sep 2006 14:13] Phil Haigh
I have solved my problem too... by moving hosting company. Shocking but true, nameonthe.net have a buggy infrastructure or MySQL build because as soon as I moved to weconexion.net the problem vanished. Same pooling and JDBC jars, and the same Tomcat and database version. I was seeing a lot of connection timeouts on my old host, they have also vanished...

At least I wasn't going insane, even if it wasn't a bug that can be attributed to you guys. keep up the excellent work.