MySQL Bugs: #736: Unexpected exception thrown inside RowDataDynamic.nextRecord()

Bug #736	Unexpected exception thrown inside RowDataDynamic.nextRecord()
Submitted:	26 Jun 2003 14:38	Modified:	29 Jun 2003 13:57
Reporter:	Dan Kroymann	Email Updates:
Status:	Closed	Impact on me:	None
Category:	Connector / J	Severity:	S2 (Serious)
Version:	3.0.8	OS:	Linux (Linux)
Assigned to:	Mark Matthews	CPU Architecture:	Any

Description:
While processing the data in a streaming ResultSet, I occasionally get an unexpected SQLException thrown by ResultSet.next().  The error appears to be generated from within MysqlIO.nextRow(int), however it is caught in RowDataDynamic.nextRecord() and is rethrown as a SQLException.  Unfortunately, the original error has no message so when I finally am able to catch the SQLException, all I can see is this:

java.sql.SQLException: Error trying to fetch row:null
        at com.mysql.jdbc.RowDataDynamic.nextRecord(RowDataDynamic.java:280)
        at com.mysql.jdbc.RowDataDynamic.next(RowDataDynamic.java:244)
        at com.mysql.jdbc.ResultSet.next(ResultSet.java:2418)

I have verified that the problem occurs when connecting to both MySQL Server 4.0.13 and 4.1.0

How to repeat:
Create a statement that is "forward only" and "concurrent read only" and has the fetch size set to Integer.MIN_VALUE.  Now that the statement is setup to generate streaming result sets, make a simply query to return a few hundred thousand rows (ie: SELECT * FROM table_name).  Then while looping through calls to ResultSet.next() and processing each row of data, at some point prior to reaching the end of the result set, ResultSet.next() throws this SQLException.

The problem is somewhat sporadic, and is not deterministically repeatable.  I currently have caused the problem to occur numerous times while connecting to a database on the localhost, but I have not yet tried to recreate the problem from a remote host.

I've fixed the error message for 3.0.9 and 3.1.1 to be more specific on what might cause this exception. Your e-mail outside of this ticket indicated that the problem itself was your program logic.

I get the same exception. Both with mysql 3.x and 4.x

<no.fast.connector.framework.AdapterException: JDBC error>no.fast.connector.framework.AdapterException: JDBC error
       at no.fast.connector.jdbc.JDBCAdapter.crawl(JDBCAdapter.java:164)
       at no.fast.connector.framework.Adapter.run(Adapter.java:18)
       at java.lang.Thread.run(Thread.java:534)
Caused by: java.sql.SQLException: Error trying to fetch row:null
       at com.mysql.jdbc.RowDataDynamic.nextRecord(RowDataDynamic.java:280)
       at com.mysql.jdbc.RowDataDynamic.next(RowDataDynamic.java:244)
       at com.mysql.jdbc.ResultSet.next(ResultSet.java:2432)
       at no.fast.connector.jdbc.JDBCAdapter.crawl(JDBCAdapter.java:131) 

I get this error at seemingly random places. It can take 5 millon rows and it can take 200.000 rows. Probably less as well.

Not sure I understand the final comment, is this problem tracked down as problem in the application?

Not sure that I understand the final comment.

The problem is that the program was locking certain tables, causing the network connection to 'timeout' at the _network_ level before MySQL could unlock the tables and send the rows.

And the plot thickens.

I added a try/catch clause in MysqlIO.nextRow which printed the stack trace.

Now I got the following exception:
java.io.EOFException
       at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:1082)
       at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:1343)
       at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:712)
       at com.mysql.jdbc.RowDataDynamic.nextRecord(RowDataDynamic.java:271)
       at com.mysql.jdbc.RowDataDynamic.next(RowDataDynamic.java:244)
       at com.mysql.jdbc.ResultSet.next(ResultSet.java:2432)
       at no.fast.connector.jdbc.JDBCAdapter.crawl(JDBCAdapter.java:131)
       at no.fast.connector.framework.Adapter.run(Adapter.java:18)
       at java.lang.Thread.run(Thread.java:536) 

I've now added some debug printout inside of readFull to see what arguments that method is called with.

Will update once I know more about what happens inside there (can take a few hours to trigger the bug so...)

In my case, I'm extracting a lot of data for synchronizing with our search engine.

There is constant network traffic, and I cannot really see any reason that there should be a timeout there.

What I do see however, is that I don't have any problems if I just extract data from MySQL.

However, if I feed data with xmlrpc to our search engine on the other side, I do loose connection at times.

Indeed, it might seems like this often occurs at the same time as small delays in the traffic, but I never see more than 2-3 second periods without traffic (I've been looking at it with tcpdump).

I've tried to do this using IBM's javavm, and everything seems fine using it.
I've not been able to reproduce the problem there even after 10's of millions of records extracted from mysql and sent to the search engine.

Using the Sun javavm, I've never managed more than 4.5 million records.

I would suspect some threding issue in the Sun Javavm rather than a just a timeout issue.

But yes, I cannot find anything wrong in the java connector or on the mysql server side.

Its pretty clear from the tcpdumps that it is the javavm that is closing the connection however (and this is a close, it sends a FIN to the server), and this can happen in the middle of a packet stream.

The problem that I experienced was basically a deadlock situation.  I was streaming a massive result set and processing each record one at a time.  The result set was a join of a lot of different tables, and unwittingly, I was trying to update one of those tables during the processing of certain rows.  That table was locked because of the streaming result set, and when the update statement would come in, it would have to block waiting for the lock.  End result: deadlock.  After a certain amount of inactivity on the part of the streaming result set, it would eventually have its locked dissolved so that the update statement could execute (it appeared to the server as though the streaming select statement had hung because it didn't request any data for over two minutes). Then when the code returned to the top of the loop and tried to get the next streaming result set, everything blew up because it's locks on the tables had been screwed up.

My fix was to leave the offending table out of the join clause in the streaming select statement.  Instead, I dynamically grab the appropriate data from that table for each row that is processed from the streaming statement.  This also fixes the potential dirty reads that could have occurred as a side effect of updating one of the tables that I was streaming my results from.