Bug #65066	OutOfMemoryError
Submitted: 22 Apr 2012 20:26	Modified: 26 Apr 2012 6:22
Reporter: Bernard Zheng	Status: Closed
Category: Connector / J	Severity: S5 (Performance)
Version: 5.5.22	OS: Windows
CPU Architecture: Any

[22 Apr 2012 20:26] Bernard Zheng
Description:
The sequence:

    float[][] tfidf = new float[N][M];   // M = 8000, N = 6000
    // some stuff here...
    for (int i = 0; i < N; i++) {
        y = 0;
        table_index = i / 400;

        sql = "SELECT `" + i + "` FROM `modified" + table_index + "`";
        pst = con.prepareStatement(sql);
        rs = pst.executeQuery();

        while (rs.next()) {
            tfidf[i][y] = rs.getFloat(1);
            y++;
        }
    }

I have an 8000 * 6000 matrix stored in 15 tables. Each table contains 400 columns. But the sequence above leads to an OutOfMemoryError. How can I handle data of this size?

How to repeat:
The sequence:

    float[][] tfidf = new float[N][M];   // M = 8000, N = 6000
    // some stuff here...
    for (int i = 0; i < N; i++) {
        y = 0;
        table_index = i / 400;

        sql = "SELECT `" + i + "` FROM `modified" + table_index + "`";
        pst = con.prepareStatement(sql);
        rs = pst.executeQuery();

        while (rs.next()) {
            tfidf[i][y] = rs.getFloat(1);
            y++;
        }
    }
[22 Apr 2012 23:19] Bernard Zheng
I think I just solved this problem by adding an rs.close() call at the end of every loop iteration. The performance is slow, but it is working now.
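The fix can be sketched as follows, using try-with-resources (Java 7+) so the statement and result set are closed every iteration even on error; `con`, `N`, `M`, and the `modified*` table layout are taken from the report, and `ColumnLoader`/`columnQuery` are illustrative names only:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ColumnLoader {
    static final int N = 6000;  // columns overall (15 tables * 400)
    static final int M = 8000;  // rows per column

    // Builds the per-column query; each `modifiedX` table holds 400 columns.
    static String columnQuery(int i) {
        int tableIndex = i / 400;
        return "SELECT `" + i + "` FROM `modified" + tableIndex + "`";
    }

    // Loads the matrix, closing the statement and result set on every
    // iteration so the driver can release the buffered rows before the
    // next query is issued.
    static float[][] load(Connection con) throws SQLException {
        float[][] tfidf = new float[N][M];
        for (int i = 0; i < N; i++) {
            try (PreparedStatement pst = con.prepareStatement(columnQuery(i));
                 ResultSet rs = pst.executeQuery()) {
                int y = 0;
                while (rs.next()) {
                    tfidf[i][y++] = rs.getFloat(1);
                }
            }
        }
        return tfidf;
    }
}
```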
[25 Apr 2012 7:03] Valeriy Kravchuk
Please specify the exact server version used, x.y.z, and provide the exact connection string used.

See also http://dev.mysql.com/doc/refman/5.5/en/connector-j-reference-implementation-notes.html:

"By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate, and due to the design of the MySQL network protocol is easier to implement. If you are working with ResultSets that have a large number of rows or large values, and cannot allocate heap space in your JVM for the memory required, you can tell the driver to stream the results back one row at a time.

To enable this functionality, create a Statement instance in the following manner:

stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
              java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);

The combination of a forward-only, read-only result set, with a fetch size of Integer.MIN_VALUE serves as a signal to the driver to stream result sets row-by-row. After this, any result sets created with the statement will be retrieved row-by-row.

There are some caveats with this approach. You must read all of the rows in the result set (or close it) before you can issue any other queries on the connection, or an exception will be thrown."
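A minimal sketch of the streaming setup described in the quote above; the connection URL and credentials are placeholders, and the query reuses the reporter's `modified0` table name for illustration:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class StreamingExample {
    public static void main(String[] args) throws SQLException {
        // Placeholder URL and credentials -- substitute your own.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://localhost:3306/test", "user", "password");
             Statement stmt = conn.createStatement(
                 ResultSet.TYPE_FORWARD_ONLY,
                 ResultSet.CONCUR_READ_ONLY)) {
            // Integer.MIN_VALUE signals Connector/J to stream rows one at a
            // time instead of buffering the whole result set in memory.
            stmt.setFetchSize(Integer.MIN_VALUE);
            try (ResultSet rs = stmt.executeQuery(
                     "SELECT `0` FROM `modified0`")) {
                while (rs.next()) {
                    float v = rs.getFloat(1);
                    // process v; all rows must be read (or the result set
                    // closed) before issuing another query on this connection
                }
            }
        }
    }
}
```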

I think that closing the result set explicitly after every execution is the way to go in any case...
[26 Apr 2012 4:29] Bernard Zheng
Thanks a lot!
[26 Apr 2012 5:59] Valeriy Kravchuk
So, is your problem solved, or do you still see reasons to claim there is some bug in Connector/J in this case?
[26 Apr 2012 6:22] Bernard Zheng
Yes, what you said is very helpful. For now I still just close it in every loop iteration; the results look fine. Thank you.