Bug #18148 setFetchSize() does not adhere to JDBC interface contract
Submitted: 11 Mar 2006 3:35 Modified: 11 Mar 2006 4:42
Reporter: Jeramyn Feucht Email Updates:
Status: Not a Bug Impact on me:
None 
Category:Connector / J Severity:S1 (Critical)
Version:Tested in 3.1,3.2,5.0 OS:Windows (Windows XP (Irrelevant))
Assigned to: CPU Architecture:Any

[11 Mar 2006 3:35] Jeramyn Feucht
Description:
When reading large result sets, the JDBC driver reads the entire result set into memory instead of streaming the results, even when the fetch size is set to a reasonable number like 100.  It streams properly only if the fetch size is set to Integer.MIN_VALUE.  Since we have a database-independent persistence layer in our application, we cannot change our setFetchSize() implementation to use Integer.MIN_VALUE.  This bug prevents us from using MySQL for installations with our customers.
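
For reference, a minimal sketch of the two call patterns in question (the TestCLOB table matches the test case below; connection handling is illustrative): the portable JDBC hint that our persistence layer issues, and the MySQL-specific workaround documented for Connector/J, which requires a forward-only, read-only statement and a fetch size of Integer.MIN_VALUE.

import java.sql.*;

class FetchSizeSketch {

    // Portable JDBC usage: setFetchSize() is a batching hint the driver should honor.
    static void portableHint(Connection c) throws SQLException {
        Statement s = c.createStatement();
        s.setFetchSize(100); // hint: fetch roughly 100 rows at a time
        ResultSet rs = s.executeQuery("select data from TestCLOB");
        while (rs.next()) {
            rs.getString(1);
        }
        rs.close();
        s.close();
    }

    // MySQL-specific workaround: Connector/J streams row by row only when the
    // statement is TYPE_FORWARD_ONLY / CONCUR_READ_ONLY and the fetch size is
    // Integer.MIN_VALUE.
    static void mysqlStreaming(Connection c) throws SQLException {
        Statement s = c.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                ResultSet.CONCUR_READ_ONLY);
        s.setFetchSize(Integer.MIN_VALUE);
        ResultSet rs = s.executeQuery("select data from TestCLOB");
        while (rs.next()) {
            rs.getString(1);
        }
        rs.close();
        s.close();
    }
}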

How to repeat:
Execute the following code with the JVM parameter -Xmx16m.  It can also be reproduced without the JVM parameter, but that requires a larger dataset.  Change the comments to see the correct behavior when the fetch size is set to Integer.MIN_VALUE.

package test;

import java.io.*;
import java.sql.*;
import java.util.Arrays;

/**
 * Test for the bug where Connector/J reads an entire result set into memory when
 * setFetchSize() is given a positive value, streaming only with Integer.MIN_VALUE.
 */
public class TestMySqlStreamBug {

    Connection c = null;
    {
        // modify as appropriate
        Class.forName("com.mysql.jdbc.Driver");
        c = DriverManager.getConnection("jdbc:mysql://localhost:3306/empire", "empire", "empire");
        c.setAutoCommit(false);
    }

    public TestMySqlStreamBug() throws SQLException, ClassNotFoundException {
    }

    public void testIt() throws Exception {
        Statement s = c.createStatement();
        try {
            s.execute("drop table TestCLOB");
        } catch (Exception e) {
            // may not exist
        }
        s.execute("create table TestCLOB (RecordId varchar(32) PRIMARY KEY,data LONGTEXT)");
        runTest();

        s.close();
        c.close();
    }

    private void runTest() throws SQLException, IOException {
        int test = 0;
        Statement s1 = c.createStatement();
        // uncomment to exhibit the correct (streaming) behavior
//        s1.setFetchSize(Integer.MIN_VALUE);

        // as written, this exhibits the bug (OutOfMemoryError with -Xmx16m)
        s1.setFetchSize(100);
        
        PreparedStatement p2 = c.prepareStatement("insert into TestCLOB (RecordId,data) values (?,?)");
        char[] chrs = new char[100000];
        Arrays.fill(chrs, 'a');
        String str1 = new String(chrs);
        while (test < 500) {
            String id = "test" + test;
            id = id.intern();
            p2.setCharacterStream(2, new StringReader(str1), str1.length());
            p2.setString(1, id);
            p2.execute();
            if(test%100==0) {
                System.out.println(test + " rows inserted.");
                c.commit();
            }
            test++;
        }
        System.out.println("before execute query");
        ResultSet result = s1.executeQuery("select data from TestCLOB");
        System.out.println("query completed");
        while(result.next()) {
            result.getString(1);
        }
        result.close();
        s1.close();
        System.out.println("Tested...");
    }

    private void dumpInfo() {
        try {
            DatabaseMetaData meta = c.getMetaData();

            // gets driver info:
            System.out.println("\n=============\nDatabase Product Name is ... " + meta.getDatabaseProductName());
            System.out.println("\nDatabase Product Version is " + meta.getDatabaseProductVersion());
            System.out.println("\n=============\nJDBC Driver Name is ........ " + meta.getDriverName());
            System.out.println("\nJDBC Driver Version is ..... " + meta.getDriverVersion());
            System.out.println("\nJDBC URL " + meta.getURL());
            System.out.println("\n=============");

            // get environment info:
            java.util.Properties props = System.getProperties();
            java.util.Enumeration propNames = props.propertyNames();

            StringBuffer path = new StringBuffer("\t" + props.getProperty("java.library.path"));
            int idx = 0;

            while (idx < path.length()) {
                idx = path.toString().indexOf(";", idx);

                if (idx == -1)
                    break;
                path.replace(idx, ++idx, "\n\t");

                idx++;
            }

            StringBuffer classpath = new StringBuffer("\t" + props.getProperty("java.class.path"));
            int idx2 = 0;

            while (idx2 < classpath.length()) {
                idx2 = classpath.toString().indexOf(";", idx2);

                if (idx2 == -1)
                    break;
                classpath.replace(idx2, ++idx2, "\n\t");

                idx2++;
            }

            System.out.println("java runtime Version: \t" + props.getProperty("java.runtime.version") + "\n");
            System.out.println("\n=============");
            System.out.println("java vm Version: \t" + props.getProperty("java.vm.version") + "\n");
            System.out.println("\n=============");
            System.out.println("java vm name: \t" + props.getProperty("java.vm.name") + "\n");
            System.out.println("\n=============");
            System.out.println("java library path: \n" + path + "\n");
            System.out.println("\n=============");
            System.out.println("java classpath: " + classpath + "\n");

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws Exception {
        TestMySqlStreamBug bug = new TestMySqlStreamBug();
        bug.dumpInfo();
        bug.testIt();
    }
}

Suggested fix:
Change the implementation of the Statement so that it adheres to the fetch size, to improve streaming performance and prevent OutOfMemoryError.

The simplest workaround would be to change the implementation of createStreamingResultSet() so that the default behavior is to stream the results.
[11 Mar 2006 3:36] Jeramyn Feucht
One note: this is related to bug #15321, but I could not change the status of that bug to open.
[11 Mar 2006 4:42] Mark Matthews
Before version 5.0, the MySQL server doesn't even make this possible. It is not a limitation of the JDBC driver; there simply isn't a way to make it work before MySQL-5.0 because of limitations in the network protocol and the lack of support for cursors over result sets.

Streaming result sets only allow one open result set at a time on a given connection, which also doesn't adhere to the JDBC API (but which is also a trait shared with other vendors, some of whom open whole new connections for each new statement, and thus aren't capable of running transactions in this mode). That's why it's not the default for MySQL. In general, prior to MySQL-5.0, the JDBC driver is set "out of the box" for OLTP-type applications, where result sets are smaller and the aim is to reduce lock contention in the server.
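
As a rough sketch of that limitation (table name taken from the test case; the exact failure message varies by driver version), a second statement issued on the same connection while a streaming result set is still open is expected to fail:

import java.sql.*;

class StreamingLimitationSketch {

    static void demo(Connection c) throws SQLException {
        Statement s1 = c.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                ResultSet.CONCUR_READ_ONLY);
        s1.setFetchSize(Integer.MIN_VALUE);
        ResultSet rs = s1.executeQuery("select data from TestCLOB");
        rs.next(); // start streaming, but do not drain the result set

        // Expected to throw an SQLException: no other statements may be issued on
        // this connection while the streaming result set above is still in use.
        Statement s2 = c.createStatement();
        s2.executeQuery("select count(*) from TestCLOB");
    }
}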

There is experimental support for fetching rows in batches (see the documentation for more details) when using Connector/J 5.0.1 (not yet released; see nightly snapshots at http://downloads.mysql.com/snapshots.php#connectorj).

When using Connector/J 5.0.1 along with more recent builds of the MySQL server, you can add "useCursorFetch=true" to your JDBC URL parameters, and the driver will fetch rows in batches of the size given to setFetchSize(), as defined in the JDBC API.
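
A minimal sketch of that configuration, assuming Connector/J 5.0.1+ against a MySQL 5.0 server (the URL and credentials reuse the placeholder values from the test case above):

import java.sql.*;

class CursorFetchSketch {

    public static void main(String[] args) throws Exception {
        // useCursorFetch=true asks the driver to use a server-side cursor so that
        // setFetchSize() is honored; requires Connector/J 5.0.1+ and MySQL 5.0+.
        Connection c = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/empire?useCursorFetch=true",
                "empire", "empire");

        PreparedStatement p = c.prepareStatement("select data from TestCLOB");
        p.setFetchSize(100); // rows are now fetched from the server in batches of 100
        ResultSet rs = p.executeQuery();
        while (rs.next()) {
            rs.getString(1);
        }
        rs.close();
        p.close();
        c.close();
    }
}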

One could also argue that the behavior _does_ follow the JDBC API, quoting from the APIDOCS for Statement.setFetchSize():

"Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed. The number of rows specified affects only result sets created using this statement. If the value specified is zero, then the hint is ignored. The default value is zero."

It's only a _hint_. Driver vendors are allowed to ignore hints. The wording is there precisely because there are quite a few vendors who cannot adhere to this "contract" in all situations.

I'm marking this as "Not a bug", because the functionality is there in C/J 5.0 when using the MySQL-5.0 server. If you find the functionality lacking, please reopen this bug with more details.
[27 Nov 2006 3:06] Chris Wilper
I just wanted to point out that the JDBC Statement.setFetchSize contract also states:

Throws: SQLException - if a database access error occurs, or the condition 0 <= rows <= this.getMaxRows() is not satisfied.

So the Integer.MIN_VALUE (a negative number) workaround is actually not in-spec because it depends on the implementation to violate this expectation.
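
In other words, a database-independent layer has no in-spec way to pass Integer.MIN_VALUE through. A hypothetical portable guard (applyFetchSize() is an illustrative helper, not part of any driver) would have to look something like this, with streaming pushed into a vendor-specific code path:

import java.sql.*;

class PortableFetchHint {

    // Hypothetical helper for a DB-independent persistence layer: only apply the
    // hint when it satisfies the spec condition 0 <= rows <= getMaxRows().
    static void applyFetchSize(Statement s, int rows) throws SQLException {
        int max = s.getMaxRows(); // 0 means "no limit"
        if (rows >= 0 && (max == 0 || rows <= max)) {
            s.setFetchSize(rows);
        }
        // A negative value such as Integer.MIN_VALUE is out of spec and would have
        // to be handled by vendor-specific code instead.
    }
}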
[15 May 2009 15:51] Janarthanan Poornavel
Hi,
I know I am very late to add a comment to this bug. But please understand that you put all of us off by saying that the feature to configure fetchSize is experimental, and then contradicted that by closing the bug on the grounds that it is available in the 5.0 connector. I am using Pentaho PDI (one of your technology partners), which in its latest version does not follow this approach and still uses the streaming technique. When I am moving around large datasets (read: millions of rows), I expect a predictable way to configure the fetch size (leave it to me to figure out how much heap I want for this). Please do not expect me to use a feature that you declared experimental. I think it would at least make sense for you to come back and declare that you no longer consider the feature implementation experimental. I can ask my architect to open a ticket, since we have platinum support, if you think that would help you take this seriously.
[16 May 2009 13:14] Mark Matthews
The feature is experimental on the server, so if you're going to open a support issue, open it on the server. The JDBC driver has full support for it, but the caveats exist on the server side, which is why it's not enabled by default.

It's not enabled by default because many frameworks and applications (mistakenly) call setFetchSize() whether they need to or not.
[16 May 2009 13:17] Mark Matthews
> So the Integer.MIN_VALUE (a negative number) workaround is actually not in-spec
> because it depends on the implementation to violate this expectation.

There are spec-compliant methods on com.mysql.jdbc.Statement to enable/disable streaming results (enableStreamingResults()/disableStreamingResults()). On the other hand, actually using streaming results creates some non-JDBC-compliant behavior, in that you can't have more than one in-progress Statement per connection.
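
A rough sketch of that vendor-specific path (assuming the Statement is not wrapped by a connection pool; otherwise it would need to be unwrapped first):

import java.sql.*;

class VendorStreamingSketch {

    static void streamWithVendorApi(Connection c) throws SQLException {
        Statement s = c.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                ResultSet.CONCUR_READ_ONLY);
        if (s instanceof com.mysql.jdbc.Statement) {
            // Vendor-specific, but avoids passing a negative value to setFetchSize().
            ((com.mysql.jdbc.Statement) s).enableStreamingResults();
        }
        ResultSet rs = s.executeQuery("select data from TestCLOB");
        while (rs.next()) {
            rs.getString(1);
        }
        rs.close();
        s.close();
    }
}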
[18 May 2009 10:44] Janarthanan Poornavel
Mark,
Thanks a lot. If this is going to be an enhancement on the server side, I presume it will be a long-drawn-out process. To me the issue is still not solved; I need to see the JDBC interface and the server working in unison. Do you have any performance numbers that can be shared with me for when the connector handles large datasets in this fashion? Would you be able to suggest an alternative methodology to improve the performance of result streaming?