Bug #12314 utf-8 encoding broken on where conditions in prepared statements
Submitted: 2 Aug 2005 10:53 Modified: 2 Feb 2006 19:40
Reporter: Michael Wyraz Email Updates:
Status: Can't repeat Impact on me:
None 
Category:Connector / J Severity:S2 (Serious)
Version:3.1.10 with mysql 4.1.12 OS:
Assigned to: CPU Architecture:Any

[2 Aug 2005 10:53] Michael Wyraz
Description:
I have a mysql database wher i connect with driver options "?useUnicode=true&characterEncoding=UTF-8". there i store data in unicode format.

The database itself was not createt with explicit character set.
So the dump says:

CREATE TABLE `search_statistics` (
  `word` varchar(255) NOT NULL default '',
  `count` int(11) default NULL,
  PRIMARY KEY  (`word`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

The table contains words with utf-8 encoding (e.g. "test-äöü").

How to repeat:
when i run the prepared statement:
"UPDATE search_statistics set count=count+1 WHERE word=?"
and do st.setString(1,"test-äöü") there's nothing updated (but the table contains this word).

when i then do
"INSERT INTO search_statistics (count,word) VALUES (1,?)"
also with st.setString(1,"test-äöü"), i cet a key violation because the word is already there.
[2 Aug 2005 11:00] Michael Wyraz
when i dump the database, change "CHARSET=latin1" to "utf8" (after each create table statement in the dump) and load the dump to the database, the problem does not occur anymore.
The problem is that the tables are created automatically, so next time i create the tables the problem will occur again.
[2 Aug 2005 13:33] Aleksey Kishkin
why don't you create (say) database with utf8 charset (that all created tables will have utf8 charset)?
[2 Aug 2005 14:21] Michael Wyraz
Ok, this should solve (workaround) the problem.
But this bug still exists.
[2 Aug 2005 17:13] Aleksey Kishkin
made some preliminary tests and was not able to reproduce it so far. 

- Did you test that query in other clients? (say) perl or C api ?

- Did you try to check it without prepare statements?
[2 Sep 2005 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[23 Jan 2006 11:20] Michael Wyraz
I have done some more tests with the latest drivers (3.1.12 and 5.0.0beta) with the same results.

My testcode prints:
normal: true
umlaut: false
umlaut (non prepared): true
umlaut found

Here's my testcode (testet with mysql4):

    public static void main(String[] args) throws Throwable
    {
        Class.forName("com.mysql.jdbc.Driver").newInstance();
        Connection con=DriverManager.getConnection(
                "jdbc:mysql://dbserver1:3307/test?useUnicode=true&characterEncoding=utf8",
                "test","test");

        con.prepareStatement("DROP TABLE IF EXISTS enctest;").execute();
        con.prepareStatement("CREATE TABLE enctest (x varchar(255)) ENGINE=InnoDB DEFAULT CHARSET=latin1").execute();

        String normalValue="test";
        String umlautValue="testäöü";

        PreparedStatement st=con.prepareStatement("INSERT INTO enctest (x) values(?)");
        st.setString(1, normalValue);
        st.executeUpdate();
        st.clearParameters();
        st.setString(1, umlautValue);
        st.executeUpdate();

        st=con.prepareStatement("SELECT * from enctest where x=?");
        st.setString(1, normalValue);
        System.err.println("normal: "+st.executeQuery().next());
        st.clearParameters();
        st.setString(1, umlautValue);
        System.err.println("umlaut: "+st.executeQuery().next());
        st=con.prepareStatement("SELECT * from enctest where x='"+umlautValue+"'");
        System.err.println("umlaut (non prepared): "+st.executeQuery().next());

        st=con.prepareStatement("SELECT * from enctest");
        ResultSet rs=st.executeQuery();
        while (rs.next())
        {
            if (rs.getString(1).equals(umlautValue)) System.err.println("umlaut found");
        }

    }
[2 Feb 2006 19:40] Mark Matthews
Works for me with 3.1.12, 5.0.0 with latest MySQL-4.1 (from nightly builds), I get:

normal: true
umlaut: true
umlaut (non prepared): true
umlaut	found

Can you test with a server more recent than MySQL-4.1.12?