Bug #10491 Server returns data as charset binary SHOW CREATE TABLE or SELECT FROM I_S
Submitted: 9 May 2005 19:19 Modified: 31 Jul 2007 18:41
Reporter: Tetsuro Ikeda Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Charsets Severity:S2 (Serious)
Version:4.1 and up OS:Any
Assigned to: Alexander Nozdrin
Tags: bfsm_2007_04_05, bfsm_2007_06_21

[9 May 2005 19:19] Tetsuro Ikeda
Description:
hi!

When I was writing a test code which use Japanese character as table name,
and When I was using "SHOW CREATE TABLE..." statement for
checking whether the table name is correctly stored or not, 
I got the broken String from Connector/J.

At first, I guessed this is a connector/J's bug, but it isn't.

I searched what is the problem, and finally I found the MySQL
server returned the ResultSet'data as charset "binary".

Charset "binary" from MySQL is mapped to charset "US-ASCII" for Java
in the Connector/J

At first, table name is sent to MySQL server by Connector/J
as Japanese charsets (I tried UTF-8, EUC_JP, Shift_JIS ...),
but MySQL server returns to Connector/J as binary(US-ASCII).

Yeah, returned table name is broken because the string is converted
to US-ASCII which do not have Japanese letters.

So, I used "information_schema" database instead of using "SHOW
CREATE TABLE..." in the current "ConnectorJapaneseTest.java" because if
ResultSet's data from information_schema is returned as charset utf8,
and  utf8 is converted to UTF-8 by Connector/J that doesn't break table
name string.

By adding System.out.println() to Field.java's constructor and
MysqlIO.java's unpackField method, I noticed this.
Server returns the charset of data by using charset number "63".
This number is used in CharsetMapping's INDEX_TO_CHARSET array.
63 means charset "binary".

And also number 63 or 33 can be found in $MYSQLSERVER/share/charsets/index.xml

MySQL manual says any metadata is returned as charset utf8.
<http://dev.mysql.com/doc/mysql/en/charset-metadata.html>

So, I think this may be server bug...

But difficulty is that I have no way to explain that server returns data
as charset binary, except using System.out.println on Field.java and
MysqlIO.java

How to repeat:
Check the charset number from Server when you use "SHOW CREATE TABLE..." by anything you can. I used Connector/J and checked the custom debug log (System.out.println if you use Java) on Field.java's constructor or MysqlIO.java's unpackField method.

The below is my application test code:

String url = "jdbc:mysql://localhost/test";
Properties props = new Properties();
props.put("user", "");
props.put("password", "");
props.put("useUnicode", "true");
props.put("characterEncoding", "Shift_JIS");

Class.forName("com.mysql.jdbc.Driver").newInstance();
Connection conn = DriverManager.getConnection(url, props);
Statement stmt = conn.createStatement();
stmt.executeUpdate("CREATE TABLE t1 (c1 int)");
ResultSet rs = stmt.executeQuery("SHOW CREATE TABLE t1");
rs.next();
String returnedTableName = rs.getString("Table");

The below is my custom debug code in MysqlIO.java unpackField method:
[Connector/J ver3.1.8 nightlybuild, line592]

short charSetNumber = (short) packet.readInt();
System.out.println("unpackField:charSetNumber is :" + charSetNumber);

The below is debug message from debug code:

unpackField:charSetNumber is :63

Suggested fix:
The meta data such as result set of "SHOW CREATE TABLE.." should be returned not as binary but as utf8.
[9 May 2005 20:33] Tetsuro Ikeda
The result data from server is actually utf8 format, but field's meta data of charsest is set to binary. So, Connector/J convert the utf8 data to binary(US-ASCII) format, and table name string become broken.
[25 Jun 2005 6:22] Vasily Kishkin
Could you please explane me what "packet" is in your test code ? Where is "packet" defined ? What is type ?
[26 Jul 2005 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[15 Mar 2007 15:35] Tonci Grgin
Hi Tetsuro and thanks for your report. 

Verified as described on 5.0.38BK on WinXP Pro SP2 localhost.
[20 Mar 2007 9:48] Tonci Grgin
This problem represents grave issue for connectors. They can't work around it, at least not reliably. Ad hoc user queries like this do not allow connector to distinguish between "SHOW CREATE TABLE", where it should treat BIANRY as UTF8, and "SELECT varbinary_col FROM some_table", where it really should be binary... I'm requesting escalation.
[27 Mar 2007 9:30] Tonci Grgin
Bug#27138 was marked as duplicate of this one.
[3 Apr 2007 13:13] Tonci Grgin
Bug#27409 was marked as duplicate of this one.
[3 Apr 2007 23:27] Jared S
Tonci,

Between the 3 other bugs you have marked duplicate to this bug, could you please post 'lowest common denominator' test case if possible for 5.0.38.

I would like to better understand the bug at hand, although I have improved my SQL code to avoid BINARY data being returned by NET CONNECTOR say 5.0.7.
[4 Apr 2007 10:23] Tonci Grgin
Setting Bug#27633 as duplicate of this one...
[4 Apr 2007 10:29] Tonci Grgin
Jared, I have done all that I could by verifying this report. It is beyond my duty and knowledge to mess with server code. 
You can see example of data returned by connectors in Bug#27633. Basically connectors have no way of telling wether data should be treated like BINARY or UTF8 (for example). Thus MyODBC in Bug#27633 returned 0xApril for invoice month expecting array of bytes...
[9 May 2007 15:02] Tonci Grgin
Bug#28306 was marked duplicate of this one.
[23 May 2007 13:01] Tonci Grgin
Bug#28420 was marked as duplicate of this one.
[27 Jun 2007 13:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29712

ChangeSet@1.2503, 2007-06-27 17:10:17+04:00, anozdrin@ibm. +3 -0
  Fix for BUG#10491: Server returns data as charset binary
  SHOW CREATE TABLE or SELECT FROM I_S.
  
  Actually, the bug discovers two problems:
    - the original query is not preserved properly. This is the problem
      of BUG#16291;
    - the resultset of SHOW CREATE TABLE statement is binary.
  
  This patch fixes the second problem for the 5.0.
  
  Both problems will be fixed in 5.1.
[28 Jun 2007 9:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29828

ChangeSet@1.2505, 2007-06-28 13:24:52+04:00, anozdrin@ibm. +3 -0
  Fix for BUG#10491: Server returns data as charset binary
  SHOW CREATE TABLE or SELECT FROM I_S.
  
  Actually, the bug discovers two problems:
    - the original query is not preserved properly. This is the problem
      of BUG#16291;
    - the resultset of SHOW CREATE TABLE statement is binary.
  
  This patch fixes the second problem for the 5.0.
  
  Both problems will be fixed in 5.1.
[28 Jun 2007 9:28] Alexander Nozdrin
Pushed into 5.0-runtime.
[29 Jun 2007 13:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29951

ChangeSet@1.2506, 2007-06-29 17:37:17+04:00, anozdrin@ibm. +6 -0
  Follow up to the patch for the BUG#10491.
[2 Jul 2007 15:03] Bugs System
Pushed into 5.0.46
[2 Jul 2007 15:06] Bugs System
Pushed into 5.1.21-beta
[5 Jul 2007 13:23] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented bugfixes in 5.0.46 and 5.1.21 changelogs.
[11 Jul 2007 18:36] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30732

ChangeSet@1.2528, 2007-07-11 22:35:05+04:00, anozdrin@ibm. +6 -0
  Fix for 5.1 for BUG#10491: Server returns data as charset binary
  SHOW CREATE TABLE or SELECT FROM I_S.
  
  This is the last patch for this bug, which depends on the big
  CS patch and was pending.
  
  The problem was that SHOW CREATE statements returned original
  queries in the binary character set. That could cause the query
  to be unreadable.
  
  The fix is to use original character_set_client when sending
  the original query to the client. In order to preserve the query
  in mysqldump, binary character set should be set when issuing SHOW
  CREATE statement.
[11 Jul 2007 19:53] Konstantin Osipov
Approved over email.
[12 Jul 2007 8:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30759

ChangeSet@1.2528, 2007-07-12 12:49:39+04:00, anozdrin@ibm. +6 -0
  Fix for 5.1 for BUG#10491: Server returns data as charset binary
  SHOW CREATE TABLE or SELECT FROM I_S.
  
  This is the last patch for this bug, which depends on the big
  CS patch and was pending.
  
  The problem was that SHOW CREATE statements returned original
  queries in the binary character set. That could cause the query
  to be unreadable.
  
  The fix is to use original character_set_client when sending
  the original query to the client. In order to preserve the query
  in mysqldump, 'binary' character set results should be set when
  issuing SHOW CREATE statement. If either source or destination
  character set is 'binary' , no conversion is performed.
  The idea is that since the source character set is no longer
  'binary', we fix the destination character set to still produce
  valid dumps.
[12 Jul 2007 8:52] Alexander Nozdrin
Pushed into 5.1-runtime.
[17 Jul 2007 15:31] Bugs System
Pushed into 5.1.21-beta
[25 Jul 2007 15:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31549

ChangeSet@1.2481, 2007-07-25 19:46:50+04:00, anozdrin@ibm. +6 -0
  Patch inspired by BUG#10491: Server returns data as charset
  binary SHOW CREATE TABLE or SELECT FROM I_S.
  
  The problem is that mysqldump generates incorrect dump for a table
  with non-ASCII column name if the mysqldump's character set is
  ASCII.
  
  The fix is to:
    1. Switch character_set_client for the mysqldump's connection
    to binary before issuing SHOW CREATE TABLE statement in order
    to avoid conversion.
    
    2. Dump switch character_set_client statements to UTF8 and back
    for CREATE TABLE statement.
[1 Aug 2007 23:26] Bugs System
Pushed into 5.0.48
[1 Aug 2007 23:27] Bugs System
Pushed into 5.1.21-beta
[12 Dec 2008 8:28] Tonci Grgin
Bug#41418 was marked as duplicate of this one.