Bug #18086 Driver fails on non-ASCII platforms
Submitted: 8 Mar 2006 22:00 Modified: 10 Mar 2011 11:25
Reporter: Frank Griffin
Status: Closed
Category:Connector / J Severity:S3 (Non-critical)
Version:3.1.12 OS:Any (IBM z/OS)
Assigned to: Tony Bedford CPU Architecture:Any

[8 Mar 2006 22:00] Frank Griffin
Description:
The JDBC driver fails on platforms whose native encoding is not an ASCII variant.  

While the connection properties "useUnicode=true" and "characterEncoding=Cp1252" cover most of the runtime cases (I chose Cp1252 to match the hardcoded assumption that data parsed from the connection URL is "latin1"), they do *not* cover the database name embedded in the URL or the server error messages that are used as text for SQLExceptions.

Most of this can be handled easily by existing calls, but the code in MysqlIO.java assumes that the platform encoding is identical (or near enough) to "latin1" to make calls with explicit encoding unnecessary.  My fix was to use an explicit "Cp1252" encoding for the database name and for any server error message returned.
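
The idea behind the fix can be sketched outside the driver. The helper below is hypothetical and is not part of Connector/J; it only illustrates encoding a string with a caller-chosen charset and appending the NUL terminator, instead of relying on the platform default. ISO-8859-1 stands in for "Cp1252" here, on the assumption that the name contains only ASCII characters, for which both encodings produce the same bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitEncodingWrite {
    // Hypothetical helper (not a Connector/J method): encode with an explicit
    // charset and append the NUL terminator, independent of the platform default.
    static byte[] nullTerminated(String s, Charset cs) {
        byte[] encoded = s.getBytes(cs);
        ByteArrayOutputStream out = new ByteArrayOutputStream(encoded.length + 1);
        out.write(encoded, 0, encoded.length);
        out.write(0);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] wire = nullTerminated("test", StandardCharsets.ISO_8859_1);
        StringBuilder hex = new StringBuilder();
        for (byte b : wire) {
            hex.append(String.format("%02x ", b & 0xff));
        }
        System.out.println(hex.toString().trim()); // prints "74 65 73 74 00"
    }
}
```

Because the charset is passed explicitly, the result is the same on an EBCDIC platform as on an ASCII one.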

How to repeat:
Any use of this driver on a z/OS system with a database name in the connection URL, e.g. "test", will reproduce this.

Suggested fix:
There is probably a much more elegant way to do this, but this patch will at least show you which areas I had to cover to get my applications to work.  All of the changes are in com.mysql.jdbc.MysqlIO.java.

750c750,763
<                 packet.writeString(database);
---
>                 //@@@FTG  packet.writeString() assumes that native
>                 //@@@FTG  platform uses a variant of latin1 (Cp1252).
>                 //@@@FTG  We'll substitute a writeStringNoNull call
>                 //@@@FTG  with an explicit Cp1252 encoding to force
>                 //@@@FTG  conversion.
>                 //@@@FTG  commented: packet.writeString(database);
>                 try
>                  {
>                   packet.writeStringNoNull(database,"Cp1252",null,false);
>                   packet.writeByte((byte)0);
>                  }
>                  catch( Exception  eE )
>                   {}
>                 //@@@FTG
1216,1217c1229,1243
<                     packet.writeString(database);
<                 }
---
>                     //@@@FTG  packet.writeString() assumes that native
>                     //@@@FTG  platform uses a variant of latin1 (Cp1252).
>                     //@@@FTG  We'll substitute a writeStringNoNull call
>                     //@@@FTG  with an explicit Cp1252 encoding to force
>                     //@@@FTG  conversion.
>                     //@@@FTG  commented: packet.writeString(database);
>                     try
>                      {
>                       packet.writeStringNoNull(database,"Cp1252",null,false);
>                       packet.writeByte((byte)0);
>                      }
>                      catch( Exception  eE )
>                       {}
>                     //@@@FTG
>         }
2882c2908,2918
<                 serverErrorMessage = resultPacket.readString();
---
>                 //@@@FTG  Read the message explicitly as Cp1252 in
>                 //@@@FTG  case the platform encoding doesn't match.
>                 //@@@FTG  commented: serverErrorMessage = resultPacket.readString();
>                 serverErrorMessage = "????";
>                 try
>                  {
>                   serverErrorMessage = resultPacket.readString( "Cp1252" );
>                  }
>                  catch( Exception  eE )
>                   {}
>                 //@@@FTG
3470c3506,3519
<             packet.writeString(database);
---
>                 //@@@FTG  packet.writeString() assumes that native
>                 //@@@FTG  platform uses a variant of latin1 (Cp1252).
>                 //@@@FTG  We'll substitute a writeStringNoNull call
>                 //@@@FTG  with an explicit Cp1252 encoding to force
>                 //@@@FTG  conversion.
>                 //@@@FTG  commented: packet.writeString(database);
>                 try
>                  {
>                   packet.writeStringNoNull(database,"Cp1252",null,false);
>                   packet.writeByte((byte)0);
>                  }
>                  catch( Exception  eE )
>                   {}
>                 //@@@FTG
3642c3691,3704
<             packet.writeString(database);
---
>                 //@@@FTG  packet.writeString() assumes that native
>                 //@@@FTG  platform uses a variant of latin1 (Cp1252).
>                 //@@@FTG  We'll substitute a writeStringNoNull call
>                 //@@@FTG  with an explicit Cp1252 encoding to force
>                 //@@@FTG  conversion.
>                 //@@@FTG  commented: packet.writeString(database);
>                 try
>                  {
>                   packet.writeStringNoNull(database,"Cp1252",null,false);
>                   packet.writeByte((byte)0);
>                  }
>                  catch( Exception  eE )
>                   {}
>                 //@@@FTG
4267c4329,4342
<                 packet.writeString(database);
---
>                 //@@@FTG  packet.writeString() assumes that native
>                 //@@@FTG  platform uses a variant of latin1 (Cp1252).
>                 //@@@FTG  We'll substitute a writeStringNoNull call
>                 //@@@FTG  with an explicit Cp1252 encoding to force
>                 //@@@FTG  conversion.
>                 //@@@FTG  commented: packet.writeString(database);
>                 try
>                  {
>                   packet.writeStringNoNull(database,"Cp1252",null,false);
>                   packet.writeByte((byte)0);
>                  }
>                  catch( Exception  eE )
>                   {}
>                 //@@@FTG
[8 Mar 2006 22:24] Mark Matthews
Are you fiddling with the JVM default encoding? If not, then it sounds like your JVM is broken, because String.getBytes() doesn't return ASCII, it returns the string's bytes as encoded in the platform default encoding:

http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#getBytes()

See http://java.sun.com/j2se/corejava/intl/reference/faqs/index.html for information on how the JVM chooses which encoding to use by default.

If indeed your JVM is using the correct character set, then please get back to us, and we'll debug further.
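
A quick way to answer the question about the JVM default encoding is to print what the JVM reports. This is a minimal sketch; the class and method names are illustrative, and the output depends entirely on the platform and JVM settings.

```java
import java.nio.charset.Charset;

public class DefaultEncodingCheck {
    // Reports the charset that String.getBytes() and new String(byte[])
    // use when no charset argument is given.
    static String describeDefaults() {
        return "default charset=" + Charset.defaultCharset().name()
             + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describeDefaults());
    }
}
```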
[8 Mar 2006 23:02] Frank Griffin
>Are you fiddling with the JVM default encoding? 
Nope.

>If not, then it sounds
>like your JVM is broken, because String.getBytes() doesn't return
>ASCII, it returns the string's bytes as encoded in the platform default
>encoding:

That's the problem.  The database name is extracted from the URL into a String, and ByteArrayBuffer.writeString( string ) uses getBytes() to get the bytes it writes into the packet for the server.  Those bytes are in the platform default encoding, which for z/OS is Cp1047 (an EBCDIC superset).  However, the server is expecting the bytes to be "latin1" (8859-1 or Cp1252), which is hard-coded elsewhere in MysqlIO by writing an '08' byte into the handshake packet, so what the server sees is garbage.

Upon return of an error, readString() does the same thing, using the String constructor which assumes a byte array in the platform default encoding and converts to Unicode.  The problem is that the byte array sent by the server is Cp1252, not Cp1047.  By a happy coincidence, when the JVM reconverts the resulting String to Cp1047 for display on System.out, its Unicode -> Cp1047 conversion undoes the readString() transform and you end up with the original Cp1252 bytes in the output.  Of course, these display as garbage on z/OS (which thinks they're Cp1047), but at least the hex is correct ASCII.
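
Both effects described above can be reproduced on any platform by naming the charsets explicitly. The sketch below assumes the JVM ships the IBM1047 (Cp1047) charset, which is not guaranteed, so it checks first. The class and method names are illustrative, and ISO-8859-1 stands in for the server's "latin1".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EbcdicMismatchDemo {
    // What the server reads if the client encodes with clientCs but the
    // server decodes the same bytes as Latin-1.
    static String asSeenByServer(String name, Charset clientCs) {
        return new String(name.getBytes(clientCs), StandardCharsets.ISO_8859_1);
    }

    // Decode server bytes with the (wrong) client charset, then re-encode with
    // that same charset, as happens when the message is printed on the client.
    static byte[] decodeThenReencode(byte[] fromServer, Charset clientCs) {
        return new String(fromServer, clientCs).getBytes(clientCs);
    }

    public static void main(String[] args) {
        if (!Charset.isSupported("IBM1047")) {
            System.out.println("IBM1047 is not available in this JVM");
            return;
        }
        Charset ebcdic = Charset.forName("IBM1047");
        System.out.println("server sees \"test\" as: " + asSeenByServer("test", ebcdic));

        byte[] msg = "Access denied".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("round trip intact: "
                + Arrays.equals(msg, decodeThenReencode(msg, ebcdic)));
    }
}
```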
[16 Jun 2006 17:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7772
[17 Oct 2007 12:16] Frank Griffin
Mark, I just got back to trying this again.  I assume the driver version I have (5.0.7) has your patch, because I'm not seeing the symptoms I was before.

However, I'm getting password verification failures.  I've defined user XYZ with password ABC on the server (4.0.18), and specified Host values of localhost, 127.0.0.1, actual server hostname, and %.

If I connect from the mysql client running on another linux system, it works fine.  But connecting through the connector from z/OS gives:

071017  8:12:27       2 Connect     Access denied for user: 'XYZ@192.168.100.73' (Using password: YES)

Can you please check to see if your patch ships the password properly for non-ASCII systems?  Apparently the userid is going over OK, because it shows up properly in both the error message on z/OS and the mysql log file on linux, but I suspect that the password is getting encrypted for transmission from the EBCDIC and when the server decrypts it, it expects ASCII.

Thanks.
[8 Jan 2008 6:20] Mark Matthews
Frank,

If possible, could you try with 5.1? It's set up to use UTF-8 as the character set for authentication for MySQL-4.1 and newer, which should clear up this issue for you on EBCDIC platforms.
[9 Feb 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[12 Feb 2008 19:28] Frank Griffin
Mark, I'm not sure from your comment whether this is expected to work or not, but I tested with the 5.1 JDBC Driver against a server running 5.0.45.

The failure now is an SQLSTATE 08S01 claiming the connection drops.  If I monitor port 3306 on the server with a sniffer, Wireshark (I assume it's wireshark doing this since this packet is from the driver) is claiming that the Login Request packet is malformed, although it displays the correct userid.

The hex of the packet is:

0000  00 11 11 75 24 1a 00 a0  8e 01 96 15 08 00 45 00   ...u$... ......E.
0010  00 54 80 f8 00 00 3d 06  ac 6d ac 1d 7f 54 c0 a8   .T....=. .m...T..
0020  64 24 08 af 0c ea 41 83  f5 3a e5 4e c6 cb 80 18   d$....A. .:.N....
0030  7f c8 2b 6d 00 00 01 01  08 0a c2 c0 9b 57 03 0c   ..+m.... .....W..
0040  b6 52 1c 00 00 01 8f 80  ff 02 fd 50 55 42 43 56   .R...... ...PUBCV
0050  53 00 78 78 78 78 78 78  78 78 00 43 56 53 53 51   S.xxxxxx xx.CVSSQ
0060  4c 00                                              L.               

As you can see, the password is given as 8 'x' bytes, but the actual password is 9 bytes long and alphanumeric.
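
The MySQL portion of the capture above can be decoded by hand. The sketch below assumes the payload follows the pre-4.1 client handshake layout (a 3-byte little-endian length and 1-byte sequence number, then 2 bytes of flags, 3 bytes of maximum packet size, and NUL-terminated strings). That layout is an assumption, not something stated in this report, although the field sizes do add up to the declared length. The byte array is copied from offset 0x42 of the dump; the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

public class LoginPacketDump {
    // MySQL bytes copied from the capture (everything after the TCP header).
    static final int[] PACKET = {
        0x1c, 0x00, 0x00, 0x01,                               // length (LE), sequence
        0x8f, 0x80, 0xff, 0x02, 0xfd,                         // flags, max packet size
        0x50, 0x55, 0x42, 0x43, 0x56, 0x53, 0x00,             // first string
        0x78, 0x78, 0x78, 0x78, 0x78, 0x78, 0x78, 0x78, 0x00, // second string
        0x43, 0x56, 0x53, 0x53, 0x51, 0x4c, 0x00              // third string
    };

    static int payloadLength() {
        return PACKET[0] | (PACKET[1] << 8) | (PACKET[2] << 16);
    }

    static int sequence() {
        return PACKET[3];
    }

    // Splits everything after the assumed 5-byte fixed prefix on NUL bytes.
    static List<String> nulTerminatedFields() {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int pos = 4 + 5; pos < PACKET.length; pos++) {
            if (PACKET[pos] == 0) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append((char) PACKET[pos]);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // prints "length=28 seq=1 fields=[PUBCVS, xxxxxxxx, CVSSQL]"
        System.out.println("length=" + payloadLength() + " seq=" + sequence()
                + " fields=" + nulTerminatedFields());
    }
}
```

The user and database names arrive as readable ASCII, which matches the observation that only the password field looks wrong.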

MySQL responds with a Response Error 1043 (Bad Handshake):

0000  00 a0 8e 01 96 15 00 11  11 75 24 1a 08 00 45 08   ........ .u$...E.
0010  00 48 70 e2 40 00 40 06  79 87 c0 a8 64 24 ac 1d   .Hp.@.@. y...d$..
0020  7f 54 0c ea 08 af e5 4e  c6 cb 41 83 f5 5a 80 18   .T.....N ..A..Z..
0030  00 2e 50 79 00 00 01 01  08 0a 03 0c b6 ae c2 c0   ..Py.... ........
0040  9b 57 10 00 00 02 ff 13  04 42 61 64 20 68 61 6e   .W...... .Bad han
0050  64 73 68 61 6b 65                                  dshake
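
The server's reply can be decoded the same way. This sketch assumes the error packet layout visible in the dump: a 4-byte header, an 0xff marker, a 2-byte little-endian error code, and the message text with no SQLSTATE field. The message is decoded with an explicit charset rather than the platform default, which is the point of this report. The class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;

public class ErrorPacketDump {
    // MySQL bytes copied from the reply capture (offset 0x42 onward).
    static final byte[] PACKET = {
        0x10, 0x00, 0x00, 0x02,                   // length (LE), sequence
        (byte) 0xff,                              // error marker
        0x13, 0x04,                               // error code (LE)
        0x42, 0x61, 0x64, 0x20, 0x68, 0x61, 0x6e, // message bytes
        0x64, 0x73, 0x68, 0x61, 0x6b, 0x65
    };

    static int errorCode() {
        return (PACKET[5] & 0xff) | ((PACKET[6] & 0xff) << 8);
    }

    // Decodes the message as Latin-1 regardless of the platform default.
    static String message() {
        return new String(PACKET, 7, PACKET.length - 7, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(errorCode() + ": " + message()); // prints "1043: Bad handshake"
    }
}
```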
[26 Apr 2010 18:34] Yonggang Hu
By using connector.5.12 and passing the following parameters to the connector

passwordCharacterEncoding=ASCII&characterEncoding=ASCII

the connector works well on z/OS machines.

It looks like the bug has been fixed.
[7 Feb 2011 6:40] Tonci Grgin
Thank you guys.

Closing the report and assigning to Tony to check whether the manual states which connection options to use on non-ASCII platforms.
[10 Mar 2011 11:25] Tony Bedford
An entry has been added to the 5.1.15 changelog: 

        Using Connector/J to connect from a z/OS machine to a MySQL 
        Server failed when the database name to connect to was included 
        in the connection URL. This was because the name was sent in 
        z/OS default platform encoding, but the MySQL Server expected 
        Latin1. 

        It should be noted that when connecting from systems that do not 
        use Latin1 as the default platform encoding, the following 
        connection string options can be useful: passwordCharacterEncoding=ASCII
        and characterEncoding=ASCII.
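
As a minimal sketch, the two options can be appended to a connection URL as shown below. The host, port, and database name are placeholders, and the helper is illustrative rather than part of Connector/J. The sketch only builds the URL; it does not open a connection.

```java
public class ZosConnectionOptions {
    // Builds a Connector/J URL carrying the two options named in the changelog.
    // Host, port, and database are placeholders supplied by the caller.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
             + "?passwordCharacterEncoding=ASCII&characterEncoding=ASCII";
    }

    public static void main(String[] args) {
        String url = buildUrl("dbhost.example.com", 3306, "test");
        System.out.println(url);
        // With Connector/J on the classpath, this URL would be passed to
        // java.sql.DriverManager.getConnection(url, user, password).
    }
}
```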