Bug #18086 Driver fails on non-ASCII platforms
Submitted: 8 Mar 2006 23:00 Modified: 12 Feb 2008 20:28
Reporter: Frank Griffin
Status: Open
Category: Connector/J Severity: S3 (Non-critical)
Version: 3.1.12 OS: Any (IBM z/OS)
Assigned to: Mark Matthews Target Version:
Triage: D3 (Medium)

[8 Mar 2006 23:00] Frank Griffin
Description:
The JDBC driver fails on platforms whose native encoding is not an ASCII variant.  

While the connection properties "useUnicode=true" and "characterEncoding=Cp1252" cover most
of the runtime cases (I chose Cp1252 to match the hardcoded assumption that data parsed
from the connection URL is "latin1"), they do *not* cover the case of the database name
embedded in the URL or server error messages which get used as text for SQLExceptions.

Most of this can be handled easily by existing calls, but the code in MysqlIO.java assumes
that the platform encoding is identical (or near enough) to "latin1" to make calls with
explicit encoding unnecessary.  My fix was to use an explicit "Cp1252" encoding for the
database name and for any server error message returned.
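
The idea, stripped of the driver internals, can be sketched as follows. This is only an illustration: ExplicitEncodingSketch and nullTerminated() are made-up names, not Connector/J code, and ISO-8859-1 stands in for Cp1252 (the two agree for plain ASCII names such as "test").

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Connector/J: it only illustrates encoding
// with a named charset instead of relying on the platform default.
class ExplicitEncodingSketch {

    // Encodes s with the given charset and appends the 0x00 terminator.
    static byte[] nullTerminated(String s, Charset cs) {
        byte[] body = s.getBytes(cs);
        ByteArrayOutputStream out = new ByteArrayOutputStream(body.length + 1);
        out.write(body, 0, body.length);
        out.write(0);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] wire = nullTerminated("test", StandardCharsets.ISO_8859_1);
        StringBuilder hex = new StringBuilder();
        for (byte b : wire) {
            hex.append(String.format("%02x ", b & 0xff));
        }
        System.out.println(hex.toString().trim()); // 74 65 73 74 00
    }
}
```

Because the charset is passed in, the bytes are the same whatever the platform default happens to be.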

How to repeat:
Any use of this driver on a z/OS system with a database name in the connection URL, e.g.
"test", will reproduce this.

Suggested fix:
There is probably a much more elegant way to do this, but this patch will at least show
you which areas I had to cover to get my applications to work.  All of the changes are in
com.mysql.jdbc.MysqlIO.java.

750c750,763
<                 packet.writeString(database);
---
>                 //@@@FTG  packet.writeString() assumes that native
>                 //@@@FTG  platform uses a variant of latin1 (Cp1252).
>                 //@@@FTG  We'll substitute a writeStringNoNull call
>                 //@@@FTG  with an explicit Cp1252 encoding to force
>                 //@@@FTG  conversion.
>                 //@@@FTG  commented: packet.writeString(database);
>                 try
>                  {
>                   packet.writeStringNoNull(database,"Cp1252",null,false);
>                   packet.writeByte((byte)0);
>                  }
>                  catch( Exception  eE )
>                   {}
>                 //@@@FTG
1216,1217c1229,1243
<                     packet.writeString(database);
<                 }
---
>                     //@@@FTG  packet.writeString() assumes that native
>                     //@@@FTG  platform uses a variant of latin1 (Cp1252).
>                     //@@@FTG  We'll substitute a writeStringNoNull call
>                     //@@@FTG  with an explicit Cp1252 encoding to force
>                     //@@@FTG  conversion.
>                     //@@@FTG  commented: packet.writeString(database);
>                     try
>                      {
>                       packet.writeStringNoNull(database,"Cp1252",null,false);
>                       packet.writeByte((byte)0);
>                      }
>                      catch( Exception  eE )
>                       {}
>                     //@@@FTG
>         }
2882c2908,2918
<                 serverErrorMessage = resultPacket.readString();
---
>                 //@@@FTG  Read the message explicitly as Cp1252 in
>                 //@@@FTG  case the platform encoding doesn't match.
>                 //@@@FTG  commented: serverErrorMessage = resultPacket.readString();
>                 serverErrorMessage = "????";
>                 try
>                  {
>                   serverErrorMessage = resultPacket.readString( "Cp1252" );
>                  }
>                  catch( Exception  eE )
>                   {}
>                 //@@@FTG
3470c3506,3519
<             packet.writeString(database);
---
>                 //@@@FTG  packet.writeString() assumes that native
>                 //@@@FTG  platform uses a variant of latin1 (Cp1252).
>                 //@@@FTG  We'll substitute a writeStringNoNull call
>                 //@@@FTG  with an explicit Cp1252 encoding to force
>                 //@@@FTG  conversion.
>                 //@@@FTG  commented: packet.writeString(database);
>                 try
>                  {
>                   packet.writeStringNoNull(database,"Cp1252",null,false);
>                   packet.writeByte((byte)0);
>                  }
>                  catch( Exception  eE )
>                   {}
>                 //@@@FTG
3642c3691,3704
<             packet.writeString(database);
---
>                 //@@@FTG  packet.writeString() assumes that native
>                 //@@@FTG  platform uses a variant of latin1 (Cp1252).
>                 //@@@FTG  We'll substitute a writeStringNoNull call
>                 //@@@FTG  with an explicit Cp1252 encoding to force
>                 //@@@FTG  conversion.
>                 //@@@FTG  commented: packet.writeString(database);
>                 try
>                  {
>                   packet.writeStringNoNull(database,"Cp1252",null,false);
>                   packet.writeByte((byte)0);
>                  }
>                  catch( Exception  eE )
>                   {}
>                 //@@@FTG
4267c4329,4342
<                 packet.writeString(database);
---
>                 //@@@FTG  packet.writeString() assumes that native
>                 //@@@FTG  platform uses a variant of latin1 (Cp1252).
>                 //@@@FTG  We'll substitute a writeStringNoNull call
>                 //@@@FTG  with an explicit Cp1252 encoding to force
>                 //@@@FTG  conversion.
>                 //@@@FTG  commented: packet.writeString(database);
>                 try
>                  {
>                   packet.writeStringNoNull(database,"Cp1252",null,false);
>                   packet.writeByte((byte)0);
>                  }
>                  catch( Exception  eE )
>                   {}
>                 //@@@FTG
[8 Mar 2006 23:24] Mark Matthews
Are you fiddling with the JVM default encoding? If not, then it sounds like your JVM is
broken, because String.getBytes() doesn't return ASCII, it returns the string's bytes as
encoded in the platform default encoding:

http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#getBytes()

See http://java.sun.com/j2se/corejava/intl/reference/faqs/index.html for information on
how the JVM chooses which encoding to use by default.

If indeed your JVM is using the correct character set, then please get back to us, and
we'll debug further.
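
A quick way to check what the JVM has picked is something like the sketch below. The class name DefaultCharsetCheck is made up for illustration; Charset.defaultCharset() and the file.encoding system property are standard Java.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Illustrative check only; the class name is hypothetical.
class DefaultCharsetCheck {

    // Reports the encoding-related settings the JVM is running with.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + " defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());

        // The no-argument getBytes() uses the default charset, so these two
        // arrays should be identical on any platform.
        byte[] implicit = "test".getBytes();
        byte[] explicit = "test".getBytes(Charset.defaultCharset());
        System.out.println("same bytes: " + Arrays.equals(implicit, explicit));
    }
}
```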
[9 Mar 2006 0:02] Frank Griffin
>Are you fiddling with the JVM default encoding? 
Nope.

>If not, then it sounds
>like your JVM is broken, because String.getBytes() doesn't return
>ASCII, it returns the string's bytes as encoded in the platform default
>encoding:

That's the problem.  The database name is extracted from the URL into a String, and
ByteArrayBuffer.writeString( string ) uses getBytes() to get the bytes it writes into the
packet for the server.  Those bytes are in the platform default encoding, which for z/OS
is Cp1047 (an EBCDIC superset).  However, the server is expecting the bytes to be "latin1"
(8859-1 or Cp1252), which is hard-coded elsewhere in MysqlIO by writing an '08' byte into
the handshake packet, so what the server sees is garbage.
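
To make the mismatch concrete, here is a small sketch that prints the bytes for "test" in both encodings. It assumes the JDK ships the IBM1047 charset (Java's name for Cp1047), which not every runtime does, so the code checks first. EncodingMismatchDemo is a made-up name.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustration only: compares the bytes produced for the same String.
class EncodingMismatchDemo {

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String database = "test";

        // What the server expects to receive.
        System.out.println("latin1: " + hex(database.getBytes(StandardCharsets.ISO_8859_1))); // 74 65 73 74

        // What a z/OS JVM would produce from the no-argument getBytes().
        if (Charset.isSupported("IBM1047")) {
            System.out.println("Cp1047: " + hex(database.getBytes(Charset.forName("IBM1047")))); // a3 85 a2 a3
        } else {
            System.out.println("IBM1047 charset is not available in this JVM");
        }
    }
}
```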

Upon return of an error, readString() does the same thing, using the String constructor
which assumes a byte array in the platform default encoding and converts to Unicode.  The
problem is that the byte array sent by the server is Cp1252, not Cp1047.  By a happy
coincidence, when the JVM reconverts the resulting String to Cp1047 for display on
System.out, its Unicode -> Cp1047 undoes the readString() transform and you end up with
the original Cp1252 bytes in the output.  Of course, these display as garbage on z/OS
(which thinks they're Cp1047), but at least the hex is correct ASCII.
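
The round trip can be reproduced off z/OS with a sketch like this one, again assuming the JVM provides the IBM1047 charset. RoundTripDemo and roundTrip() are illustrative names, and the message text is just a sample.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration only: decode latin1 bytes with the wrong charset, then
// re-encode with that same charset and compare with the original bytes.
class RoundTripDemo {

    static byte[] roundTrip(byte[] wire, Charset platform) {
        String misdecoded = new String(wire, platform); // what readString() effectively does
        return misdecoded.getBytes(platform);           // what printing to System.out does
    }

    public static void main(String[] args) {
        byte[] wire = "Access denied".getBytes(StandardCharsets.ISO_8859_1);
        if (!Charset.isSupported("IBM1047")) {
            System.out.println("IBM1047 charset is not available in this JVM");
            return;
        }
        Charset ebcdic = Charset.forName("IBM1047");

        // The intermediate String is not the text the server sent...
        System.out.println("readable: " + new String(wire, ebcdic).equals("Access denied"));
        // ...but the bytes that come back out match what came in.
        System.out.println("bytes preserved: " + Arrays.equals(wire, roundTrip(wire, ebcdic)));
    }
}
```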
[16 Jun 2006 19:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7772
[17 Oct 2007 14:16] Frank Griffin
Mark, I just got back to trying this again.  I assume the driver version I have (5.0.7)
has your patch, because I'm not seeing the symptoms I was before.

However, I'm getting password verification failures.  I've defined user XYZ with password
ABC on the server (4.0.18), and specified Host values of localhost, 127.0.0.1, actual
server hostname, and %.

If I connect from the mysql client running on another linux system, it works fine.  But
connecting through the connector from z/OS gives:

071017  8:12:27       2 Connect     Access denied for user: 'XYZ@192.168.100.73' (Using
password: YES)

Can you please check to see if your patch ships the password properly for non-ASCII
systems?  Apparently the userid is going over OK, because it shows up properly in both
the error message on z/OS and the mysql log file on linux, but I suspect that the password
is getting encrypted for transmission from the EBCDIC bytes, and when the server decrypts
it, it expects ASCII.
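
A rough sketch of what I suspect is happening. SHA-1 is used here purely as a stand-in and is not claimed to be the scheme the driver or server uses; the point is only that any hash computed over the password's bytes changes when the encoding of those bytes changes. The class name is made up, and the IBM1047 charset is assumed to be available.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Illustration only: SHA-1 is a stand-in, not the real MySQL scramble.
class PasswordEncodingSketch {

    static byte[] digest(String password, Charset cs) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(password.getBytes(cs));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] fromAscii = digest("ABC", StandardCharsets.ISO_8859_1);
        if (!Charset.isSupported("IBM1047")) {
            System.out.println("IBM1047 charset is not available in this JVM");
            return;
        }
        byte[] fromEbcdic = digest("ABC", Charset.forName("IBM1047"));

        // Same password, different input bytes, so the digests differ.
        System.out.println("digests equal: " + Arrays.equals(fromAscii, fromEbcdic)); // digests equal: false
    }
}
```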

Thanks.
[8 Jan 2008 7:20] Mark Matthews
Frank,

If possible, could you try with 5.1? It's set up to use UTF-8 as the character set for
authentication for MySQL-4.1 and newer, which should clear up this issue for you on
EBCDIC platforms.
[9 Feb 2008 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[12 Feb 2008 20:28] Frank Griffin
Mark, I'm not sure from your comment whether this is expected to work or not, but I tested
with the 5.1 JDBC Driver against a server running 5.0.45.

The failure now is an SQLSTATE 08S01 claiming the connection drops.  If I monitor port
3306 on the server with a sniffer, Wireshark (I assume it's Wireshark doing this since
this packet is from the driver) is claiming that the Login Request packet is malformed,
although it displays the correct userid.

The hex of the packet is:

0000  00 11 11 75 24 1a 00 a0  8e 01 96 15 08 00 45 00   ...u$... ......E.
0010  00 54 80 f8 00 00 3d 06  ac 6d ac 1d 7f 54 c0 a8   .T....=. .m...T..
0020  64 24 08 af 0c ea 41 83  f5 3a e5 4e c6 cb 80 18   d$....A. .:.N....
0030  7f c8 2b 6d 00 00 01 01  08 0a c2 c0 9b 57 03 0c   ..+m.... .....W..
0040  b6 52 1c 00 00 01 8f 80  ff 02 fd 50 55 42 43 56   .R...... ...PUBCV
0050  53 00 78 78 78 78 78 78  78 78 00 43 56 53 53 51   S.xxxxxx xx.CVSSQ
0060  4c 00                                              L.               

As you can see, the password is given as 8 'x' bytes, but the actual password is 9 bytes
long and alphanumeric.
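
For reference, the MySQL payload in that capture (everything after the 4-byte packet header 1c 00 00 01) can be split by hand with a sketch like the one below. It assumes the older pre-4.1 login layout: 2-byte capability flags, 3-byte max packet size, then null-terminated user, password field and database. That layout is my reading of the byte counts, not something the driver or Wireshark confirmed, and LoginPacketDump is a made-up name.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustration only: splits the captured login payload under the ASSUMED
// pre-4.1 layout (flags, max packet size, three null-terminated strings).
class LoginPacketDump {

    // Payload bytes copied from the capture, header (1c 00 00 01) removed.
    static final int[] PAYLOAD = {
        0x8f, 0x80, 0xff, 0x02, 0xfd,
        0x50, 0x55, 0x42, 0x43, 0x56, 0x53, 0x00,
        0x78, 0x78, 0x78, 0x78, 0x78, 0x78, 0x78, 0x78, 0x00,
        0x43, 0x56, 0x53, 0x53, 0x51, 0x4c, 0x00
    };

    static String[] parse(int[] p) {
        List<String> fields = new ArrayList<>();
        int flags = p[0] | (p[1] << 8);                    // little-endian
        int maxPacket = p[2] | (p[3] << 8) | (p[4] << 16); // little-endian
        fields.add(String.format("0x%04x", flags));
        fields.add(Integer.toString(maxPacket));

        // Remaining bytes: strings separated by 0x00 terminators.
        StringBuilder current = new StringBuilder();
        for (int i = 5; i < p.length; i++) {
            if (p[i] == 0) {
                byte[] raw = new byte[current.length()];
                for (int j = 0; j < raw.length; j++) {
                    raw[j] = (byte) current.charAt(j);
                }
                fields.add(new String(raw, StandardCharsets.ISO_8859_1));
                current.setLength(0);
            } else {
                current.append((char) p[i]);
            }
        }
        return fields.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] f = parse(PAYLOAD);
        System.out.println("payload length: " + PAYLOAD.length); // 28, i.e. 0x1c
        System.out.println("flags=" + f[0] + " maxPacket=" + f[1]);
        System.out.println("user=" + f[2] + " password field=" + f[3] + " database=" + f[4]);
    }
}
```

Read this way, the payload length matches the 0x1c in the header and the userid and database come out as readable ASCII.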

MySQL responds with a Response Error 1043 (Bad Handshake):

0000  00 a0 8e 01 96 15 00 11  11 75 24 1a 08 00 45 08   ........ .u$...E.
0010  00 48 70 e2 40 00 40 06  79 87 c0 a8 64 24 ac 1d   .Hp.@.@. y...d$..
0020  7f 54 0c ea 08 af e5 4e  c6 cb 41 83 f5 5a 80 18   .T.....N ..A..Z..
0030  00 2e 50 79 00 00 01 01  08 0a 03 0c b6 ae c2 c0   ..Py.... ........
0040  9b 57 10 00 00 02 ff 13  04 42 61 64 20 68 61 6e   .W...... .Bad han
0050  64 73 68 61 6b 65                                  dshake
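
The server's reply decodes the same way. A minimal sketch, assuming the usual error packet shape of an 0xff marker, a 2-byte little-endian error code, an optional '#' plus 5-character SQLSTATE, and then the message text; ErrorPacketDump is a made-up name.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: decodes the error payload from the capture above
// (bytes after the 4-byte packet header 10 00 00 02).
class ErrorPacketDump {

    static final int[] PAYLOAD = {
        0xff, 0x13, 0x04,
        0x42, 0x61, 0x64, 0x20, 0x68, 0x61, 0x6e, 0x64, 0x73, 0x68, 0x61, 0x6b, 0x65
    };

    static int errorCode(int[] p) {
        if (p[0] != 0xff) {
            throw new IllegalArgumentException("not an error packet");
        }
        return p[1] | (p[2] << 8); // little-endian
    }

    static String message(int[] p) {
        int start = 3;
        // Newer protocol versions may insert '#' + 5-char SQLSTATE here.
        if (p.length > start && p[start] == '#') {
            start += 6;
        }
        byte[] raw = new byte[p.length - start];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (byte) p[start + i];
        }
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(errorCode(PAYLOAD) + ": " + message(PAYLOAD)); // 1043: Bad handshake
    }
}
```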