Bug #12030 CONCAT with INTEGER modifies result-set encoding
Submitted: 19 Jul 2005 12:19 Modified: 21 Jun 2010 7:47
Reporter: Tom Schindl Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:5.0 OS:Any
Assigned to: Alexander Barkov CPU Architecture:Any

[19 Jul 2005 12:19] Tom Schindl
Description:
I'm not 100% sure if this is a Server or JDBC-Driver bug. The following 2 queries concating values produce different results.

----------------------8<----------------------
SELECT CONCAT(' ',CONCAT(' - ',p_givenname)) FROM person p
----------------------8<----------------------
196
214

  - Ä => key code for 5th 196
  - Ö => key code for 5th 214
  - Ü => key code for 5th 220

Which is right.

----------------------8<----------------------
SELECT CONCAT(p_id,CONCAT(' - ',p_givenname)) FROM person p
----------------------8<----------------------
1 - ?? => key code for 5th 65533
2 - ?? => key code for 5th 65533
3 - ?? => key code for 5th 65533

Which is not approrpiate.

How to repeat:
// USE THE ATTACHED DATABASE
try {
    Class.forName("com.mysql.jdbc.Driver").newInstance();
    Connection connection = DriveManager.getConnection("jdbc:mysql://localhost/test2?user=root&password=roonong");
    Statement stmt = connection.createStatement();
    ResultSet set = stmt.executeQuery("SELECT CONCAT(p_id,CONCAT(' - ',p_givenname)) FROM person p");
		
    while(set.next()) {
        System.err.println((int)set.getString(1).charAt(4));
    }

    set = stmt.executeQuery("SELECT CONCAT(' ',CONCAT(' - ',p_givenname)) FROM person p");

    while(set.next()) {
        System.err.println((int)set.getString(1).charAt(4));
    }
} catch (InstantiationException e) {
    e.printStackTrace();
} catch (IllegalAccessException e) {
    e.printStackTrace();
} catch (ClassNotFoundException e) {
    e.printStackTrace();
} catch (SQLException e) {
    e.printStackTrace();
}
[19 Jul 2005 12:20] Tom Schindl
the database

Attachment: mysql-concat.dump (application/octet-stream, text), 2.11 KiB.

[19 Jul 2005 12:22] Tom Schindl
language setting

Attachment: vars.dump (application/octet-stream, text), 246 bytes.

[19 Jul 2005 12:51] Tom Schindl
Oh. And if I remember correctly it worked on mysql-4.0 using the same JDBC-Driver
[19 Jul 2005 12:58] Mark Matthews
Looks like a dupe of a few bugs related to unsigned values that's been fixed for 3.1.11. Please test with a nightly snapshot of 3.1.11 from http://downloads.mysql.com/snapshots.php#connector-j and let us know if it fixes the problem for you.
[19 Jul 2005 13:21] Tom Schindl
Non. There's no difference with the nightly build, as already stated it worked on MySQL-4.0 and since I upgrade my test-station to 4.1 it doesn't work any more. Maybe one more information: the same result is with JDBC-3.0.16. It also doesn't help to use any connection setting like useUnicode, characterEncoding, ... . I'm afraid, that this is not really a JDBC Bug and MySQL somehow changes the CharSet or something like this
[19 Jul 2005 15:10] Aleksey Kishkin
screenshoot in the query browser

Attachment: bug12030.PNG (image/png, text), 57.97 KiB.

[19 Jul 2005 15:11] Aleksey Kishkin
I would say that mysql works properly when I test this query in query browser. (see attached file)
[19 Jul 2005 16:51] Tom Schindl
Yes that's true for me too still I think that the query browser knows how to deal with those different charsets i suppose the problem is comeing from. This behaviour looks for me similar to the one I had with high bit chars like € in 4.0. And it really is interesting that the JDBC-Driver can handle all those chars beside in situations where you concat it with a non-text column. Isn't it possible that the charset is then modified, i think the JDBC-Driver does not now that the there has been a concat with an int-column or another varchar... But that's only a guess? What also make me think is that even the old-jdbc-driver returns the same result. Would a tcp-dump help you to find the source of the problem.
[20 Jul 2005 14:11] Tom Schindl
More information on this topic: When trying to change the collation of the resulting values using COLLATE utf8_bin after the CONCAT it works for the CHAR,CHAR-Concat and fails for the INT, CHAR concat with the following message: COLLATION 'utf8_bin' is not valid for CHARACTER SET 'binary'
The two SQL-Statements are:
- "SELECT CONCAT(p_id,CONCAT(' - ',p_givenname)) COLLATE utf8_bin FROM person p"
- "SELECT CONCAT(' ',CONCAT(' - ',p_givenname)) COLLATE utf8_bin FROM person p"
[27 Jul 2005 8:27] Aleksey Kishkin
Well. confirm that character set is changed.

mysql> insert into person values('1','ñ'),(2,'æ');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT CONCAT(' ',CONCAT(' - ',p_givenname)) FROM person p;
+---------------------------------------+
| CONCAT(' ',CONCAT(' - ',p_givenname)) |
+---------------------------------------+
|   - ñ                                 |
|   - æ                                 |
+---------------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT CONCAT(p_id,CONCAT(' - ',p_givenname)) FROM person p;
+----------------------------------------+
| CONCAT(p_id,CONCAT(' - ',p_givenname)) |
+----------------------------------------+
| 1 - Я                                 |
| 2 - Ф                                 |
+----------------------------------------+
2 rows in set (0.01 sec)
[27 Jul 2005 8:34] Aleksey Kishkin
testcase

Attachment: bug12030.sql (text/x-sql), 409 bytes.

[27 Jul 2005 8:41] Tom Schindl
But using the QueryBrowser the everything is showing up correctly. For me this means that there are 2 different problems:
- mysql silently changes the charcterset
- JDBC-Driver has problems with the resulting charset
[28 Jul 2005 3:30] Alexander Barkov
This problem will be fixed later in 5.1.
A temporarily workaround is to use explicit CAST:

SELECT CONCAT(CAST(p_id AS CHAR),CONCAT(' - ',p_givenname)) FROM person p;
[28 Jul 2005 7:41] Tom Schindl
Still there remains the problem with the JDBC-Driver which certainly has a bug with the resulting encoding which should be fixed. Should I open a new bug for this?
[28 Jul 2005 7:48] Alexander Barkov
Tom, I'm reassigning this bug to Mark Matthews,
the maintainer of JDBC driver.
[28 Jul 2005 12:42] Mark Matthews
Bar, sorry, but I just checked the JDBC driver, and I feel that this is indeed a server bug, and not a JDBC driver bug.

The server is returning the incorrect character set ( "binary" ) when the CONCAT() with the integer field happens., 

The "binary" character encoding  has _no_ mapping in Java (the closest we can come is "us-ascii"). The reason QueryBrowser can display this is that it just treats bytes as bytes because it's using C string functions and passes them back up to the OS's display routines. _If_ the default character set for the operating system matches, it _appears_ that things are okay. 

If, for example Tom was to have his OS character set configured as something that doesn't understand iso8859-1 characters, he would see corruption in QueryBrowser too.

Treating the bytes as bytes is just not possible in Java, as all Strings are composed of Unicode characters, so the driver has to map from byte[]s to Unicode characters, and must rely on the server to give the driver the approriate information to make that conversion.

The server needs to return a character encoding that is correct for the JDBC driver to handle the result set value correctly.
[28 Jul 2005 15:00] Alexander Barkov
Tom, I'm sorry, I probably misunderstood you.
I thought that even with CAST you still get wrong data.
Can you please confirm that? Thank you!
[28 Jul 2005 15:28] Tom Schindl
using the cast works perfectly I only thought that there's maybe a bug in the JDBC-Driver which Mark clear up there's nothing Java can do. So you can savely target the bug to 5.1. Maybe it should be added to the docs where concat is described? Maybe here: http://dev.mysql.com/doc/mysql/en/string-functions.html
[28 Jul 2005 18:57] Alexander Barkov
Tom, I wrote a letter to the documentation team.
[19 Jul 2007 22:42] Jim Winstead
Bug #27862 was marked (in part) as a duplicate of this bug.
[29 Sep 2008 21:59] Konstantin Osipov
5.1 time has come, setting bug to verified.
[5 Dec 2008 10:39] Alexander Barkov
Workaround:

SELECT CONCAT(CAST(p_id AS CHAR),CONCAT(' - ',p_givenname)) FROM person p;
[21 Jun 2010 7:47] Alexander Barkov
Fixed in MySQL-5.5.

For details see:
http://forge.mysql.com/worklog/task.php?id=2649
[19 Nov 2012 12:32] Tomas Telensky
Not only INTEGER causes a problem, but TIMESTAMP also.

http://stackoverflow.com/q/6397156/684229