Bug #80631 | ResultSet.getString return garbled result with json type data | ||
---|---|---|---|
Submitted: | 7 Mar 2016 8:01 | Modified: | 14 Sep 2016 1:51 |
Reporter: | Tianming Yi | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | Connector / J | Severity: | S2 (Serious) |
Version: | 5.1.38, 5.1.39, 6.0.3 | OS: | Any |
Assigned to: | Filipe Silva | CPU Architecture: | Any |
Tags: | Connector/J, garbled, json, utf8 |
[7 Mar 2016 8:01]
Tianming Yi
[8 Mar 2016 9:44]
Chiranjeevi Battula
Hello Tianming Yi, Thank you for the bug report. Verified this behavior on MySQL Connector / J 5.1.38. Thanks, Chiranjeevi.
[8 Mar 2016 9:45]
Chiranjeevi Battula
output: run: {"key2": "value2", "éå¤§è¶ ": "value1"} {"key4": "value4", "ä¸å½": "value3"} BUILD SUCCESSFUL (total time: 0 seconds)
[15 Mar 2016 18:25]
Filipe Silva
Hi Tianming Yi, Can you provide some more info about your environment: - Connection string properties? - What's the server version you are using? - Server encoding? Database encoding? Client encoding? (If in doubt, can you just paste the result of "show variables like 'character_set%'"?) Have you tried setting the connection property "characterEncoding=UTF-8"?
[27 Mar 2016 7:02]
Vladimir V
Hi! I got the same problem, settings does not help. The correct header for this bug is "JSON does not allow Unicode characters" As JSON is binary type - it is not known how to store unicode inside it. Here is my test table: CREATE TABLE utilJSON ( field1 JSON ) CHARACTER SET utf8, ENGINE=INNODB; The Connector / J uses "iso-8859-1" so no unicode characters: // inside statement.executeQuery("select field1 from utilJSON") // com/mysql/jdbc/MysqlIO.java: short charSetNumber = (short) packet.readInt(); we got charSetNumber = 63 // Lets look at com/mysql/jdbc/CharsetMapping.java // 63 means binary: collation[63] = new Collation(63, "binary", 1, MYSQL_CHARSET_NAME_binary); //and mysql binary means ISO8859_1 java new MysqlCharset(MYSQL_CHARSET_NAME_binary, 1, 1, new String[] { "ISO8859_1" }), ---------- //temp workaround, you can read value like this: new String(fieldVal.getBytes("iso-8859-1"))
[27 Mar 2016 9:25]
Vladimir V
Found some documentation https://dev.mysql.com/doc/refman/5.7/en/json.html "MySQL handles strings used in JSON context using the utf8mb4 character set and utf8mb4_bin collation. " So unicode is ok for JSON field. j/connector have to use utf8mb4_bin instead if iso-8859-1 for binary data in json field.
[25 Apr 2016 7:39]
Chiranjeevi Battula
http://bugs.mysql.com/bug.php?id=81188 marked as duplicate of this one.
[29 Apr 2016 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[13 May 2016 10:33]
Hitoshi Asai
I can reproduce it in another setting for Japanese, my environments are following, - Connection string properties? - jdbc:mysql://127.0.0.1:3306/test?characterEncoding=UTF-8&useSSL=false - What's the server version you are using? - MySQL 5.7.12 (Installed from MySQL APT repo) / Ubuntu 16.04 - Server encoding? Database encoding? Client encoding? (If in doubt, can you just paste the result of "show variables like 'character_set%'"?) - Server locale is "ja_JP.UTF-8" and the result of "show variables.." is +--------------------------+----------------------------+ | Variable_name | Value | +--------------------------+----------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | utf8 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ | +--------------------------+----------------------------+ repro steps: CREATE DATABASE test CHARACTER SET utf8; USE test CREATE TABLE json_test ( id int PRIMARY KEY NOT NULL, varchar_column varchar(100) NOT NULL, json_column json NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8; INSERT INTO json_test (id, varchar_column, json_column) VALUES (1, '中文', '{"id": "中文"}'); INSERT INTO json_test (id, varchar_column, json_column) VALUES (2, '中文', '{"id": "Chinese"}'); INSERT INTO json_test (id, varchar_column, json_column) VALUES (3, 'Chinese', '{"id": "中文"}'); INSERT INTO json_test (id, varchar_column, json_column) VALUES (4, 'Chinese', '{"id": "Chinese"}'); ** MySQL Client shows '中文' correctly ** mysql> select * from json_test; +----+----------------+-------------------+ | id | varchar_column | json_column | +----+----------------+-------------------+ | 1 | 中文 | {"id": "中文"} | | 2 | 中文 | {"id": "Chinese"} | | 3 | Chinese | {"id": "中文"} | | 4 | Chinese | {"id": "Chinese"} | +----+----------------+-------------------+ ** Connector/J (mysql-connector-java-5.1.39.jar, Oracle JDK 8u92) is not ** source: import java.sql.*; public class Main { public static void main(String[] args) throws ClassNotFoundException, SQLException { Class.forName("com.mysql.jdbc.Driver"); Connection conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/test?characterEncoding=UTF-8&useSSL=false", "user", "password"); Statement stmt = conn.createStatement(); ResultSet rset = stmt.executeQuery("SELECT * FROM json_test"); while (rset.next()) { System.out.println(rset.getInt(1) + "\t" + rset.getString(2) + "\t" + rset.getString(3)); } rset.close(); stmt.close(); conn.close(); } } result: 1 中文 {"id": "ä¸æ"} 2 中文 {"id": "Chinese"} 3 Chinese {"id": "ä¸æ"} 4 Chinese {"id": "Chinese"} (varchar column displayed correctly, json column always corrupted)
[17 May 2016 10:01]
MySQL Verification Team
-- with the latest test case from earlier note [umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: cat Main.java import java.sql.*; public class Main { public static void main(String[] args) throws ClassNotFoundException, SQLException { Class.forName("com.mysql.jdbc.Driver"); Connection conn = DriverManager.getConnection("jdbc:mysql://localhost:15000/test?characterEncoding=UTF-8&useSSL=false", "root", ""); Statement stmt = conn.createStatement(); ResultSet rset = stmt.executeQuery("SELECT * FROM json_test"); while (rset.next()) { System.out.println(rset.getInt(1) + "\t" + rset.getString(2) + "\t" + rset.getString(3)); } rset.close(); stmt.close(); conn.close(); } } [umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: bin/mysql -uroot -S /tmp/mysql_ushastry.sock Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 Server version: 5.7.12-enterprise-commercial-advanced MySQL Enterprise Server - Advanced Edition (Commercial) Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> create database test; Query OK, 1 row affected (0.00 sec) mysql> use test Database changed mysql> CREATE TABLE json_test ( -> id int PRIMARY KEY NOT NULL, -> varchar_column varchar(100) NOT NULL, -> json_column json NOT NULL -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Query OK, 0 rows affected (0.01 sec) mysql> mysql> INSERT INTO json_test (id, varchar_column, json_column) VALUES (1, '中文', '{"id": "中文"}'); INSERT INTO json_test (id, varchar_column, json_column) VALUES (2, '中文', '{"id": "Chinese"}'); INSERT INTO json_test (id, varchar_column, json_column) VALUES (3, 'Chinese', '{"id": "中文"}'); INSERT INTO json_test (id, varchar_column, json_column) VALUES (4, 'Chinese', '{"id": "Chinese"}'); Query OK, 1 row affected (0.02 sec) mysql> INSERT INTO json_test (id, varchar_column, json_column) VALUES (2, '中文', '{"id": "Chinese"}'); Query OK, 1 row affected (0.00 sec) mysql> INSERT INTO json_test (id, varchar_column, json_column) VALUES (3, 'Chinese', '{"id": "中文"}'); Query OK, 1 row affected (0.00 sec) mysql> INSERT INTO json_test (id, varchar_column, json_column) VALUES (4, 'Chinese', '{"id": "Chinese"}'); Query OK, 1 row affected (0.00 sec) mysql> mysql> select * from json_test; +----+----------------+-------------------+ | id | varchar_column | json_column | +----+----------------+-------------------+ | 1 | 中文 | {"id": "中文"} | | 2 | 中文 | {"id": "Chinese"} | | 3 | Chinese | {"id": "中文"} | | 4 | Chinese | {"id": "Chinese"} | +----+----------------+-------------------+ 4 rows in set (0.00 sec) mysql> \q Bye [umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: vi Main.java [umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: javac -cp '.:/home/umshastr/bugs/mysql-connector-java-5.1.38/mysql-connector-java-5.1.38-bin.jar' Main.java [umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: java -cp '.:/home/umshastr/bugs/mysql-connector-java-5.1.38/mysql-connector-java-5.1.38-bin.jar' Main 1 中文 {"id": "ä¸æ��"} 2 中文 {"id": "Chinese"} 3 Chinese {"id": "ä¸æ��"} 4 Chinese {"id": "Chinese"} [umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: [umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: javac -cp '.:/home/umshastr/bugs/mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar' Main.java [umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: java -cp '.:/home/umshastr/bugs/mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar' Main 1 中文 {"id": "ä¸æ��"} 2 中文 {"id": "Chinese"} 3 Chinese {"id": "ä¸æ��"} 4 Chinese {"id": "Chinese"} [umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12:
[24 May 2016 2:23]
Guangpu Feng
Hi, Vladimir V Thanks for your workaround, it helps. //temp workaround, you can read value like this: new String(fieldVal.getBytes("iso-8859-1")) I think this is a serious bug that should be fixed, the above code should be fixed back when bugfix released in later version.
[23 Jun 2016 8:28]
Yanming Zhou
I can reproduce this issue with version 5.1.39, please fix it asap.
[23 Jun 2016 12:21]
OCA Admin
Contribution submitted via Github - //Fix for bug #80631 Any ResultSet.getString return garbled result wi… (*) Contribution by Dong SongLing (Github neil4dong, mysql-connector-j/pull/9#issuecomment-227795293): I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: git_patch_70847180.txt (text/plain), 8.41 KiB.
[27 Jun 2016 0:56]
Tsubasa Tanaka
This affects Connector/J 6.0.3-m2 too(please add 6.0.3 to effected version)
[23 Aug 2016 7:10]
Leif Bladt
Any news on this?
[24 Aug 2016 8:25]
Leif Bladt
I can confirm, that the patch works at least for german umlaut characters.
[14 Sep 2016 1:51]
Daniel So
Posted by developer: Added the following entry to the Connector/J 5.1.40 changelog: "ResultSet.getString() sometimes returned garbled data for columns of the JSON data type. This was because JSON data was binary encoded by MySQL using the utf8mb4 character set, but decoded by Connector/J using the ISO-8859-1 character set. This patch fixes the decoding for JSON data. Thanks to Dong Song Ling for contributing the patch. "
[30 Sep 2016 16:42]
Daniel So
Posted by developer: Some minors edits to the changelog entry: "ResultSet.getString() sometimes returned garbled data for columns of the JSON data type. This was because JSON data was binary encoded by MySQL using the utf8mb4 character set, but decoded by Connector/J using the ISO-8859-1 character set. This patch fixes the decoding for JSON data. Thanks to Dong Song Ling for contributing to the fix."
[14 Oct 2016 16:02]
Daniel So
The fix has been included in Connector/J 6.0.5. The entry for the 5.1.40 changelog has been included into the 6.0.5 changelog.