| Bug #80631 | ResultSet.getString return garbled result with json type data | ||
|---|---|---|---|
| Submitted: | 7 Mar 2016 8:01 | Modified: | 14 Sep 2016 1:51 |
| Reporter: | Tianming Yi | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | Connector / J | Severity: | S2 (Serious) |
| Version: | 5.1.38, 5.1.39, 6.0.3 | OS: | Any |
| Assigned to: | Filipe Silva | CPU Architecture: | Any |
| Tags: | Connector/J, garbled, json, utf8 | ||
[8 Mar 2016 9:44]
Chiranjeevi Battula
Hello Tianming Yi, Thank you for the bug report. Verified this behavior on MySQL Connector / J 5.1.38. Thanks, Chiranjeevi.
[8 Mar 2016 9:45]
Chiranjeevi Battula
output:
run:
{"key2": "value2", "é大è¶
": "value1"}
{"key4": "value4", "ä¸å½": "value3"}
BUILD SUCCESSFUL (total time: 0 seconds)
[15 Mar 2016 18:25]
Filipe Silva
Hi Tianming Yi, Can you provide some more info about your environment: - Connection string properties? - What's the server version you are using? - Server encoding? Database encoding? Client encoding? (If in doubt, can you just paste the result of "show variables like 'character_set%'"?) Have you tried setting the connection property "characterEncoding=UTF-8"?
[27 Mar 2016 7:02]
Vladimir V
Hi! I got the same problem, settings does not help.
The correct header for this bug is "JSON does not allow Unicode characters"
As JSON is binary type - it is not known how to store unicode inside it.
Here is my test table:
CREATE TABLE utilJSON (
field1 JSON
) CHARACTER SET utf8, ENGINE=INNODB;
The Connector / J uses "iso-8859-1" so no unicode characters:
// inside statement.executeQuery("select field1 from utilJSON")
// com/mysql/jdbc/MysqlIO.java:
short charSetNumber = (short) packet.readInt();
we got charSetNumber = 63
// Lets look at com/mysql/jdbc/CharsetMapping.java
// 63 means binary:
collation[63] = new Collation(63, "binary", 1, MYSQL_CHARSET_NAME_binary);
//and mysql binary means ISO8859_1 java
new MysqlCharset(MYSQL_CHARSET_NAME_binary, 1, 1, new String[] { "ISO8859_1" }),
----------
//temp workaround, you can read value like this:
new String(fieldVal.getBytes("iso-8859-1"))
[27 Mar 2016 9:25]
Vladimir V
Found some documentation https://dev.mysql.com/doc/refman/5.7/en/json.html "MySQL handles strings used in JSON context using the utf8mb4 character set and utf8mb4_bin collation. " So unicode is ok for JSON field. j/connector have to use utf8mb4_bin instead if iso-8859-1 for binary data in json field.
[25 Apr 2016 7:39]
Chiranjeevi Battula
http://bugs.mysql.com/bug.php?id=81188 marked as duplicate of this one.
[29 Apr 2016 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[13 May 2016 10:33]
Hitoshi Asai
I can reproduce it in another setting for Japanese, my environments are following,
- Connection string properties?
- jdbc:mysql://127.0.0.1:3306/test?characterEncoding=UTF-8&useSSL=false
- What's the server version you are using?
- MySQL 5.7.12 (Installed from MySQL APT repo) / Ubuntu 16.04
- Server encoding? Database encoding? Client encoding? (If in doubt, can you just paste the result of "show variables like 'character_set%'"?)
- Server locale is "ja_JP.UTF-8" and the result of "show variables.." is
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
repro steps:
CREATE DATABASE test CHARACTER SET utf8;
USE test
CREATE TABLE json_test (
id int PRIMARY KEY NOT NULL,
varchar_column varchar(100) NOT NULL,
json_column json NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO json_test (id, varchar_column, json_column) VALUES (1, '中文', '{"id": "中文"}');
INSERT INTO json_test (id, varchar_column, json_column) VALUES (2, '中文', '{"id": "Chinese"}');
INSERT INTO json_test (id, varchar_column, json_column) VALUES (3, 'Chinese', '{"id": "中文"}');
INSERT INTO json_test (id, varchar_column, json_column) VALUES (4, 'Chinese', '{"id": "Chinese"}');
** MySQL Client shows '中文' correctly **
mysql> select * from json_test;
+----+----------------+-------------------+
| id | varchar_column | json_column |
+----+----------------+-------------------+
| 1 | 中文 | {"id": "中文"} |
| 2 | 中文 | {"id": "Chinese"} |
| 3 | Chinese | {"id": "中文"} |
| 4 | Chinese | {"id": "Chinese"} |
+----+----------------+-------------------+
** Connector/J (mysql-connector-java-5.1.39.jar, Oracle JDK 8u92) is not **
source:
import java.sql.*;
public class Main {
public static void main(String[] args) throws ClassNotFoundException, SQLException {
Class.forName("com.mysql.jdbc.Driver");
Connection conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/test?characterEncoding=UTF-8&useSSL=false", "user", "password");
Statement stmt = conn.createStatement();
ResultSet rset = stmt.executeQuery("SELECT * FROM json_test");
while (rset.next()) {
System.out.println(rset.getInt(1) + "\t" + rset.getString(2) + "\t" + rset.getString(3));
}
rset.close();
stmt.close();
conn.close();
}
}
result:
1 中文 {"id": "䏿"}
2 中文 {"id": "Chinese"}
3 Chinese {"id": "䏿"}
4 Chinese {"id": "Chinese"}
(varchar column displayed correctly, json column always corrupted)
[17 May 2016 10:01]
MySQL Verification Team
-- with the latest test case from earlier note
[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: cat Main.java
import java.sql.*;
public class Main {
public static void main(String[] args) throws ClassNotFoundException, SQLException {
Class.forName("com.mysql.jdbc.Driver");
Connection conn = DriverManager.getConnection("jdbc:mysql://localhost:15000/test?characterEncoding=UTF-8&useSSL=false", "root", "");
Statement stmt = conn.createStatement();
ResultSet rset = stmt.executeQuery("SELECT * FROM json_test");
while (rset.next()) {
System.out.println(rset.getInt(1) + "\t" + rset.getString(2) + "\t" + rset.getString(3));
}
rset.close();
stmt.close();
conn.close();
}
}
[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: bin/mysql -uroot -S /tmp/mysql_ushastry.sock
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.12-enterprise-commercial-advanced MySQL Enterprise Server - Advanced Edition (Commercial)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> create database test;
Query OK, 1 row affected (0.00 sec)
mysql> use test
Database changed
mysql> CREATE TABLE json_test (
-> id int PRIMARY KEY NOT NULL,
-> varchar_column varchar(100) NOT NULL,
-> json_column json NOT NULL
-> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.01 sec)
mysql>
mysql> INSERT INTO json_test (id, varchar_column, json_column) VALUES (1, '中文', '{"id": "中文"}');
INSERT INTO json_test (id, varchar_column, json_column) VALUES (2, '中文', '{"id": "Chinese"}');
INSERT INTO json_test (id, varchar_column, json_column) VALUES (3, 'Chinese', '{"id": "中文"}');
INSERT INTO json_test (id, varchar_column, json_column) VALUES (4, 'Chinese', '{"id": "Chinese"}');
Query OK, 1 row affected (0.02 sec)
mysql> INSERT INTO json_test (id, varchar_column, json_column) VALUES (2, '中文', '{"id": "Chinese"}');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO json_test (id, varchar_column, json_column) VALUES (3, 'Chinese', '{"id": "中文"}');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO json_test (id, varchar_column, json_column) VALUES (4, 'Chinese', '{"id": "Chinese"}');
Query OK, 1 row affected (0.00 sec)
mysql>
mysql> select * from json_test;
+----+----------------+-------------------+
| id | varchar_column | json_column |
+----+----------------+-------------------+
| 1 | 中文 | {"id": "中文"} |
| 2 | 中文 | {"id": "Chinese"} |
| 3 | Chinese | {"id": "中文"} |
| 4 | Chinese | {"id": "Chinese"} |
+----+----------------+-------------------+
4 rows in set (0.00 sec)
mysql> \q
Bye
[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: vi Main.java
[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: javac -cp '.:/home/umshastr/bugs/mysql-connector-java-5.1.38/mysql-connector-java-5.1.38-bin.jar' Main.java
[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: java -cp '.:/home/umshastr/bugs/mysql-connector-java-5.1.38/mysql-connector-java-5.1.38-bin.jar' Main
1 中文 {"id": "䏿��"}
2 中文 {"id": "Chinese"}
3 Chinese {"id": "䏿��"}
4 Chinese {"id": "Chinese"}
[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12:
[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: javac -cp '.:/home/umshastr/bugs/mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar' Main.java
[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12: java -cp '.:/home/umshastr/bugs/mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar' Main
1 中文 {"id": "䏿��"}
2 中文 {"id": "Chinese"}
3 Chinese {"id": "䏿��"}
4 Chinese {"id": "Chinese"}
[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12:
[24 May 2016 2:23]
Guangpu Feng
Hi, Vladimir V
Thanks for your workaround, it helps.
//temp workaround, you can read value like this:
new String(fieldVal.getBytes("iso-8859-1"))
I think this is a serious bug that should be fixed, the above code should be fixed back when bugfix released in later version.
[23 Jun 2016 8:28]
Yanming Zhou
I can reproduce this issue with version 5.1.39, please fix it asap.
[23 Jun 2016 12:21]
OCA Admin
Contribution submitted via Github - //Fix for bug #80631 Any ResultSet.getString return garbled result wi… (*) Contribution by Dong SongLing (Github neil4dong, mysql-connector-j/pull/9#issuecomment-227795293): I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: git_patch_70847180.txt (text/plain), 8.41 KiB.
[27 Jun 2016 0:56]
Tsubasa Tanaka
This affects Connector/J 6.0.3-m2 too(please add 6.0.3 to effected version)
[23 Aug 2016 7:10]
Leif Bladt
Any news on this?
[24 Aug 2016 8:25]
Leif Bladt
I can confirm, that the patch works at least for german umlaut characters.
[14 Sep 2016 1:51]
Daniel So
Posted by developer: Added the following entry to the Connector/J 5.1.40 changelog: "ResultSet.getString() sometimes returned garbled data for columns of the JSON data type. This was because JSON data was binary encoded by MySQL using the utf8mb4 character set, but decoded by Connector/J using the ISO-8859-1 character set. This patch fixes the decoding for JSON data. Thanks to Dong Song Ling for contributing the patch. "
[30 Sep 2016 16:42]
Daniel So
Posted by developer: Some minors edits to the changelog entry: "ResultSet.getString() sometimes returned garbled data for columns of the JSON data type. This was because JSON data was binary encoded by MySQL using the utf8mb4 character set, but decoded by Connector/J using the ISO-8859-1 character set. This patch fixes the decoding for JSON data. Thanks to Dong Song Ling for contributing to the fix."
[14 Oct 2016 16:02]
Daniel So
The fix has been included in Connector/J 6.0.5. The entry for the 5.1.40 changelog has been included into the 6.0.5 changelog.

Description: I stored Chinese character in json type column. Data stored in mysql is correct encoded with utf-8. But in ResultSet.getString(), the result is garbled. In contract, Chinese character stored in text type column is not garbled. I tried ResultSet.getBytes() then new String(bytes, "utf-8"), the string is correct encoded. How to repeat: Store some Chinese character, like 中国 in json type column. get connection with jdbc driver, execute query, ResultSet rs = pstmt.executeQuery(), then print rs.getString(columnName). The result is like: {"id": "������"} And it should be: {"id": "中文"}