| Bug #81725 | import as json in utf8 format writes each non latin characters in \unnnn format | ||
|---|---|---|---|
| Submitted: | 4 Jun 2016 16:41 | Modified: | 5 Jun 2016 17:01 |
| Reporter: | Artem Kh. | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Workbench | Severity: | S5 (Performance) |
| Version: | 6.3.6 | OS: | Windows (Microsoft Windows 8.1 Single Language) |
| Assigned to: | | CPU Architecture: | Any |
| Tags: | import, json, literraly, WBBugReporter | ||
[4 Jun 2016 16:44]
Artem Kh.
json utf-8 file
Attachment: countries_test.json (application/octet-stream, text), 260 bytes.
[5 Jun 2016 14:16]
MySQL Verification Team
Hello Artem Kh., Thank you for the report and test case. Verified as described with WB 6.3.6 on Win7. Thanks, Umesh
[5 Jun 2016 16:56]
Artem Kh.
Hi, good news: I found the solution, with some help from PHP. If a record contains an inner JSON object, such as the "name_translations" field in the example, and we call json_encode($country->name_translations) without any special option, then by default all non-Latin characters are converted to \uXXXX escapes. With the JSON_UNESCAPED_UNICODE option they are not converted: json_encode($country->name_translations, JSON_UNESCAPED_UNICODE). Please add the same ability to choose the mode for inner objects.
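For comparison, the same switch exists in Python's standard `json` module as the `ensure_ascii` parameter. The sketch below only illustrates the two modes on a shortened copy of the sample data; whether Workbench's importer serializes inner objects this way is an assumption, not something confirmed in this report.

```python
import json

# Shortened copy of the "name_translations" object from the test file.
translations = {"de": "Österreich", "ru": "Австрия", "en": "Austria"}

# Default mode: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(translations)

# ensure_ascii=False keeps the characters literal, comparable to
# PHP's JSON_UNESCAPED_UNICODE.
literal = json.dumps(translations, ensure_ascii=False)

# Both strings describe the same data; only the spelling differs.
same_data = json.loads(escaped) == json.loads(literal)
```

Both outputs are valid JSON and decode to identical values, so the escaped form loses no information; the request here is about readability of the stored text.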
[5 Jun 2016 17:01]
Artem Kh.
Also, I found a bug affecting values at the first level of the JSON.
For the test, change the value of the "name" field to non-Latin characters (from "Austria" to "Österreich"). Result:
[{"code":"AT","name":"Österreich","currency":"EUR","name_translations":{"de":"Österreich","en":"Austria","zh-CN":"奥地利","tr":"Avusturya","ru":"Австрия","fr":"Autriche","es":"Austria","it":"Austria","th":"ประเทศออสเตรีย"}}]
Then try to import the record into a new table.
Oops, an error appears:
"Unhandled exception: 'ascii' codec can't encode character u'\xd6' in position 0: ordinal not in range(128)"
In the preview, "Help" (a column name) is shown instead of "Österreich".
See the attached screenshot.
I hope this can be resolved as soon as possible.
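The wording of that exception matches what Python raises when text is converted with the ASCII codec. The sketch below reproduces that class of error in Python 3 and shows the explicit UTF-8 encoding that avoids it. The helper names are hypothetical, and it is an assumption that the import wizard performs such a conversion internally; Python 3 also prints the character as '\xd6' rather than u'\xd6'.

```python
name = "Österreich"

def encode_strict_ascii(text):
    """Hypothetical helper: mimics a conversion that only accepts ASCII."""
    return text.encode("ascii")

def encode_utf8(text):
    """Hypothetical helper: encodes explicitly as UTF-8 instead."""
    return text.encode("utf-8")

try:
    encode_strict_ascii(name)
    error_message = None
except UnicodeEncodeError as exc:
    # The first character, U+00D6, is outside the ASCII range (0-127).
    error_message = str(exc)

# UTF-8 can represent every character of the value.
utf8_bytes = encode_utf8(name)
```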
[5 Jun 2016 17:03]
Artem Kh.
Bug for non latin values in first level of json
Attachment: Bug for non latin values in first level of json.png (image/png, text), 21.63 KiB.

Description:
Importing a JSON file in UTF-8 format writes every non-Latin character in \unnnn format.

Original content of the JSON file:
[{"code":"AT","name":"Austria","currency":"EUR","name_translations":{"de":"Österreich","en":"Austria","zh-CN":"奥地利","tr":"Avusturya","ru":"Австрия","fr":"Autriche","es":"Austria","it":"Austria","th":"ประเทศออสเตรีย"}}]

After import, the "name_translations" column of the table contains:
{"ru": "\u0410\u0432\u0441\u0442\u0440\u0438\u044f", "fr": "Autriche", "en": "Austria", "de": "\u00d6sterreich", "tr": "Avusturya", "it": "Austria", "zh-CN": "\u5965\u5730\u5229", "th": "\u0e1b\u0e23\u0e30\u0e40\u0e17\u0e28\u0e2d\u0e2d\u0e2a\u0e40\u0e15\u0e23\u0e35\u0e22", "es": "Austria"}

Is it possible for Workbench to save the values literally, as they appear in the JSON file, instead of as "\unnnn" escapes?

How to repeat:
Original content of the JSON file:
[{"code":"AT","name":"Austria","currency":"EUR","name_translations":{"de":"Österreich","en":"Austria","zh-CN":"奥地利","tr":"Avusturya","ru":"Австрия","fr":"Autriche","es":"Austria","it":"Austria","th":"ประเทศออสเตรีย"}}]
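Until an option exists in the import wizard, values that were already stored in escaped form could be rewritten outside Workbench. The following is a minimal sketch of that idea, using a shortened copy of the imported value; `unescape_json_text` is a hypothetical helper, not part of Workbench or MySQL.

```python
import json

# Shortened copy of the value stored in "name_translations" after import.
# The raw string keeps the backslashes exactly as they appear in the table.
stored = r'{"de": "\u00d6sterreich", "zh-CN": "\u5965\u5730\u5229", "en": "Austria"}'

def unescape_json_text(text):
    """Parse JSON text and re-serialize it with literal non-ASCII characters."""
    return json.dumps(json.loads(text), ensure_ascii=False)

restored = unescape_json_text(stored)
```

The parsed data is unchanged by this round trip; only the textual representation differs.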