Bug #51233 import from sql-file with utf-8 charset does not see utf-8
Submitted: 17 Feb 2010 10:13 Modified: 25 May 2010 15:39
Reporter: Peter Veger Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Workbench: Administration Severity:S3 (Non-critical)
Version:5.2.15 OSS Beta rev5053 OS:Any
Assigned to: CPU Architecture:Any
Tags: rubbish, sql-8, workbench

[17 Feb 2010 10:13] Peter Veger
Description:
When importing utf-8 file in Query browser, it imports correctly (e.g. é as é), when importing via workbench incorrectly (e.g. é as two characters)

How to repeat:
import any utf-8 sql-file with non-ascii characters..
[17 Feb 2010 12:42] Susanne Ebrecht
Peter,

can you give me a screenshot here from the workbench result.
[18 Feb 2010 18:56] Peter Veger
see attached files
[19 Feb 2010 10:47] Susanne Ebrecht
Peter,

sorry, charset issues usually need a deeper analysis.

I need to know some more stuff from you:

1) on which OS is server running
2) Did you create table and insert statement via Workbench or did you use the command line client?

If you used command line client I would need to know on which OS it is running

3) It would be a great help for me if you already could give me result from the following outputs:

SHOw VARIABLES LIKE '%char%';

SHOW CREATE TABLE boten;

SELECT Bootnaam, LENGTH(Bootnaam),HEX(Bootnaam) FROM boten WHERE BID=357;

SELECT Haven, LENGTH(Haven), HEX(Haven) FROM boten WHERE BID=357;
[20 Feb 2010 10:35] Peter Veger
1. Versions
hosting-ISP server: WinNT5.2.3790, MySQL4.1.22-community-nt, phpMyAdmin 2.6.0-pl3;
own system: Win7Ult64 8.1.7600, MySQL 5.1.40-community, Workbench 5.2.15 OSS Beta 5053, Query Browser 1.2.17

2. Charsets
ISP and local: everywhere charset:utf8, collation:uft8_general_ci;

3. Export from ISP to local
for each table: drop table if exists, create table ... charset=utf-8, insert...

4. Content Exported file:
all checking programs (UltraEdit, BabelPAD) indicate and show correct utf-8

5. Import through Workbench: KO

SHOW VARIABLES LIKE '%char%';
'character_set_client', 'utf8'
'character_set_connection', 'utf8'
'character_set_database', 'utf8'
'character_set_filesystem', 'binary'
'character_set_results', 'utf8'
'character_set_results', 'utf8'
'character_set_results', 'utf8'
'character_set_results', 'utf8'
'character_set_results', 'utf8'
'character_sets_dir', 'C:\Program Files\MySQL\MySQL Server 5.1\share\charsets\'

SHOW CREATE TABLE boten;
'boten', 'CREATE TABLE `boten` (
  `BID` int(11) NOT NULL DEFAULT ''0'',
  `Bootnaam` char(25) NOT NULL DEFAULT '''',
  `LigPlaats` char(25) DEFAULT NULL,
  `Land` char(30) NOT NULL DEFAULT '''',
  `Haven` char(25) DEFAULT NULL,
  `WerfofType` char(30) DEFAULT NULL,
  `KleurRomp` char(17) NOT NULL DEFAULT ''W'',
  `KleurZeilen` char(15) NOT NULL DEFAULT ''W'',
  `KleurSpinnaker` char(15) DEFAULT NULL,
  `ZeilTeken` char(12) DEFAULT NULL,
  `Kiel` char(4) NOT NULL DEFAULT ''vk'',
  `Klapschroef` char(10) NOT NULL DEFAULT ''geen'',
  `LOA` smallint(6) NOT NULL DEFAULT ''0'',
  `LWL` smallint(6) NOT NULL DEFAULT ''0'',
  `BOA` smallint(6) NOT NULL DEFAULT ''0'',
  `H` smallint(6) NOT NULL DEFAULT ''0'',
  `D` smallint(6) NOT NULL DEFAULT ''0'',
  `D2` smallint(6) NOT NULL DEFAULT ''0'',
  `Wtot` smallint(6) NOT NULL DEFAULT ''0'',
  `Wkl` smallint(6) NOT NULL DEFAULT ''0'',
  `Grz` float NOT NULL DEFAULT ''0'',
  `Gen` float NOT NULL DEFAULT ''0'',
  `Wijzn` char(64) NOT NULL DEFAULT '''',
  `SWhandicap` float DEFAULT NULL,
  `Handicap` float DEFAULT NULL,
  `MMSI` int(11) DEFAULT NULL,
  `MarifoonCallSign` char(6) DEFAULT NULL,
  `MarifoonZeevaart` tinyint(4) NOT NULL DEFAULT ''0'',
  `Radar` tinyint(4) NOT NULL DEFAULT ''0'',
  `GPS` tinyint(4) NOT NULL DEFAULT ''0'',
  `AIS` tinyint(4) NOT NULL DEFAULT ''0'',
  KEY `BID` (`BID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8'

SELECT Bootnaam, LENGTH(Bootnaam), HEX(Bootnaam) FROM boten WHERE BID=357;
'Den Røde Båd', '18', '44656E2052C383C2B864652042C383C2A564'

SELECT Haven, LENGTH(Haven), HEX(Haven) FROM boten WHERE BID=357;
'Port Zélande', '15', '506F7274205AC383C2A96C616E6465'

6.Execute as script file in Query Browser: OK
SHOW VARIABLES LIKE '%char%';
see above

SHOW CREATE TABLE boten;
see above

SELECT Bootnaam, LENGTH(Bootnaam), HEX(Bootnaam) FROM boten WHERE BID=357;
'Den Røde Båd', '14', '44656E2052C3B864652042C3A564'

SELECT Haven, LENGTH(Haven), HEX(Haven) FROM boten WHERE BID=357;
'Port Zélande', '13', '506F7274205AC3A96C616E6465'
[23 Feb 2010 8:50] Susanne Ebrecht
Many thanks for your feedback.

MySQL server 4.1 run out of support already years ago. MySQL Workbench only support server 5.1 and higher.

Anyway, I am not able to follow you on which servers you did the select:

SELECT Bootnaam, LENGTH(Bootnaam), HEX(Bootnaam) FROM boten WHERE BID=357;
'Den Røde Båd', '18', '44656E2052C383C2B864652042C383C2A564'

SELECT Haven, LENGTH(Haven), HEX(Haven) FROM boten WHERE BID=357;
'Port Zélande', '15', '506F7274205AC383C2A96C616E6465'

These data are stored wrong in the database. Could be related to an old bug that we have had in 4.1 and 5.0 and which we fixed in 5.1.

SELECT Bootnaam, LENGTH(Bootnaam), HEX(Bootnaam) FROM boten WHERE BID=357;
'Den Røde Båd', '14', '44656E2052C3B864652042C3A564'

SELECT Haven, LENGTH(Haven), HEX(Haven) FROM boten WHERE BID=357;
'Port Zélande', '13', '506F7274205AC3A96C616E6465'

These data are stored correct in the database.
[23 Feb 2010 13:08] Peter Veger
I exported, from hosting ISP server with rather old software, as sql-file in utf-8.
I then used only my own system (with absolutely uptodate latest software: Win7, MySQL server, Workbench, Query Browser).

Import of utf-8 file through Workbench gave KO results:
'Den Røde Båd', '18', '44656E2052C383C2B864652042C383C2A564' and
'Port Zélande', '15', '506F7274205AC383C2A96C616E6465'

Import of the same file through Query Browser gave OK results:
'Den Røde Båd', '14', '44656E2052C3B864652042C3A564'
'Port Zélande', '13', '506F7274205AC3A96C616E6465'
[5 Mar 2010 10:21] Susanne Ebrecht
I think it is because your file is latin1 and not utf8.

You need to convert your file first to utf8 before uploading.
[8 Mar 2010 13:45] Peter Veger
hexadecimal view

Attachment: UEdit0.PNG (image/png, text), 7.28 KiB.

[8 Mar 2010 13:46] Peter Veger
direct view

Attachment: UEdit1.PNG (image/png, text), 5.28 KiB.

[8 Mar 2010 13:51] Peter Veger
It is utf-8:
see UEdit-files.
See extract from BabelPad:
'Den Røde Båd'
'Port Zélande'
One and the same file works through Query Browser, not through Workbench.
Is Query Browser clairvoyant? Workbench blind?
One and the same file!
[10 Mar 2010 12:09] Susanne Ebrecht
Many thanks for your feedback.

MySQL Server 5.0 has had a problem with double encoding utf8 data.

This bug is fixed in younger servers ... anyway it still was in 5.0.51a.

Also MySQL server 5.0 is already out of support.

http://www.mysql.com/about/legal/lifecycle/

Please test if you have same problems by using MySQL server 5.1.
[10 Mar 2010 19:23] Peter Veger
As I already indicated in my message of 20 feb:
I used the following versions on my own system:
Win7Ult64 8.1.7600, MySQL 5.1.40-community, Workbench 5.2.15 OSS Beta 5053,
Query Browser 1.2.17
[25 Apr 2010 15:39] Sveta Smirnova
Thank you for the feedback.

I can not repeat described behavior. Please try with version 5.2.19 and if problem still exists add query SET NAMES utf8 in the beginning of the file, try again and inform us if problem still repeatable for you.
[25 May 2010 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[8 Feb 2022 13:21] Valerio Messina
import dialog encoding

Attachment: MySQL_Workbench_8.0.28_import.png (image/png, text), 16.03 KiB.

[8 Feb 2022 13:25] Valerio Messina
I'm on Win10 with MySQL Workbench 8.0.24 64 bit

I have a table defined as so:
Charset/Collation: utf8   utf8_bin
Engine: InnoDB
with some fields defined with Datatypes: VARCHAR(1024)

Importing a csv file saved as UTF-8 (with or without BOM), always results in garbage for non ASCII chars, independent of Import Dialog encoding settings, see attach.

Importing a csv file saved as ANSI (checked with Notepad), always results in right character for non ASCII chars, independent of Import Dialog encoding settings, see attach.
[10 Feb 2022 17:31] Valerio Messina
link to: https://bugs.mysql.com/bug.php?id=95700
[16 Mar 2022 12:51] Valerio Messina
any news on UTF-8 support?
[11 Apr 2022 9:41] Valerio Messina
it seems very important to me that the Workbench is able to import files in UTF-8 format correctly.
We are in 2022 and the UTF-8 format is now the dominant format, not only among Unicode encodings, but also with respect to the ASCII format and the old ANSI, Windows-1252, and ISO-8859-1 codepages, practically only the one for text files:
https://en.wikipedia.org/wiki/File:UTF-8_takes_over.png

So please fix this issue with extended / multi-byte characters, it will require little effort, as Windows supports UTF-8 encoding from Windows XP onwards, and on Linux it is the default from the beginning.

Tested with the
Workbench Version 8.0.28 build 1474738 CE (64 bits)