Bug #95415 Table Data Import Wizard fails on UTF-8 encoded file with BOM
Submitted: 19 May 2019 8:05 Modified: 7 Feb 2022 10:47
Reporter: Georgi Sotirov Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Workbench Severity:S3 (Non-critical)
Version:8.0.16, 8.0.18 OS:Windows (Windows 10 Home)
Assigned to: CPU Architecture:x86

[19 May 2019 8:05] Georgi Sotirov
Description:
In MySQL Workbench 8.0.16 the Table Data Import Wizard from the context menu of a table cannot parse a UTF-8 encoded file with BOM (Byte Order Mark). Error "Can't analyze file. Please, try to change encoding type. If that doesn't help, maybe the file is not: csv, or the file is empty.". There is no such problem if the file is UTF-8 encoded without BOM.

How to repeat:
1. Crete a CSV file like the attached cities-BOM.csv with a CSV editor like Microsoft Excel 365.
2. In a test database create a table like this:

CREATE TABLE cities (
  id    INT(4)      NOT NULL,
  name  VARCHAR(64) NOT NULL,

  PRIMARY KEY (id)
)
ENGINE = InnoDB;

3. Open Workbench, find the table in Navigator, right click and select Table Data Import Wizard.

4. Select the file cities-BOM.csv and click Next. On the next page click Next.
5. When the third wizard page opens you get the error and no columns are displayed.

If you repeat the same with the attached file cities.csv (i.e. without BOM) there is no such problem - the file is parsed properly, columns are detected and data could be imported successfully.

Suggested fix:
Table Data Import Wizard should not fail on CSV files encoded in UTF-8 with BOM.
[19 May 2019 8:06] Georgi Sotirov
UTF-8 encoded CSV with BOM

Attachment: cities-BOM.csv (application/octet-stream, text), 551 bytes.

[19 May 2019 8:07] Georgi Sotirov
UTF-8 encoded CSV without BOM

Attachment: cities.csv (application/octet-stream, text), 548 bytes.

[19 May 2019 8:08] Georgi Sotirov
Wizard error

Attachment: wb_8016_data_import_error-utf8-bom.png (image/png, text), 23.65 KiB.

[20 May 2019 5:57] MySQL Verification Team
Hello Georgi Sotirov,

Thank you for the report and feedback.
Verified as described with WB 8.0.16 on Win10.

Thanks,
Umesh
[27 May 2019 10:08] Georgi Sotirov
This seems to be an old problem. It was reported more than 4 years ago for Workbench (see bug 74914). Today, I also tried LOAD DATA INFILE and I confirm the same problem, which apparently also was reported long ago (see bugs 4960 and 10573).

mysql> LOAD DATA INFILE '/var/mysql/files/cities.csv' INTO TABLE cities FIELDS TERMINATED BY ';';
Query OK, 27 rows affected (0.05 sec)
Records: 27  Deleted: 0  Skipped: 0  Warnings: 0

mysql> LOAD DATA INFILE '/var/mysql/files/cities-BOM.csv' INTO TABLE cities FIELDS TERMINATED BY ';';
ERROR 1366 (HY000): Incorrect integer value: '2700' for column 'id' at row 1
[5 Jun 2019 12:12] Rami Amiry
I have the same issue after upgrading workbench to 8.0.16. 
Is there any fix or workaround for this issue?
[24 Nov 2019 3:42] nor fatin
I have the same issue after upgrading workbench to 8.0.16. I'm using Mac v 10.14.6.
Is there any fix or workaround for this issue?
[24 Nov 2019 3:45] nor fatin
I have the same issue after upgrading MySQL Workbench to 8.0.18. I'm using Mac v 10.14.6. 
Is there any fix or workaround for this issue?
[1 Dec 2019 21:22] brandon fox
I also can confirm this is happening to me on Workbench 8.0.18. Is there a fix coming?
[19 Dec 2019 8:06] John King
I opened a new bug #98030. This should be addressed.
[19 Dec 2019 9:16] MySQL Verification Team
Bug #98030 marked as duplicate of this one
[5 Mar 2020 17:34] Dmitriy Mingaylov
Same issue in 8.0.19
[22 Apr 2020 21:17] Kareem Hutchinson
I am having the same issue on 8.0.19! Please fix this.
[10 Jun 2020 18:22] Joachim Skale
same issue on 8.0.20 build 16114288
[24 Jun 2020 12:56] Erik Sturesson
Same issue here, version 8.0.20 build 16114288
[24 Jun 2020 14:13] Erik Sturesson
For now I just changed my letters:
å = a
ä = a
ö = o

Then it worked but actually this is not a good solution. A good error message would be nice to see as well.
[24 Jun 2020 14:13] Erik Sturesson
For now I just changed my letters:
å = a
ä = a
ö = o

Then it worked but actually this is not a good solution. A good error message would be nice to see as well.
[20 Mar 2021 17:47] Igor Willems
This is a huge bug. Basically makes the file import feature useless for anything containing non basic chars. Please fix asap.
[28 Apr 2021 21:28] William Karavites
Still happens in 8.0.24.

Why is this still an issue?
[5 May 2021 7:29] John King
This is still an issue.
[17 May 2021 15:22] Adam Simonini
I'm learning MySQL atm. Based on this bug, I cannot be confident if it's my misunderstanding at play, or indeed a bug in the database. I am trying to import a list of accented French characters, and no matter how much I research UTF8, or rather "utf8mb4", I cannot solve the issue...
[27 Oct 2021 13:20] Dimitris Vayenas
Is there any progress on this issue?
[2 Jan 2022 23:02] Thiago Vasconcelos
yep still an issue, don't think it will be solved any time soon.
[6 Feb 2022 10:56] John King
This is not a huge bug just annoying. There are plenty of software that will remove a BOM from the file. I sucessfully use Notepad++, but for someone with basic C/C++/Python/AWK etc skills it would be easy to implement a filter removing the first three bytes or in vim
:set nobomb
:wq
[7 Feb 2022 10:47] Georgi Sotirov
We know the workarounds John King, but this is still a bug even if not "huge" and solely for the inconvenience to have to convert files before importing them...