Bug #83417 After updating from 5.5.50 to 5.5.52 We now get Error 1300 with LOCAL INFILE
Submitted: 17 Oct 2016 22:56 Modified: 19 Oct 2016 20:53
Reporter: Pittle Fart Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:5.5.52 OS:Ubuntu (14.04)
Assigned to: CPU Architecture:Any
Tags: utf8

[17 Oct 2016 22:56] Pittle Fart
Description:
After the update via the Ubuntu Software Centre we now get an error when loading data that worked before the update.

How to repeat:
Load data using LOAD DATA LOCAL INFILE.  All database/table charset settings are set to UTF-8.  The infile contains latin1 specific characters (0x96 and 0xAE).  When it hits these characters the load fails with the following error:
ERROR 1300 (HY000) at line 1: Invalid utf8 character string: '"census '

The load didn't error out with exact same load before the update to 5.5.52.

Suggested fix:
Return to handling this situation the same as it was before the update.
[18 Oct 2016 12:14] MySQL Verification Team
Hello  Pittle,

Thank you for the report.
Could you please provide exact repeatable test case(create table statement and load file)? You may want to mark it as private after uploading the test case here. If you are trying to insert invalid utf8 data then I would say it is expected behavior after the fix Bug #78758, and 5.5.51 change log - For multibyte character sets, LOAD DATA could fail to allocate space correctly and ignore input rows as a result. (Bug #76237, Bug #20683959, Bug #23080148)

References: This issue is a regression of: Bug #14653594

== I tried with provided dummy inputs but no issues found

mysql> create database bug default charset utf8;
Query OK, 1 row affected (0.00 sec)

mysql> use bug
Database changed
mysql> create table t1 (val text);
Query OK, 0 rows affected (0.00 sec)

mysql> LOAD DATA INFILE '/tmp/ascii.txt' INTO TABLE t1;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 0

mysql> select * from t1;
+------+
| val  |
+------+
| 0x96 |
| 0xAE |
+------+
2 rows in set (0.00 sec)

mysql> \! cat /tmp/ascii.txt
0x96
0xAE

Thanks,
Umesh
[18 Oct 2016 19:04] Pittle Fart
To recreate the problem: 
1. Copy the provided files to some directory.
2. cd to that directory and run the following command:
   mysql -u youruser -pyourpassword -D yourdatabase --local-infile < csv_load_script.sql
[18 Oct 2016 19:05] Pittle Fart
Data

Attachment: testData.csv (text/csv), 50 bytes.

[18 Oct 2016 19:05] Pittle Fart
table definition

Attachment: test_tbl.ddl (application/octet-stream, text), 77 bytes.

[18 Oct 2016 19:06] Pittle Fart
Load Script

Attachment: csv_load_script.sql (application/sql, text), 172 bytes.

[18 Oct 2016 19:07] Pittle Fart
Oops...you will also need to create the table in the test_tbl.ddl file before running the mysql statement.
[19 Oct 2016 8:23] MySQL Verification Team
Thank you for providing the requested details, however, I'm not seeing this issue at my end on 5.5.52 or even with latest GA 5.5.53

## 5.5.52

[umshastr@hod03]/export/umesh/server/binaries/GABuilds/mysql-5.5.52: bin/mysql -uroot -S /tmp/mysql_ushastry.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.52-enterprise-commercial-advanced MySQL Enterprise Server - Advanced Edition (Commercial)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use test
Database changed
mysql> Create Table test (
    -> label varchar(100)
    -> ) Engine=InnoDB Default Charset=utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> \q
Bye
[umshastr@hod03]/export/umesh/server/binaries/GABuilds/mysql-5.5.52: bin/mysql -uroot -S /tmp/mysql_ushastry.sock -D test --local-infile < csv_load_script.sql
[umshastr@hod03]/export/umesh/server/binaries/GABuilds/mysql-5.5.52: bin/mysql -uroot -S /tmp/mysql_ushastry.sock test -e "select * from test"
+--------------------------------+
| label                          |
+--------------------------------+
| Diet– Coke®                    |
| "Pepsi®  Taste– Great!"

    |
+--------------------------------+

## 5.5.53

[umshastr@hod03]/export/umesh/server/binaries/GABuilds/mysql-5.5.53: bin/mysql -uroot -S /tmp/mysql_ushastry.sock -D test --local-infile < csv_load_script.sql
[umshastr@hod03]/export/umesh/server/binaries/GABuilds/mysql-5.5.53: bin/mysql -uroot -S /tmp/mysql_ushastry.sock test -e "select * from test"
+--------------------------------+
| label                          |
+--------------------------------+
| Diet– Coke®                    |
| "Pepsi®  Taste– Great!"

    |
+--------------------------------+
[19 Oct 2016 8:24] MySQL Verification Team
My charset details:

mysql> show variables like '%char%';
+--------------------------+---------------------------------------------------------------------+
| Variable_name            | Value                                                               |
+--------------------------+---------------------------------------------------------------------+
| character_set_client     | utf8                                                                |
| character_set_connection | utf8                                                                |
| character_set_database   | latin1                                                              |
| character_set_filesystem | binary                                                              |
| character_set_results    | utf8                                                                |
| character_set_server     | latin1                                                              |
| character_set_system     | utf8                                                                |
| character_sets_dir       | /export/umesh/server/binaries/GABuilds/mysql-5.5.52/share/charsets/ |
+--------------------------+---------------------------------------------------------------------+
8 rows in set (0.01 sec)

mysql> show variables like '%col%';
+---------------------------+-------------------+
| Variable_name             | Value             |
+---------------------------+-------------------+
| collation_connection      | utf8_general_ci   |
| collation_database        | latin1_swedish_ci |
| collation_server          | latin1_swedish_ci |
| protocol_version          | 10                |
| slave_compressed_protocol | OFF               |
+---------------------------+-------------------+
5 rows in set (0.00 sec)

If you can provide more information, feel free to add it to this bug and change the status back to 'Open'.

Thank you for your interest in MySQL.
[19 Oct 2016 20:53] Pittle Fart
The only difference I see is you have some of your character set variables set to latin1.  Here are my settings:
"show variables like '%char%';"
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
"show variables like '%col%';"
+---------------------------+-----------------+
| Variable_name             | Value           |
+---------------------------+-----------------+
| collation_connection      | utf8_general_ci |
| collation_database        | utf8_unicode_ci |
| collation_server          | utf8_unicode_ci |
| protocol_version          | 10              |
| slave_compressed_protocol | OFF             |
+---------------------------+-----------------+

Thanks!