Bug #80150 Restoring a dump of binary column, made with mysqldump, warns of invalid string
Submitted: 26 Jan 2016 9:05 Modified: 13 Apr 2018 13:36
Reporter: Guilhem Bichot Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: mysqldump Command-line Client Severity:S3 (Non-critical)
Version:5.6.28,5.7.11 OS:Any
Assigned to: CPU Architecture:Any

[26 Jan 2016 9:05] Guilhem Bichot
Description:
Detached from bug#79317.

Repeated with 5.7.11 (by me), and 5.6.28 (by user). Likely present in the latest 5.5 too.
Consequence of the fix for Bug#20238729
( https://github.com/mysql/mysql-server/commit/33a2e5abd ) .

I used a BINARY column but BLOB would show it as well:

create table t(a binary(16));
insert into t values(x'EE0C6D03C34C11E5B1640026B977EB17');
select hex(a) from t;
EE0C6D03C34C11E5B1640026B977EB17

All correct, now dump it:

$ ../client/mysqldump -uroot -S var/tmp/mysqld.1.sock test >/tmp/dump.sql

$ cat /tmp/dump.sql 
-- MySQL dump 10.13  Distrib 5.7.12, for Linux (x86_64)
...
/*!40101 SET NAMES utf8 */;
...
INSERT INTO `t` VALUES ('\<EE>^Lm^C\<C3>L^Q\<E5><B1>d\0&<B9>w\<EB>^W');

So, raw bytes are in the dump (like ^L).
Now, load this dump with MySQL command-line client ('mysql --show-warnings'):
mysql> source /tmp/dump.sql
...
Warning (Code 1300): Invalid utf8 character string: 'EE0C6D'

The content is correctly inserted, fortunately:
mysql> select hex(a) from t;
EE0C6D03C34C11E5B1640026B977EB17

So the problem is this warning during restore: it is scaring for users.

NB: I imagine mysqlpump could also be affected.

Workaround: run mysqldump with option --hex-blob. It might also possible to alter an existing dump by replacing "SET NAMES UTF8" with "SET NAMES BINARY", but I'm not sure of all consequences so I discourage it.

How to repeat:
see above.

Suggested fix:
Mysqldump could print "_binary" in front of strings which a data of a binary column. Or make --hex-blob the only possible behaviour. Or determine if SET NAMES BINARY is a good idea (for example see http://dev.mysql.com/doc/refman/5.7/en/charset-connection.html ).
[26 Jan 2016 10:44] MySQL Verification Team
Thank you for the bug report.

........

Query OK, 1 row affected, 1 warning (0.11 sec)

Warning (Code 1300): Invalid utf8 character string: 'EE0C6D'
Query OK, 0 rows affected (0.00 sec)

........

Query OK, 0 rows affected (0.00 sec)

mysql> show variables like "%version%";
+-------------------------+---------------------------------------+
| Variable_name           | Value                                 |
+-------------------------+---------------------------------------+
| innodb_version          | 5.7.12                                |
| protocol_version        | 10                                    |
| slave_type_conversions  |                                       |
| tls_version             | 1                                     |
| version                 | 5.7.12-debug                          |
| version_comment         | Source distribution PULL: 2016-JAN-14 |
| version_compile_machine | x86_64                                |
| version_compile_os      | Win64                                 |
+-------------------------+---------------------------------------+
8 rows in set (0.00 sec)
[26 Apr 2016 13:24] Rob Wagner
Confirmed this occurs in latest 5.5, as well.  --hex-blob is a valid workaround.

mysql> select version();
+-------------------------------------------+
| version()                                 |
+-------------------------------------------+
| 5.5.49-enterprise-commercial-advanced-log |
+-------------------------------------------+
1 row in set (0.00 sec)
[7 Mar 2017 12:38] Frank Ullrich
This is scaring indeed! 
Cost me 2 days of blood, sweat and tears before I discovered this bug report!
We are on 5.7.17 meanwhile and the bug still exists ...

Furthermore: I disagree on the conclusion. 
--hex-blob is not a reliable workaround for me because an import of one particular database dump (there could be more - I don't know yet) caused the popular 'MySQL has gone away' symptom.
Whereas the Code-1300-warning-throwing dump ran through.
[7 Mar 2017 13:11] Guilhem Bichot
Posted by developer:
 
Because this "false warning" occurs during backup restoration, which is already a stressful moment for a DBA, and because a user just recently had this problem and said it cost him two days, I ask for re-triage. Moreover, I think that a possible fix (making mysqldump print "_binary" in front of binary strings) is easy to implement for us.
[7 Mar 2017 14:14] Guilhem Bichot
Posted by developer:
 
"mysql has gone away" is usually a symptom of a crash; if --hex-blob does that it's serious (but we would then need a testcase) and makes it not a workaround, indeed.
[7 Mar 2017 14:17] MySQL Verification Team
with hex-blob the dump size is much larger. set max-allowed-packet to 1G on both server and client...
[20 Feb 2018 23:23] Andrew Wozniewicz
Confirming that this bug also/still exists on 5.5. It should be fixed. We have wasted significant effort on chasing it.
[7 Mar 2018 8:55] Guilhem Bichot
Can get this problem with GIS column too. Verified with trunk server and clients:

create table geom(a geometry);
# valid query taken from
# https://dev.mysql.com/doc/refman/5.7/en/populating-spatial-columns.html
INSERT INTO geom VALUES (ST_GeomFromText('POINT(1 1)'));

now:
mysqldump -uroot -S /m/tmp/zo/sock test >/tmp/dump.sql

now:

mysql> source /tmp/dump.sql
...
INSERT INTO `geom` VALUES ('\0\0\0\0\0\0\0\0\0\0\0\0\0\�?\0\0\0\0\0\0\�?')
--------------

Query OK, 1 row affected, 1 warning (0,05 sec)

Warning (Code 1300): Invalid utf8mb4 character string: 'F03F00'
[7 Mar 2018 10:53] Guilhem Bichot
GIS user got the problem:
https://forums.mysql.com/read.php?20,654893,654893
[28 Mar 2018 9:26] Guilhem Bichot
We're working on fixing this in 5.7 and 8.0.
[13 Apr 2018 13:36] Paul DuBois
Posted by developer:
 
Fixed in 5.7.23, 8.0.12.

If mysqldump or mysqlpump were used to dump binary data without the
--hex-blob option, reloading the dump file could produce spurious
warnings (values were inserted correctly regardless of the warnings).
Such values are now written preceded by the _binary introducer to
silence the warnings.