Bug #81727 Warn It's Unsafe to Execute LOA DATA without Current DB
Submitted: 5 Jun 2016 8:10
Reporter: Mikiya Okuno Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.7.12 OS:Any
Assigned to: CPU Architecture:Any

[5 Jun 2016 8:10] Mikiya Okuno
Description:
On slave SQL thread, a collation retrieved from binlog event from master is used for a database collation if no database is set as a current database. It will cause a collation/charset mismatch for LOAD DATA command on master and slave. IMHO, LOAD DATA should warn it's unsafe if following conditions are met:

* no current database is set.
* binlog format is statement.
* no explicit character set is set.

How to repeat:
* Setup a replication
* Connect to a master without current db
* Create table and populate it, then dump it to a file

mysql> CREATE DATABASE test;
mysql> CREATE TABLE test.t (a SERIAL, b VARCHAR(100)) CHARACTER SET utf8mb4;
mysql> INSERT INTO test.t (b) VALUES('こんにちは');
mysql> SELECT * INTO OUTFILE 't.txt' FROM test.t;
mysql> TRUNCATE test.t;

* Change server collation temporarily

mysql> SET collation_server = cp932_japanese_ci;
mysql> SET collation_database = utf8mb4_general_ci;

I set collation_database just for explanation here. I know it's deprecated. If it's not set, global level collation_database is used instead.

* Set binlog format to statement

mysql> SET binlog_format = statement;

* Execute LOAD DATA

mysql> LOAD DATA INFILE 't.txt' INTO TABLE test.t;

Now, you will find that the slave is stopped due to an charset error.

Suggested fix:
In sql/sql_db.cc:

bool load_db_opt(THD *thd, const char *path, HA_CREATE_INFO *create)
{
  File file;
  char buf[256];
  DBUG_ENTER("load_db_opt");
  bool error=1;
  uint nbytes;

  memset(create, 0, sizeof(*create));
  create->default_table_charset= thd->variables.collation_server; <====

This is a source of the problem. Setting collation server as table's default character set isn't follow a usual behavior.
[5 Jun 2016 8:18] Mikiya Okuno
Suggested fix #2 is to set LOAD DATA statement unsafe for statement format if no current database is set and not explicit character set is set within LOAD DATA command.