Bug #88373 Renaming a column breaks replication from 5.7 to 8.0 because of impl. collation
Submitted: 6 Nov 9:39 Modified: 7 Nov 9:08
Reporter: Daniël van Eeden (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:8.0.3 OS:Any
Assigned to:

[6 Nov 9:39] Daniël van Eeden
Description:
Setup:
 - Master running 5.7.18
 - Slave running 8.0.3

Table (as shown on 5.7 by show create table):
CREATE TABLE `foobar` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `foo` varchar(100) CHARACTER SET utf8mb4 NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id_foo` (`foo`),
) ENGINE=InnoDB AUTO_INCREMENT=12345 DEFAULT CHARSET=latin1;

Then run:
ALTER TABLE `foobar` CHANGE `foo` `old_foo` varchar(100) CHARACTER SET utf8mb4 NOT NULL;

Expected result:
 - A renamed column

Actual result:
 - Replication broken with duplicate key error

Note:
 - 8.0 has ALTER TABLE .. RENAME COLUMN, but as the master runs 5.7 that's not available yet.

mysql80> show collation where `Default`='Yes' and Charset='utf8mb4';
+--------------------+---------+-----+---------+----------+---------+---------------+
| Collation          | Charset | Id  | Default | Compiled | Sortlen | Pad_attribute |
+--------------------+---------+-----+---------+----------+---------+---------------+
| utf8mb4_0900_ai_ci | utf8mb4 | 255 | Yes     | Yes      |       0 | NO PAD        |
+--------------------+---------+-----+---------+----------+---------+---------------+
1 row in set (0.00 sec)

myql57> show collation where `Default`='Yes' and Charset='utf8mb4';
+--------------------+---------+----+---------+----------+---------+
| Collation          | Charset | Id | Default | Compiled | Sortlen |
+--------------------+---------+----+---------+----------+---------+
| utf8mb4_general_ci | utf8mb4 | 45 | Yes     | Yes      |       1 |
+--------------------+---------+----+---------+----------+---------+
1 row in set (0.00 sec)

The show create table output on 8.0 shows utf8mb4_general_ci is used on this column. That's the 5.7 default (so hidden from show create table output), but is not the default on 8.0.

As no collation is specified the 8.0.3 machine applies the default (utf8mb4_0900_ai_ci) which is different to the upstream master.

The 'fix' is to use:
- SET GLOBAL collation_server = 'utf8mb4_general_ci';
- SET PERSIST collation_server = 'utf8mb4_general_ci';

It would be nice if this would work out of the box.
Also this should be mentioned on https://dev.mysql.com/doc/refman/8.0/en/replication-upgrade.html

How to repeat:
See description.

Suggested fix:
1. Put the default collation in the binlogs for alter statements.
2. Always show the collation even if it is the default (helps to create a MODIFY COLUMN statement without side-effects)
3. Detect that the statement comes from a 5.7 master and apply 5.7 defaults.
4. Make the default
[6 Nov 12:41] Umesh Shastry
Hello Daniël,

Thank you for the report and feedback.
I tried to reproduce issue using 5.7.18/5.7.20(master - non-gtid with default settings) -> 8.0.3(slave - non-gtid with default settings) with provided table, later by altering column on master(before that added few rows in order to break - I followed to see if it helps in reproducing duplicate issues http://mysqlserverteam.com/mysql-8-0-collations-the-devil-is-in-the-details/).

Could you please provide exact data(subset of data in INSERT.. format) which could help me trigger this issue at my end? Thank you!

Thanks,
Umesh
[6 Nov 15:01] Daniël van Eeden
looks like my fix doesn't really work..

I can't share the real data but it looked similar to this:
foobar@example.com (0x666F6F626172406578616D706C652E636F6D)
foobar@example.comً  (0x666F6F626172406578616D706C652E636F6DD98B)
[6 Nov 15:11] Daniël van Eeden
Also related to:
Bug #80792 	ALTER TABLE should support ALGORITHM=METADATA

(this is a metadata only change on the 5.7 master, but not on the 8.0 slave)
[7 Nov 6:34] Shane Bester
For point (2):

https://bugs.mysql.com/bug.php?id=46239 
(mysql doesn't show what collation was used if that collation is the default)
[7 Nov 9:08] Umesh Shastry
Thank you, finally observed this at my end.

i) 5.7.18(master) -> 8.0.3(slave) - I'm not seeing any issues i.e ALTER happily replicates etc no issues.
ii) 5.7.18(master) -> 5.7.18(slave), upgrade slave to 8.0.3 and then ALTER on master triggers broken replication

              Last_SQL_Error: Error 'Duplicate entry 'foobar@example.comً' for key 'id_foo'' on query. Default database: 'test'. Query: 'ALTER TABLE `foobar` CHANGE `foo` `old_foo` varchar(100) CHARACTER SET utf8mb4 NOT NULL'

Thanks,
Umesh
[7 Nov 9:22] Umesh Shastry
test results

Attachment: 88373_5.7.18_8.0.3.results (application/octet-stream, text), 18.45 KiB.

[15 Dec 12:17] Pedro Figueiredo
Posted by developer:
 
The described behaviour is also identified when applying above provided steps to a single server.

Steps to reproduce
------------------
1) Run mysqld with version 5.7
2) Create a database
	
    CREATE DATABASE collation_test;
    USE collation_test;

3) Create table

    CREATE TABLE `foobar` (
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `foo` varchar(100) CHARACTER SET utf8mb4 NOT NULL,
    PRIMARY KEY (`id`),
    UNIQUE KEY `id_foo` (`foo`)
    ) ENGINE=InnoDB AUTO_INCREMENT=12345 DEFAULT CHARSET=latin1;

4) Insert provided data

    INSERT INTO foobar (foo) 
           VALUES (0x666F6F626172406578616D706C652E636F6D);
    INSERT INTO foobar (foo) 
           VALUES (0x666F6F626172406578616D706C652E636F6DD98B);
 
5) Upgrade mysqld to version 8.0, attaching the 8.0 mysqld to 
   the existent setup and running 'mysql_upgrade' for that 
   instance, e.g.

   $ mysql_upgrade -u root --socket=/tmp/mysqld.sock

8) Change the column with the statement

    ALTER TABLE `foobar` CHANGE `foo` `old_foo` varchar(100) 
    CHARACTER SET utf8mb4 NOT NULL;

Expected result
---------------
The expected result would be the changed row (behaviour verified
with no upgrade):

    mysql> ALTER TABLE `foobar` CHANGE `foo` `old_foo` varchar(100)
           CHARACTER SET utf8mb4 NOT NULL;
    Query OK, 0 rows affected (0.02 sec)

Obtained result
---------------
The actual result is an error due to duplicate keys:

    mysql> ALTER TABLE `foobar` CHANGE `foo` `old_foo` varchar(100)
           CHARACTER SET utf8mb4 NOT NULL;                 
    ERROR 1062 (23000): Duplicate entry 'foobar@example.comً' for key 'id_foo'