Description:
Replication with additional column on source should not be allowed as it can cause serious data loss on replica. Replica is used as a backup and for HA. When the replica becomes source, this can cause a huge issue with data loss and even if we restore a backup from replica. It even does not notify in error logs. Not even a warning or note about it. So it is impossible to find out differences other than performing checksum.
How to repeat:
Hello MySQL Team,
I found a very strange behaviour of replication. Additionally that behaviour is allowed as per the documentation as well.
Here is the document,
https://dev.mysql.com/doc/refman/8.4/en/replication-features-differing-tables.html
The document mentions that,
More columns on the source. The following table definitions are valid and replicate correctly:
source> CREATE TABLE t1 (c1 INT, c2 INT, c3 INT);
replica> CREATE TABLE t1 (c1 INT, c2 INT);
Same behaviour can be reproduced on testing as well,
On master I have this table structure
mysql> show create table t1\G
*************************** 1. row ***************************
Table: t1
Create Table: CREATE TABLE `t1` (
`id` int NOT NULL AUTO_INCREMENT,
`name` varchar(20) DEFAULT NULL,
`last` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)
On Replica I have,
mysql> show create table t1\G
*************************** 1. row ***************************
Table: t1
Create Table: CREATE TABLE `t1` (
`id` int NOT NULL AUTO_INCREMENT,
`name` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.01 sec)
I added additional column (last) on source, with sql_log_bin=off. That is why it is not present on replica.
I run the insert statement on Source now
mysql> insert into t1 values (2,'yunus1','test');
Query OK, 1 row affected (0.01 sec)
Lets see Source and replica
Source
mysql> select * from t1;
+----+--------+------+
| id | name | last |
+----+--------+------+
| 1 | yunus | NULL |
| 2 | yunus1 | test |
+----+--------+------+
2 rows in set (0.00 sec)
Replica
mysql> select * from t1;
+----+--------+
| id | name |
+----+--------+
| 1 | yunus |
| 2 | yunus1 |
+----+--------+
2 rows in set (0.00 sec)
We lost the data for column last on the Replica without even knowing that there is a data loss. Replication works fine. It is documented similarly how it behaves if the columns are in proper order.
But the question is why it is allowed to have more columns on Source? This is causing a data loss silently.
Replica is used for reads, backups and HA. All the cases, this is a huge problem.
Backup will not have additional column data.
Reads won't be having proper data.
HA also wont work well as if we promote the replica to source, the source will have data loss.
Overall this is a problem with data-loss.
This needs more details on how it is allowed and if this behaviour is good then how to solve the above problems in that case.
Suggested fix:
More columns on Source should not be allowed.
If it is allowed on replica, thats fine. Replica can have more columns but source should not have.