Bug #106113 Contribution by Tencent: insert on duplicate key cause 1032 replication error in
Submitted: 8 Jan 2022 6:52 Modified: 8 Jan 2022 7:39
Reporter: Xiaodong Huang (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:8.0.27, 5.7.36 OS:Any
Assigned to: CPU Architecture:x86
Tags: Contribution

[8 Jan 2022 6:52] Xiaodong Huang
Description:
When the variable slave_rows_search_algorithms= "TABLE_SCAN,INDEX_SCAN,HASH_SCAN”, By using "insert on duplicate key update" , updating same row more then one times will cause 1032 replication error. The https://bugs.mysql.com/bug.php?id=105802  is the same problem.

How to repeat:
The testcase is as following:

source include/master-slave.inc;
source include/have_binlog_format_row.inc;
connection master;
CREATE TABLE `t1` (`id` int(11) DEFAULT NULL, `test` varchar(255) DEFAULT 'hallo', UNIQUE KEY `id` (`id`)) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `t1` VALUES (1,'foo'),(2,'test'),(3,'lorem');

insert into t1 values(1, 'test'), (1, 'foo'), (1, 'bar') on duplicate key update test=values(test);
source include/sync_slave_sql_with_master.inc;
connection slave;
select * from t1;

connection master;
drop table t1;
source include/sync_slave_sql_with_master.inc;
source include/rpl_end.inc;

The opt file is -slave-rows-search-algorithms='INDEX_SCAN,HASH_SCAN'

Suggested fix:
In the testcase, the statement "insert into t1 values(1, 'test'), (1, 'foo'), (1, 'bar') on duplicate key update test=values(test);" will generate a update_rows_log_event that consist with three update_row_log which is shown in the graph below. In the graph, the before image is the record to be updated and the after image is the record after it is updated.

| update row log | before image(BI) | after image(AI) |
| -------------- | ---------------- | --------------- |
| 1              | (1, 'foo')       | (1, 'test')     |
| 2              | (1, 'test')      | (1, 'foo')      |
| 3              | (1, 'foo')       | (1, 'bar')      |

In order to apply the event, SQL thread will first generate a hash table which is

crc32((1, 'foo’))-->[(1, 'foo') , (1, 'bar’)]-->[(1, 'foo') , (1, 'test’)]

crc32((1, 'test’))-->[(1, 'test') , (1, 'foo')]

Then, it will first find the update row log  [(1, 'foo') , (1, 'bar’)] ,and execute it. so the record (1, 'foo')  will be changed to (1, 'bar’). In next step, the update row log [(1, 'foo') , (1, 'test’)] and [(1, 'test') , (1, 'foo’)] will can’t find the record (1, 'foo') and report 1032 error.

In general, the  commit https://github.com/mysql/mysql-server/commit/dfd8d7d89bfb7cbde7c0e45827bcb611dcb8f36e fix the problem that the one update_rows_log_event have more then one update row log that update at same record e,g. the update_rows_log_event have the row log that update the record A to A-->B-->C-->D. But is don’t consider the situation in which the update_rows_log_event is A-->B-->A-->B.

This patch was contributed by my colleague poempeng and me.
[8 Jan 2022 6:55] Xiaodong Huang
The patch:

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 0001-Fix-1032-replication-error-cased-by-inserting-on-dup.patch (application/octet-stream, text), 2.70 KiB.

[8 Jan 2022 6:58] Xiaodong Huang
The solution is apply logs in the order in which it appear in the event.
[8 Jan 2022 7:39] MySQL Verification Team
Hello Xiaodong,

Thank you for the report and contribution.

regards,
Umesh