Bug #100518 Auto-increment value in UPDATE conflicts with internally generated values
Submitted: 13 Aug 2020 15:08 Modified: 30 Aug 2020 21:18
Reporter: Paul Smith Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.7.29 OS:CentOS
Assigned to: CPU Architecture:Any

[13 Aug 2020 15:08] Paul Smith
Description:
We have an issue which may be related to Bug #70583.
We've got 2 MySQL servers running in master-master statement based replication. Both version 5.7.29.

Occasionally we get an error when running INSERT ON DUPLICATE KEY UPDATE queries that include `id` = LAST_INSERT_ID(`id`).
We run thousands of these queries each day without issue but a few times per month a single query will error, causing major issues in our production system until replication is re-synced.
We get the same error on multiple different tables but they are all the same type of queries.

Example query:-
INSERT INTO `user_guid`(
	`user_id`, `GUID`, `dtCreated`, `UUID`, `name`, `platform`, `model`, `version`, `sound`
)
VALUES(
	1234, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 1597331015, Null, 'Platform Name', 'Model Name', '0', '1.2.3', Null
)
ON DUPLICATE KEY UPDATE
	`id` = LAST_INSERT_ID(`id`),
	`user_id` = VALUES(`user_id`),
	`GUID` = VALUES(`GUID`),
	`dtCreated` = VALUES(`dtCreated`),
	`UUID` = VALUES(`UUID`),
	`name` = VALUES(`name`),
	`platform` = VALUES(`platform`),
	`model` = VALUES(`model`),
	`version` = VALUES(`version`),
	`sound` = VALUES(`sound`)

Table structure:-
CREATE TABLE `user_guid` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int(11) DEFAULT NULL,
  `GUID` varchar(500) DEFAULT NULL,
  `dtCreated` int(11) DEFAULT NULL,
  `UUID` varchar(255) DEFAULT NULL,
  `name` varchar(100) DEFAULT NULL,
  `platform` varchar(20) DEFAULT '100',
  `model` varchar(100) DEFAULT NULL,
  `version` varchar(100) DEFAULT NULL,
  `sound` tinyint(1) DEFAULT '1',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uniqueGUID` (`GUID`),
  KEY `userIndex` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

The last time this happened I tried starting replication without skipping but the error remained.
I then tried running the query manually and it succeeded suggesting the query itself isn't at fault but rather the replication process had an issue.
Attempting to start replication again still failed until I skipped that query.

In Bug #70583, the MySQL Verification Team provided a command to replicate the problem:-
mysqlslap --user=root --password --delimiter=";" --create-schema=test1 --create="CREATE TABLE t(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,k INT,c CHAR(1),UNIQUE KEY(k)) ENGINE=InnoDB;INSERT INTO t(k) VALUES (1), (2), (3) ON DUPLICATE KEY UPDATE id=LAST_INSERT_ID();INSERT INTO t(k) VALUES (2), (4), (5) ON DUPLICATE KEY UPDATE id=10;" --concurrency=10 --iterations=100

I've tried running this command against 5.7.19, 5.7.29 and 8.0.21. All 3 return the error almost immediately but only on the command line.
Using this test I haven't been able to break replication at all but it's slightly different to the queries we are running.
In our case we are only inserting a single row at a time. When only inserting a single row in the mysqlslap test, I haven't seen the error.

How to repeat:
Unfortunately we haven't managed to reproduce the problem on demand.
[30 Aug 2020 21:18] MySQL Verification Team
Thank you for the bug report. To process the bug we need a repeatable test case, please when you will able to provide it, comment here. Thanks in advance.