Bug #8563 Replication of InnoDB table fails with Error_code: 1062 intermittently
Submitted: 17 Feb 2005 6:30 Modified: 17 Feb 2005 19:50
Reporter: Garth Patil Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:4.1.9 OS:Linux (Linux (RHEL 3, x86))
Assigned to: CPU Architecture:Any

[17 Feb 2005 6:30] Garth Patil
Description:
I have two instances of 4.1.9 running on two identical machines. they are configured with the recommended my-large.cnf files included in the distribution, with the appropriate fields uncommented for replication and InnoDB tables.
I have one application writing synchronously over one JDBC connection to the following table:
CREATE TABLE `systemQuery` (
  `queryId` int(10) unsigned NOT NULL auto_increment,
  `MMPQueryId` int(10) unsigned NOT NULL default '0',
  `dateCreated` datetime NOT NULL default '0000-00-00 00:00:00',
  `dateModified` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
  `umda` varchar(75) default NULL,
  `queryString` text NOT NULL,
  `userId` int(10) unsigned default NULL,
  `firstTimeUser` tinyint(3) unsigned NOT NULL default '0',
  `channelId` int(10) unsigned NOT NULL default '0',
  `carrierId` int(10) unsigned default NULL,
  `success` tinyint(3) unsigned NOT NULL default '0',
  `staticResponse` tinyint(3) unsigned NOT NULL default '0',
  `advertisementId` int(10) unsigned default NULL,
  `outputServiceId` int(10) unsigned NOT NULL default '0',
  `timeReceived` timestamp NOT NULL default '0000-00-00 00:00:00',
  `timeSent` timestamp NOT NULL default '0000-00-00 00:00:00',
  `processingTime` int(10) unsigned NOT NULL default '0',
  `locationId` int(10) unsigned default NULL,
  `brokerGuid` varchar(35) default NULL,
  `sentStatus` varchar(35) NOT NULL default '',
  `serviceCode` varchar(20) default NULL,
  PRIMARY KEY  (`queryId`),
  KEY `userId` (`userId`),
  KEY `channelId` (`channelId`),
  KEY `carrierId` (`carrierId`),
  KEY `advertisementId` (`advertisementId`),
  KEY `locationId` (`locationId`),
  CONSTRAINT `systemQuery_ibfk_1` FOREIGN KEY (`userId`) REFERENCES `user` (`userId`),
  CONSTRAINT `systemQuery_ibfk_2` FOREIGN KEY (`channelId`) REFERENCES `channel` (`channelId`),
  CONSTRAINT `systemQuery_ibfk_3` FOREIGN KEY (`carrierId`) REFERENCES `carrier` (`carrierId`),
  CONSTRAINT `systemQuery_ibfk_4` FOREIGN KEY (`advertisementId`) REFERENCES `advertisement` (`advertisementId`),
  CONSTRAINT `systemQuery_ibfk_5` FOREIGN KEY (`locationId`) REFERENCES `location` (`locationId`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

The replication slave will run fine for several minutes, up to several days, but fails intermittently when sending an insert to the table above. The following error appears in the slave database's error log:

050216 21:58:17 [ERROR] Slave: Error 'Duplicate entry '5651' for key 1' on query. Default database: 'transaction'. Query: 'INSERT INTO systemQuery (MMPQueryId,
queryString, umda, brokerGuid, sentStatus, success, staticResponse, timeReceived, timeSent, processingTime, firstTimeUser, userId, channelId, carrierId, outputServiceId, dateCreated) VALUES (413,'AKS','web:sess:F42955251D55393341ED6F2BA3AE89BB','','SUCCESS',1,0,'2005-02-16 17:40:17','2005-02-16 17:40:17',18.0,1,1354,5,999,5,NOW())', Error_code: 1062
050216 21:58:17 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'dbmaster-bin.000001' position 536730297

How to repeat:
Create the table given, and run inserts on it until it fails (like the insert from the error log).

I've also been able to recreate the problem another way:
-Create this table on the master db:
CREATE TABLE `ts` (
  `tsId` int(10) unsigned NOT NULL auto_increment,
  `val` varchar(75) default NULL,
  PRIMARY KEY  (`queryId`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
-Run this query against the master db many times (I wrote a simple bash script to do it):
INSERT INTO ts (val) VALUES ('blah');

The slave will eventually fail with the same error as above. Sometimes it takes a few minutes, sometimes several days.

Suggested fix:
I don't know.
[17 Feb 2005 7:48] Heikki Tuuri
Garth,

please examine the binlog with mysqlbinlog and find out what query inserted the duplicate value.

The binlog should contain statements like:

SET INSERT_ID=7657644;
INSERT INTO ... VALUES ...;

Does the binlog contain the same insert id for two inserts?

Regards,

Heikki
[17 Feb 2005 7:57] Heikki Tuuri
Garth,

also, show your my.cnf for the master and the slave.

Regards,

Heikki
[17 Feb 2005 8:07] Heikki Tuuri
Garth,

also check in the slave that the duplicate row indeed exists there. Otherwise, this could be a bug in the InnoDB duplicate key detection in the slave.

Regards,

Heikki
[17 Feb 2005 18:46] Garth Patil
Heikki,
My apologies for this. As I agonized over this problem someone in my office fessed up to making a write to the slave database. This is not a bug.
Garth
[17 Feb 2005 19:50] Sinisa Milivojevic
Not a bug, according to user.