Description:
I encounter a duplicate entry on key 1 when replication is running, and this is happening because slaves 'jump' one or two auto_increment values at a time, when they should not.
How to repeat:
I have a table, named ACCOUNTS, with a structure like this :
CREATE TABLE `ACCOUNTS` (
`REFA` int(11) NOT NULL auto_increment,
`DATEEX datetime default NULL,
`REFB int(10) default NULL,
`REFC` int(10) default NULL,
`REFP` int(10) default NULL,
`PROD` varchar(100) NOT NULL default '',
`DATEO` datetime default NULL,
`CODEO` varchar(4) default NULL,
`REFO` varchar(12) default NULL,
`INSERTDATE` timestamp(14) NOT NULL,
PRIMARY KEY (`REFA`),
KEY `REFC` (`REFC`),
KEY `CODEO` (`CODEO`),
KEY `REFO` (`REFO`),
KEY `DATEO` (`DATEO`)
) TYPE=MyISAM;
The master table has more than 235000 records and it is working perfectly fine. I recently altered the table to destroy the primary key and rebuilt it using an ALTER TABLE with an ORDER BY option to order the records using DATEO. So my auto_increment key has no problem and there is no 'hole' or 'jump' between 2 keys.
There is only one way to insert datas in this table, and it's done every 15 minutes by a schedule VBS. This VBS is made on purpose to synchronise different tables, and sort of merge them in ACCOUNTS. This script works like this, and uses manually created temporary tables:
--------
1> I drop, if it exists, my temporary table, tTEMP
2> I create a temporary table, named tTEMP, with one field, indexed, name REFO. The table type is HEAP.
3> I insert into this table all the REFO from table ACCOUNT with CODEO in ('ZORD', 'XORD')
4> I drop, if exists, another temporary table, tTEMP2
5> I create tTEMP2 with a field name REFF, indexed. Table type is HEAP.
6> I fill tTEMP2 with and (INSERT INTO ... SELECT) that takes datas from another table, 'S_F' :
INSERT INTO tTEMP2 (REFF) SELECT DISTINCTROW S_F.REFFACTURE FROM S_F LEFT JOIN tTEMP AS A_C ON S_F.REFF = A_C.REFOPE WHERE S_F.DATEV >= DATE_SUB(CurDate(), INTERVAL 1 MONTH) AND S_F.REFB <> 5 AND S_F.DATEA Is NOT NULL AND (A_C.REFO IS NULL)
It's done like this not to insert into tTEMP2 values that are already in ACCOUNTS.
7> At this point, I have in tTEMP2 the REFF of things I want inserted in my ACCOUNTS table, and I do my final insert :
INSERT INTO ACCOUNTS (....) SELECT (.....) FROM ('a lot of inner joins of several tables datas are coming from').
------
As said above, it's working great on the master server.
Now I have set up 2 replication slaves, on 2 other servers.
- I stopped the master.
- I copied the datas from the master to the slaves
- I restarted the master
- I started the slaves with all the replication options and parameters I want, and it's working fine, until something awful happens.
Something like 2 hours after replication started and was working great, there is, at one time, a problem.
As I said, my VBS script is running every 15 minutes and thus worked about 7 or 8 times before the crash, inserting datas without any problem. But the last one created a 'jump' in the primary key (REFA) on both slaves, at the very same time, with the very same key.
See, on the master, it's like that in the datas :
REFA REFB REFC .....
237088 ABC 123
237087 DEF 456
237086 GHI 789
237085 KLM 012
237084 NOP 345
Where on slaves it's like this :
REFA REFB REFC .....
237089 ABC 123
237088 DEF 456
237086 GHI 789
237084 KLM 012
237083 NOP 345
And then, in the error log on both slaves, I have an error saying :
040715 4:17:46 Slave I/O thread: Failed reading log event, reconnecting to retry, log 'dataserver-bin.044' position 196161
ERROR: 1062 Duplicate entry '237088' for key 1
Strange because it shouldn't happen, and because it's not even the last record in the slave tables...
Please note that no inserts are done in the table but with the VBS, and that SELECT only are made. It's happening only with this table, and at random time after the slave is restarted. Slaves are only used for replication/backup machines, nothing else.
I'm using mySQL 4.0.13 on windows 2K server. Slaves and masters are same version.