Bug #25507 multi-row insert delayed + auto increment causes duplicate key entries on slave
Submitted: 9 Jan 2007 22:46 Modified: 21 Jun 2010 0:44
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:4.1-bk,5.0.34 OS:Any (*)
Assigned to: Guilhem Bichot CPU Architecture:Any
Tags: auto increment, bfsm_2007_01_18, duplicate key, insert delayed

[9 Jan 2007 22:46] Shane Bester
Description:
when using multi-row inserts into tables with auto-increment columns, replication breaks if more than 1 concurrent connection inserts into tables on the master.

the testcase.c spawns 2 threads, and causes the slave to give errors:

070110  0:40:44 [ERROR] Slave: Error 'Duplicate entry '75103' for key 1' on query. Default database: 'test'. Query: 'insert delayed into t1(name) values ('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h'),('i'),('j')', Error_code: 1062
070110  0:40:45 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'hti1-bin.000001' position 1397384

How to repeat:
compile and run testcase.c after changing host, user, port, etc.

Suggested fix:
.
[9 Jan 2007 22:47] MySQL Verification Team
testcase to spawn 2 threads inserting, will break the replication

Attachment: testcase.c (text/plain), 4.08 KiB.

[17 Jan 2007 5:47] MySQL Verification Team
setup replication with no special replication options, other than the bare minimum server-id and log-bin, and direct the testcase.c to point to the master server.

observe the errors on the slave after a few seconds of running the testcase.
[6 Feb 2007 14:25] Guilhem Bichot
So the problem is that when two concurrent multi-row INSERT DELAYED run, their rows can be intermixed in the delayed_insert system thread's queue. For example, considering CREATE TABLE t (a INT AUTO_INCREMENT PRIMARY KEY, b CHAR(1)),
if connection1 does INSERT DELAYED VALUES(NULL,"v"),(NULL,"x");
and at the same time connection2 does INSERT DELAYED VALUES(NULL,"y"),(NULL,"z");
then the queue may look like v,y,x,z, thus the table will contain:
1 v
2 y
3 x
4 z.
Which is wrong. Because the binlog will contain
{INSERT_ID=1, INSERT DELAYED VALUES(NULL,"v"),(NULL,"x");}
{INSERT_ID=2, INSERT DELAYED VALUES(NULL,"y"),(NULL,"z");}
(binlogging in INSERT DELAYED happens when the first row of a statement succeeds).
So, on slave, the table will contain
1 v
2 x
and the second statement will fail (duplicate key value "2").

SIMPLE EASY SOLUTION

the one which would also fix BUG#26116 (see that bug report).

LONG SOLUTION

Statement-based binlogging cannot be changed, so it's the queue which must be changed so that rows of a multi-row statement are always contiguous in the queue: add a mutex to the delayed_insert class; when calling write_delayed() in a multi-row INSERT DELAYED, make sure to lock the mutex for first row and unlock it for last row thus ensuring that the rows of the statement are contiguous.
Drawback: if the rows contain slow operations (like SLEEP(), GET_LOCK()...) then the mutex will be held for a long time preventing other threads from queuing their rows.
[9 Feb 2007 14:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/19619

ChangeSet@1.2411, 2007-02-09 16:48:07+01:00, guilhem@gbichot3.local +3 -0
  Fix for BUG#25507 "multi-row insert delayed + auto increment causes
  duplicate key entries on slave" (two concurrrent connections doing
  multi-row INSERT DELAYED to insert into an auto_increment column,
  caused replication slave to stop with "duplicate key error" (and
  binlog was wrong), and BUG#26116 "If multi-row INSERT
  DELAYED has errors, statement-based binlogging breaks" (the binlog
  was not accounting for all rows inserted, or slave could stop).
  The fix is that: in statement-based binlogging, a multi-row INSERT
  DELAYED is silently converted to a non-delayed INSERT.
  This is supposed to not affect many 5.1 users as in 5.1, the default
  binlog format is "mixed", which does not have the bug (the bug is
  only with binlog_format=STATEMENT).
  Bug will be fixed in 5.0 but I first submit a patch for 5.1 because:
  - 5.1 has the test framework (mysqlslap) needed to repeat the bug,
  and 5.0 does not (so, 5.0's patch will have no testcase)
  - the code fix of 5.1 is the more complex (5.0's code fix is the same
  after simplification because in 5.0 we only have statement-based
  binlogging).
  We should document how the system delayed_insert thread decides of
  its binlog format (which is not modified by this patch):
  this decision is taken when the thread is created
  and holds until it is terminated (is not affected by any later change
  via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog
  format of the connection which issues INSERT DELAYED (this binlog
  format does not affect how the row will be binlogged).
  If one wants to change the binlog format of its server with SET
  GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all
  delayed_insert threads terminate and thus new threads are created,
  taking into account the new format.
  Because of this, this patch allows itself to test the global
  binlog_format to decide if INSERT DELAYED should be non-delayed,
  and to test it without its mutex LOCK_global_system_variables
  (which is interesting, as the condition is rarely true in 5.1 so
  better avoid the mutex)
[14 Feb 2007 17:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/19900

ChangeSet@1.2411, 2007-02-14 18:47:39+01:00, guilhem@gbichot3.local +6 -0
  Fix for BUG#25507 "multi-row insert delayed + auto increment causes
  duplicate key entries on slave" (two concurrrent connections doing
  multi-row INSERT DELAYED to insert into an auto_increment column,
  caused replication slave to stop with "duplicate key error" (and
  binlog was wrong), and BUG#26116 "If multi-row INSERT
  DELAYED has errors, statement-based binlogging breaks" (the binlog
  was not accounting for all rows inserted, or slave could stop).
  The fix is that: in statement-based binlogging, a multi-row INSERT
  DELAYED is silently converted to a non-delayed INSERT.
  This is supposed to not affect many 5.1 users as in 5.1, the default
  binlog format is "mixed", which does not have the bug (the bug is
  only with binlog_format=STATEMENT).
  Bug will be fixed in 5.0 but I first submit a patch for 5.1 because:
  - 5.1 has the test framework (mysqlslap) needed to repeat the bug,
  and 5.0 does not (so, 5.0's patch will have no testcase)
  - the code fix of 5.1 is the more complex (5.0's code fix is the same
  after simplification because in 5.0 we only have statement-based
  binlogging).
  We should document how the system delayed_insert thread decides of
  its binlog format (which is not modified by this patch):
  this decision is taken when the thread is created
  and holds until it is terminated (is not affected by any later change
  via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog
  format of the connection which issues INSERT DELAYED (this binlog
  format does not affect how the row will be binlogged).
  If one wants to change the binlog format of its server with SET
  GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all
  delayed_insert threads terminate and thus new threads are created,
  taking into account the new format.
[15 Feb 2007 13:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/19943

ChangeSet@1.2394, 2007-02-15 15:39:03+01:00, guilhem@gbichot3.local +5 -0
  Fix for BUG#25507 "multi-row insert delayed + auto increment causes
  duplicate key entries on slave" (two concurrrent connections doing
  multi-row INSERT DELAYED to insert into an auto_increment column,
  caused replication slave to stop with "duplicate key error" (and
  binlog was wrong)), and BUG#26116 "If multi-row INSERT
  DELAYED has errors, statement-based binlogging breaks" (the binlog
  was not accounting for all rows inserted, or slave could stop).
  The fix is that: if (statement-based) binlogging is on, a multi-row
  INSERT DELAYED is silently converted to a non-delayed INSERT.
  Note: it is not possible to test BUG#25507 in 5.0 (requires mysqlslap),
  so it is tested only in the changeset for 5.1. However, BUG#26116
  is tested here, and the fix for BUG#25507 is the same code change.
[15 Feb 2007 18:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/19958

ChangeSet@1.2412, 2007-02-15 20:28:58+01:00, guilhem@gbichot3.local +13 -0
  Manual merge from 5.0-rpl, of fixes for:
  1)
    BUG#25507 "multi-row insert delayed + auto increment causes
    duplicate key entries on slave" (two concurrrent connections doing
    multi-row INSERT DELAYED to insert into an auto_increment column,
    caused replication slave to stop with "duplicate key error" (and
    binlog was wrong), and BUG#26116 "If multi-row INSERT
    DELAYED has errors, statement-based binlogging breaks" (the binlog
    was not accounting for all rows inserted, or slave could stop).
    The fix is that: in statement-based binlogging, a multi-row INSERT
    DELAYED is silently converted to a non-delayed INSERT.
    This is supposed to not affect many 5.1 users as in 5.1, the default
    binlog format is "mixed", which does not have the bug (the bug is
    only with binlog_format=STATEMENT).
    We should document how the system delayed_insert thread decides of
    its binlog format (which is not modified by this patch):
    this decision is taken when the thread is created
    and holds until it is terminated (is not affected by any later change
    via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog
    format of the connection which issues INSERT DELAYED (this binlog
    format does not affect how the row will be binlogged).
    If one wants to change the binlog format of its server with SET
    GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all
    delayed_insert threads terminate and thus new threads are created,
    taking into account the new format.
  2)
    BUG#24432
    "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values".
    When in an INSERT ON DUPLICATE KEY UPDATE, using
    an autoincrement column, we inserted some autogenerated values and
    also updated some rows, some autogenerated values were not used
    (for example, even if 10 was the largest autoinc value in the table
    at the start of the statement, 12 could be the first autogenerated
    value inserted by the statement, instead of 11). One autogenerated
    value was lost per updated row. Led to exhausting the range of the
    autoincrement column faster.
    Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12.
    This bug breaks replication from a pre-5.0.24/pre-5.1.12 master.
    But the present bugfix, as it makes INSERT ON DUP KEY UPDATE
    behave like pre-5.0.24/pre-5.1.12, breaks replication from a
    [5.0.24,5.0.34]/[5.1.12,5.1.15]
    master to a fixed (5.0.36/5.1.16) slave! To warn users against this when
    they upgrade their slave, as agreed with the support team, we add
    code for a fixed slave to detect that it is connected to a buggy
    master in a situation (INSERT ON DUP KEY UPDATE into autoinc column)
    likely to break replication, in which case it cannot replicate so
    stops and prints a message to the slave's error log and to SHOW SLAVE
    STATUS.
    For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15]
    replication we cannot warn as master
    does not know the slave's version (but we always recommended to users
    to have slave at least as new as master).
    As agreed with support, I have asked for an alert to be put into
    the MySQL Network Monitoring and Advisory Service.
  3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets
    the changes from the main 5.1.
[15 Feb 2007 20:19] Guilhem Bichot
queued to 5.0-rpl, 5.1-rpl
[8 Mar 2007 8:25] Andrei Elkin
pushed to 5.0.38,5.1.17-beta
[9 Mar 2007 14:18] MC Brown
A note has been added tot he 5.0.38 and 5.1.17 changelogs.
[27 Aug 2007 14:19] [ name withheld ]
Any chance for a backport to 4.1 for this patch? We're experiencing it on 4.1.21 as well.
[28 Aug 2007 14:32] Guilhem Bichot
Hi,
unfortunately 4.1 is too old to receive this change. If you are ready to build 4.1 from source, you can backport the simple patch here:
http://lists.mysql.com/commits/19943
to your 4.1's source (sql_insert.cc is the file which matters) and it should do.
[14 Sep 2009 11:30] Abhishek Singh
I am using MySQL 5.1.34 on master and on slave also but still i am facing issue related with Multi-row insert on slave. It throws Duplicate entry error while on master have binlog-format=mixed and i am using Innodb on both the server.
[5 May 2010 15:03] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 2:54] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[28 May 2010 5:59] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:28] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:56] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 2:58] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[17 Jun 2010 12:02] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:44] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:29] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)