MySQL Bugs: #51055: Replication failure on duplicate key + traditional SQL mode

Bug #51055	Replication failure on duplicate key + traditional SQL mode
Submitted:	10 Feb 2010 10:43	Modified:	21 Jun 2010 0:53
Reporter:	Philip Stoev	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Replication	Severity:	S3 (Non-critical)
Version:	mysql-5.1	OS:	Any
Assigned to:	Luis Soares	CPU Architecture:	Any

Description:
When attempting to execute a problematic statement under TRADITIONAL SQL mode, replication fails as follows:

100210 13:36:07 [ERROR] 121 when fixing table
100210 13:36:07 [Warning] Warning: Enabling keys got errno 121 on test.t1_base1_N, retrying
100210 13:36:07 [ERROR] Slave SQL: Error 'Duplicate entry '2' for key 'PRIMARY'' on query. Default database: 'test'. Query: 'INSERT INTO t1_base1_N SELECT * FROM test.table10_int_autoinc', Error_code: 1062
100210 13:36:07 [ERROR] Slave SQL: Fatal error: ... The slave SQL is stopped, leaving the current group of events unfinished with a non-transaction table changed. If the group consists solely of Row-based events, you can try restarting the slave with --slave-exec-mode=IDEMPOTENT, which ignores duplicate key, key not found, and similar errors (see documentation for details). Error_code: 1593

[philips@fedora10 mysql-test]$ ../extra/perror 121
OS error code 121:  Remote I/O error
MySQL error code 121: Duplicate key on write or update

How to repeat:
--source include/master-slave.inc
--disable_abort_on_error

CREATE TABLE t1 (
        pk integer auto_increment ,
        primary key (pk)
);

SET SESSION SQL_MODE='traditional';

INSERT INTO t1 (`pk`) VALUES (1), (1);

--sync_slave_with_master

Thank you for the bug report.

On mysql-5.1, the following error occurs:

100222 12:30:07 [ERROR] Slave SQL: Error 'Duplicate entry '1' for key 'PRIMARY'' on query. Default database: 'test'. Query: 'INSERT INTO t1 (`pk`) VALUES (1), (1)', Error_code: 1062

On next-mr, no errors whatsoever.

Please feel free to close the bug if this is the expected situation.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/101702

3352 Luis Soares	2010-02-26
      BUG#51055: Replication failure on duplicate key + traditional SQL
      mode
      
      When the master was executing in sql_mode='traditional', the
      error code (ER_DUP_ENTRY in the reported case) was not being set
      in the Query_log_event. Therefore, even if a failure was to be
      expected when replaying the statement on the slave, a failure
      would occur, because the Query_log_event was not transporting the
      expected error code, but 0 instead.
      
      This was because when the master was getting the error code to
      set it in the Query_log_event, the executing thread would be
      assumed to have been killed:
      THD::killed==THD::KILL_BAD_DATA. This would make the error code
      fetch routine not to check thd->main_da.sql_errno(), but instead
      the thd->killed value. What's more, is that the server would
      thd->killed value if thd->killed == THD::KILL_BAD_DATA and return
      0 instead. So this is a double inconsistency, as the we should
      not even check thd->killed but rather thd->main_da.sql_errno().
      
      We fix this by extending the condition used to choose whether to
      check the thd->main_da.sql_errno() or thd->killed, so that it
      takes into consideration the case when:
      thd->killed==THD::KILL_BAD_DATA.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/102266

3352 Luis Soares	2010-03-04
      BUG#51055: Replication failure on duplicate key + traditional SQL
      mode
      
      When the master was executing in sql_mode='traditional' (which
      implies that really_abort_on_warning returns TRUE - because of
      MODE_STRICT_ALL_TABLES), the error code (ER_DUP_ENTRY in the
      reported case) was not being set in the
      Query_log_event. Therefore, even if a failure was to be expected
      when replaying the statement on the slave, a failure would occur,
      because the Query_log_event was not transporting the expected
      error code, but 0 instead.
      
      This was because when the master was getting the error code to
      set it in the Query_log_event, the executing thread would be
      assumed to have been killed:
      THD::killed==THD::KILL_BAD_DATA. This would make the error code
      fetch routine not to check thd->main_da.sql_errno(), but instead
      the thd->killed value. What's more, is that the server would
      thd->killed value if thd->killed == THD::KILL_BAD_DATA and return
      0 instead. So this is a double inconsistency, as the we should
      not even check thd->killed but rather thd->main_da.sql_errno().
      
      We fix this by extending the condition used to choose whether to
      check the thd->main_da.sql_errno() or thd->killed, so that it
      takes into consideration the case when:
      thd->killed==THD::KILL_BAD_DATA.

Pushed into 5.5.4-m3 (revid:alik@sun.com-20100326080914-2pz8ns984e0spu03) (version source revid:alexey.kopytov@sun.com-20100320202342-3oapaq7r0t6qhexq) (merge vers: 5.5.3-m2) (pib:16)

Pushed into mysql-next-mr (revid:alik@sun.com-20100326081116-m3v4l34yhr43mtsv) (version source revid:alik@sun.com-20100325072612-4sds00ix8ajo1e84) (pib:16)

Pushed into 6.0.14-alpha (revid:alik@sun.com-20100326081944-qja07qklw1p2w7jb) (version source revid:alik@sun.com-20100325073410-4t4i9gu2u1pge7xb) (merge vers: 6.0.14-alpha) (pib:16)

Documented bugfix in the 5.5.4 and 6.0 14 changelogs as follows:

        If the master was using sql_mode='TRADITIONAL', duplicate key
        errors were not sent to the slave, which received 0 rather than
        the expected error code. This caused replication to fail even
        when such an error was expected.

Set NM - waiting for 5.1

5.5.4 changelog entry was moved to 5.5.5.

Pushed into 5.1.46 (revid:sergey.glukhov@sun.com-20100405111026-7kz1p8qlzglqgfmu) (version source revid:martin.hansson@sun.com-20100316162138-u9724fhm54cj3or0) (merge vers: 5.1.46) (pib:16)

Also documented fix in the 5.1.46 changelog.

Closed.

Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:martin.skold@mysql.com-20100609211156-tsac5qhw951miwtt) (merge vers: 5.1.46-ndb-6.2.19) (pib:16)

Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)