Bug #54201 "SET INSERT_ID" event must be ignored if corresponding event is ignored
Submitted: 3 Jun 2010 12:18    Modified: 27 Apr 2012 14:23
Reporter: Andrii Nikitin       Status: Closed
Category: MySQL Server: Replication    Severity: S2 (Serious)
Version: 5.1.47                OS: Any
Assigned to: Sven Sandberg     CPU Architecture: Any

[3 Jun 2010 12:18] Andrii Nikitin
Description:
The slave may get a "Duplicate Key" error under the following conditions:

1. Statement-based replication is in use.
2. A trigger that exists only on the slave inserts into a table with an AUTO_INCREMENT column (i.e. there is no such trigger on the master).
3. Partial replication is configured (e.g. replicate-ignore-table or a similar filtering option is set).

The following sequence in the binary log will lead to the 'Duplicate key' error:

1. SET INSERT_ID=10000;
2. insert into table_ignored_by_slave values(NULL);
3. update table_with_trigger_on_slave set column1=1;

If the slave ignores event #2 (because of a replicate-do* option), it should also ignore event #1. Otherwise the trigger on the slave picks up the INSERT_ID from event #1 and sets its auto-increment column to an unrelated value (most likely causing a "Duplicate Key" error).

How to repeat:
Master:

master ((none)) > use test
Database changed
master (test) > show tables;
Empty set (0.00 sec)

master (test) > create table t1(s varchar(10));
Query OK, 0 rows affected (0.00 sec)

master (test) > create table t2(id int auto_increment primary key) engine=myisam;
Query OK, 0 rows affected (0.00 sec)

Slave (configured to replicate only table test.t1, i.e. Replicate_Wild_Do_Table: test.t1):
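In 5.1 such a filter is set in the slave's option file; a minimal sketch (the option name matches the Replicate_Wild_Do_Table field shown in the slave status below):

[mysqld]
replicate-wild-do-table=test.t1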

slave1 ((none)) > create table test.slave_only(id int auto_increment primary key) engine=myisam;
Query OK, 0 rows affected (0.03 sec)

slave1 ((none)) > use test
Database changed

slave1 (test) > show tables;
+----------------+
| Tables_in_test |
+----------------+
| slave_only     | 
| t1             | 
+----------------+
2 rows in set (0.01 sec)

slave1 (test) > insert into slave_only values(NULL);
Query OK, 1 row affected (0.00 sec)

slave1 (test) > show create table slave_only\G
*************************** 1. row ***************************
       Table: slave_only
Create Table: CREATE TABLE `slave_only` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=2 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

slave1 (test) > create trigger t1_update after update on t1 for each row insert into slave_only values(NULL);
Query OK, 0 rows affected (0.00 sec)

Master:

master (test) > insert into t2 values(NULL);
Query OK, 1 row affected (0.00 sec)

master (test) > insert into t1 values('s');
Query OK, 1 row affected (0.00 sec)

master (test) > update t1 set s='s1';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

Slave gets 'Duplicate Key' error:

slave1 [localhost] {msandbox} (test) > show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: msandbox
                  Master_Port: 23552
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 616
               Relay_Log_File: mysql_sandbox23553-relay-bin.000011
                Relay_Log_Pos: 678
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: test.t1
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1062
                   Last_Error: Error 'Duplicate entry '1' for key 'PRIMARY'' on query. Default database: 'test'. Query: 'update t1 set s='s1''
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 533
              Relay_Log_Space: 929
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1062
               Last_SQL_Error: Error 'Duplicate entry '1' for key 'PRIMARY'' on query. Default database: 'test'. Query: 'update t1 set s='s1''
1 row in set (0.00 sec)

The binlog contains the following events (note that the Intvar event at position 326 should be ignored, because the corresponding statement is ignored by the slave):

# at 199
#100603  9:55:48 server id 1  end_log_pos 326 	Query	thread_id=5	exec_time=0	error_code=0
SET TIMESTAMP=1275551748/*!*/;
create table t2(id int auto_increment primary key) engine=myisam
/*!*/;
# at 326
#100603 10:01:42 server id 1  end_log_pos 354 	Intvar
SET INSERT_ID=1/*!*/;
# at 354
#100603 10:01:42 server id 1  end_log_pos 444 	Query	thread_id=5	exec_time=0	error_code=0
SET TIMESTAMP=1275552102/*!*/;
insert into t2 values(NULL)
/*!*/;
# at 444
#100603 10:02:05 server id 1  end_log_pos 527 	Query	thread_id=5	exec_time=0	error_code=0
SET TIMESTAMP=1275552125/*!*/;
insert into t1 values('s')
/*!*/;
# at 533
#100603 10:02:21 server id 1  end_log_pos 616 	Query	thread_id=5	exec_time=0	error_code=0
SET TIMESTAMP=1275552141/*!*/;
update t1 set s='s1'

Suggested fix:
1. Put the INSERT_ID value into the event to which it corresponds,
or
2. Reset INSERT_ID whenever the corresponding event is ignored.
[3 Jun 2010 12:19] Andrii Nikitin
Workaround:
INSERT IGNORE into a dummy table (with an AUTO_INCREMENT column) at the beginning of the trigger body. This will reset the pending INSERT_ID value.
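A minimal sketch of this workaround against the how-to-repeat setup above; the `dummy` table is illustrative, not from the report:

-- Dummy table whose only job is to consume any forced INSERT_ID.
CREATE TABLE dummy (id INT AUTO_INCREMENT PRIMARY KEY) ENGINE=MyISAM;

DROP TRIGGER t1_update;
DELIMITER //
CREATE TRIGGER t1_update AFTER UPDATE ON t1 FOR EACH ROW
BEGIN
  -- Consumes the INSERT_ID left over from a filtered-out statement;
  -- IGNORE swallows any duplicate-key error in the dummy table.
  INSERT IGNORE INTO dummy VALUES (NULL);
  INSERT INTO slave_only VALUES (NULL);
END//
DELIMITER ;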
[29 Jun 2010 13:44] Andrei Elkin
Sven, I'd like to follow up on the discussion we held today.

In my opinion, a solution should follow a natural principle, which is:

in the case of a filtered-out main query (2), the SQL thread execution context shall remain as it was before the {1,2} group was executed.
That is, as if the group had never been attempted (I withdrew the more extreme option of leaving the context as if the group had executed successfully).

Now to its implementation. First of all, the immediate issue with the auto-inc Intvar event is possible because the filtered-out Query (2) did not perform the auto-inc context cleanup (Query_log_event::do_apply_event was skipped). That suggests deploying that part of the method onto the positive filtering branch.
There are two more related subcases:
(1b) the Rand event
(1c) the User var event

While there should be no difficulty with (1b), resetting the random seeds at the end of the positive filtering branch, (1c) might require storing the old value of the user variable so that it can be restored once the main query (2) has been found to be filtered out.
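For illustration, the analogous problematic binlog sequences would look roughly like this (table name follows the pattern from the original description; the seed and variable values are made up):

-- (1b) Rand event: seeds are set for a statement the slave filters out
SET @@RAND_SEED1=100, @@RAND_SEED2=200;
INSERT INTO table_ignored_by_slave VALUES (RAND());

-- (1c) User var event: a variable is set for a statement the slave filters out
SET @`v`:=42;
INSERT INTO table_ignored_by_slave VALUES (@`v`);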
[29 Jun 2010 15:43] Andrei Elkin
More explicitly,
`as if ... was not tried executing' means: as if the group had been skipped via
@@global.sql_slave_skip_counter.
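For reference, skipping a whole event group on the slave via the skip counter looks like this:

STOP SLAVE SQL_THREAD;
-- Skip the next event group from the relay log.
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE SQL_THREAD;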
[27 Apr 2012 14:23] Jon Stephens
Fixed in 5.1+. Documented as follows in the 5.1.64, 5.5.25, and 5.6.6 changelogs:

        Statements using AUTO_INCREMENT, LAST_INSERT_ID(), RAND(), or
        user variables could be applied in the wrong context on the
        slave when using statement-based replication and replication
        filtering server options (see
        <xref linkend="replication-rules"/>).

Closed.
[14 Aug 2013 9:02] Atul Mohindru
Hi Jonathan,

We are facing a similar kind of issue on MySQL 5.1.70. We have a setup of one master and two slaves (the second slave is a slave of the first slave).

We are propagating the auto-increment id using the LAST_INSERT_ID() function.

The table structures are as follows:
Table: log
+------------+------------+------+-----+---------------------+----------------+
| Field      | Type       | Null | Key | Default             | Extra          |
+------------+------------+------+-----+---------------------+----------------+
| log_id     | bigint(20) | NO   | PRI | NULL                | auto_increment |
| session_id | bigint(20) | NO   | MUL | 0                   |                |
| agent_id   | int(11)    | NO   | MUL | 0                   |                |
| logtime    | datetime   | NO   | MUL | 0000-00-00 00:00:00 |                |
| postdate   | datetime   | NO   |     | 0000-00-00 00:00:00 |                |
+------------+------------+------+-----+---------------------+----------------+

Table: away_reason
+----------------+------------+------+-----+---------+-------+
| Field          | Type       | Null | Key | Default | Extra |
+----------------+------------+------+-----+---------+-------+
| log_id         | bigint(20) | NO   | PRI | 0       |       |
| session_id     | int(11)    | NO   | MUL | 0       |       |
| away_reason_id | int(11)    | NO   |     | 0       |       |
+----------------+------------+------+-----+---------+-------+

INSERT INTO log (agent_id, logtime) VALUES (1, NOW());

INSERT INTO calllog_away_reason (calllog_id, away_reason_id, vcc_session_id) VALUES (LAST_INSERT_ID(), 1, 1);

Replication of the second query is failing on the second slave with the following error message: "Error 'Duplicate entry '0' for key 'PRIMARY'' on query"

Please suggest the solution.

Thanks,
Atul Mohindru
[14 Aug 2013 9:04] Atul Mohindru
Please ignore the second query in the above comment. The correct query is:
INSERT INTO away_reason (log_id, away_reason_id, vcc_session_id) VALUES (LAST_INSERT_ID(), 1, 1);
[26 Aug 2015 20:25] Carroll Miller
Rather than just documenting "don't do this", could you put in this fix?

In log_event.cc, there are if statements that check whether a transaction should be applied based on the filter criteria:
  if (rpl_filter->db_ok(thd->db().str))
  {
    …
    if (rpl_filter->is_on() && !rpl_filter->tables_ok(thd->db().str, &tables))
    {
      // TODO: this is a bug - this needs to be moved to the I/O thread
      if (net)
        skip_load_data_infile(net);
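        // Proposed addition (see the note below): clear the forced INSERT_ID here.
        // thd->auto_inc_intervals_forced.empty();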
    }
    else
    {
                …
    }
  }
  else
  {
    /*
      We will just ask the master to send us /dev/null if we do not
      want to load the data.
      TODO: this a bug - needs to be done in I/O thread
    */
    if (net)
      skip_load_data_infile(net);
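      // Proposed addition (see the note below): clear the forced INSERT_ID here.
      // thd->auto_inc_intervals_forced.empty();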
  }

The event prior to this might have been a SET INSERT_ID = xxxx.
If so, we have that value stored and ready to hit anything that uses an auto-increment column, such as a slave-side trigger on a different table. All that needs to be done is to add the following at the places marked by the TODO comments:
thd->auto_inc_intervals_forced.empty();

That would clear out the forced auto_increment value, and it should be cleared because we have decided to skip the next statement, which was the reason it was set in the first place.