Bug #31609 Not all RBR slave errors reported as errors
Submitted: 15 Oct 2007 13:50
Modified: 6 Feb 2008 12:57
Reporter: Lars Thalmann
Status: Closed
Category: MySQL Server: Row Based Replication (RBR)
Severity: S3 (Non-critical)
Version: 5.1
OS: Any
Assigned to: Andrei Elkin
CPU Architecture: Any

[15 Oct 2007 13:50] Lars Thalmann
Description:
Due to idempotency requirements, RBR does not report an error
on some "duplicate key" and "key does not exist" failures.

How to repeat:
Code inspection.

Suggested fix:
By default, RBR needs to report these as errors, with an option
to turn these checks off.
[17 Oct 2007 13:19] Lars Thalmann
See also BUG#31552.
[29 Oct 2007 12:40] Lars Thalmann
I had a talk with Tomas and we think the simplest solution is a
dynamic system variable:

  slave-rbr-mode=[idempotent | strict]  

with default 'strict' setting.
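
To illustrate the difference between the two proposed modes, here is a minimal Python sketch. It is not the server implementation: the table is modeled as a set of primary keys, and the names `apply_event`, `DuplicateKey`, and `KeyNotFound` are hypothetical, chosen only for this example.

```python
class DuplicateKey(Exception):
    """Raised in strict mode when an insert hits an existing key."""


class KeyNotFound(Exception):
    """Raised in strict mode when a delete targets a missing key."""


def apply_event(keys, op, key, mode="strict"):
    """Apply one row event ("insert" or "delete") to a set of primary keys.

    Assumption for this sketch: in "idempotent" mode a conflicting event is
    skipped without an error; in "strict" mode it is reported as an error.
    """
    if mode not in ("strict", "idempotent"):
        raise ValueError("unknown mode: %r" % (mode,))
    if op == "insert":
        if key in keys:
            if mode == "strict":
                raise DuplicateKey(key)
            return  # idempotent: the conflict is not reported
        keys.add(key)
    elif op == "delete":
        if key not in keys:
            if mode == "strict":
                raise KeyNotFound(key)
            return  # idempotent: the conflict is not reported
        keys.remove(key)
    else:
        raise ValueError("unknown operation: %r" % (op,))


table = {1}
apply_event(table, "insert", 1, mode="idempotent")  # skipped silently
apply_event(table, "delete", 7, mode="idempotent")  # skipped silently
```

With `mode="strict"` the same two calls raise `DuplicateKey` and `KeyNotFound` respectively, which is the default behavior proposed above.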

Benefits:

+ One can use this also for other storage engines when one wants idempotency.
+ Easy to test
+ Having a "mode" makes it possible for us to set a family of internal 
  options at once
+ It is easy to use
+ Adding new modes is easy

Drawback:

- Telco/cluster customers need to set the option when upgrading
  (But since it is dynamic, it should be easy to do).

For the future (later versions), Tomas and I think that cluster might
want to add a per-transaction flag indicating that the transaction
should be run idempotently.  Not all epochs need to be executed
idempotently for cluster.
[2 Nov 2007 19:16] Lars Thalmann
These two things need to be fixed before closing this bug.

Documentation
=============
We need to clarify in the manual the conditions under which idempotency
of the log is attainable and can be used:

1. Primary keys are needed to get idempotency.

2. Primary keys can't be changed.

Example 1:
----------
Consider a binlog that changes primary keys in this order:
(2->3, 1->2), and assume that the initial data is a single record 1.

Then running the binlog once gives one record, 2.  Running it a second
time gives a different record, 3.

Example 2:
----------
Consider a binlog that changes primary keys in this order:
(2->3, 1->2), and assume that the initial data is (1,2).

Then running the binlog once gives (2,3).  Running it a second time
gives (3,3) = (3), i.e. only one record is left.
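
Both examples can be reproduced with a small Python sketch. It assumes a table modeled as a set of primary keys, in-place key updates, and an apply step that ignores "key does not exist" and lets an update onto an existing key collapse the two rows into one. The function name `apply_pk_updates` is hypothetical.

```python
def apply_pk_updates(keys, updates):
    """Apply (old, new) primary key updates idempotently; return the new key set."""
    result = set(keys)
    for old, new in updates:
        if old not in result:
            continue  # "key does not exist": ignored
        result.remove(old)
        result.add(new)  # if `new` already exists, only one row remains
    return result


binlog = [(2, 3), (1, 2)]

# Example 1: initial data is the single record 1.
ex1_once = apply_pk_updates({1}, binlog)
ex1_twice = apply_pk_updates(ex1_once, binlog)
print(sorted(ex1_once), sorted(ex1_twice))  # [2] [3]

# Example 2: initial data is (1,2).
ex2_once = apply_pk_updates({1, 2}, binlog)
ex2_twice = apply_pk_updates(ex2_once, binlog)
print(sorted(ex2_once), sorted(ex2_twice))  # [2, 3] [3]
```

In both cases the second run changes the result of the first, so this binlog is not idempotent.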

Cluster 
=======
We need to verify that these conditions are fulfilled in NDB,
since NDB will have the idempotency flag enabled.
[2 Nov 2007 19:59] Lars Thalmann
Tomas has confirmed that primary key change is always logged as a
DELETE plus an INSERT.

Taking the 2nd example above, we then get:

First apply of binlog
---------------------
Assume two initial records: (1,2).

Assume two updates: (2->3, 1->2).

Then in binlog this becomes:

1. Delete 2.
2. Insert 3.
3. Delete 1.
4. Insert 2.

The result is then two records: (2,3).

Second apply of binlog
----------------------
Running this binlog on (2,3) gives:

1. Delete 2.  Ok.
2. Insert 3.  "Error: Duplicate key".  Ignored.
3. Delete 1.  "Error: Can't find record".  Ignored.
4. Insert 2.  Ok.

Result (2,3).

So all should work fine.  This should always work in the context we
work in, but anyone who has a formal proof that this always works
is encouraged to provide it. :)
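
The walk-through above can be checked with a short Python sketch. It is a check of this one example, not a proof. It assumes a table modeled as a set of primary keys and an apply step that ignores "duplicate key" and "can't find record" errors. The names `to_row_events` and `replay` are hypothetical.

```python
def to_row_events(updates):
    """Log each primary key change old->new as DELETE old followed by INSERT new."""
    events = []
    for old, new in updates:
        events.append(("delete", old))
        events.append(("insert", new))
    return events


def replay(keys, events):
    """Apply events idempotently and return the resulting key set."""
    result = set(keys)
    for op, key in events:
        if op == "delete":
            result.discard(key)  # missing key: "can't find record" is ignored
        elif op == "insert":
            result.add(key)  # existing key: "duplicate key" is ignored
        else:
            raise ValueError("unknown operation: %r" % (op,))
    return result


events = to_row_events([(2, 3), (1, 2)])
first = replay({1, 2}, events)
second = replay(first, events)
print(sorted(first), sorted(second))  # [2, 3] [2, 3]
```

Here the second apply leaves the data unchanged, unlike the in-place key updates of Example 2.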
[20 Nov 2007 16:04] Andrei Elkin
The patch is within changeset for Bug #31552.
[27 Nov 2007 10:50] Bugs System
Pushed into 5.1.23-rc
[27 Nov 2007 10:53] Bugs System
Pushed into 6.0.4-alpha
[29 Nov 2007 1:50] Trudy Pelzer
D1/I2
[5 Feb 2008 13:04] Bugs System
Pushed into 5.1.24-rc
[5 Feb 2008 13:08] Bugs System
Pushed into 6.0.5-alpha
[6 Feb 2008 12:57] Jon Stephens
Documented bugfix in 5.1.24/6.0.5 changelogs; documented new slave_exec_mode system variable; see http://lists.mysql.com/commits/41771 for details.
[6 Mar 2008 9:56] Jon Stephens
Also documented for 5.1.23-ndb-6.2.14.
[30 Mar 2008 20:34] Jon Stephens
Also documented for 5.1.23-ndb-6.3.11.