Bug #40251 Replication failure on RBR + Innodb + PARTITIONs
Submitted: 22 Oct 2008 13:56
Modified: 20 Apr 2009 17:14
Reporter: Philip Stoev
Status: Can't repeat
Category: MySQL Server: Row Based Replication (RBR)
Severity: S2 (Serious)
Version: 5.1-rpl
OS: Any
Assigned to: Alfranio Tavares Correia Junior
CPU Architecture: Any

[22 Oct 2008 13:56] Philip Stoev
Description:
When executing a concurrent transactional workload with simple insert/update/delete against a very simple partitioned table, row-based replication fails as follows:

Could not execute Update_rows event on table test.table10_innodb_key_pk_parts_2_int_autoinc; Can't find record in 'table10_innodb_key_pk_parts_2_int_autoinc', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master-bin.000001, end_log_pos 84535

In other test runs, the master and the slave diverge with some records only present on the slave.

Sometimes the non-partitioned table in the test is the one affected. The problem may not be related to partitioning per se, but to the fact that deadlocks between queries on partitioned tables are not resolved by the deadlock detector, but by a lock wait timeout.

How to repeat:
A test case will be uploaded shortly.
[22 Oct 2008 13:58] Philip Stoev
YY file

Attachment: bug40251.yy (application/octet-stream, text), 381 bytes.

[22 Oct 2008 13:59] Philip Stoev
ZZ file

Attachment: bug40251.zz (text/plain), 194 bytes.

[22 Oct 2008 14:27] Philip Stoev
To reproduce with the random query generator, clone a fresh copy of the mysql-test-extra-6.0 tree and then execute:

$ cd mysql-test-extra-6.0/mysql-test/gentest
$ perl runall.pl \
  --basedir=/path/to/5.1-rpl \
  --grammar=bug40251.yy \
  --gendata=bug40251.zz \
  --rpl_mode=row \
  --engine=Innodb \
  --mysqld=--innodb-lock-wait-timeout=1
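
After such a run, one way to check whether the failure occurred is to search the slave's error log for the message quoted in the description. The helper below is only a sketch: the function name find_rbr_key_not_found is hypothetical, the log path must be supplied by the caller, and matching Delete_rows in addition to Update_rows is an assumption (only Update_rows was reported here).

```shell
#!/bin/sh
# Hypothetical helper, not part of the random query generator.
# Prints error-log lines that report a row event failing with
# error 1032 (HA_ERR_KEY_NOT_FOUND), as quoted in this bug.
# Exits non-zero when no such line is found (grep's own status).
find_rbr_key_not_found() {
    log_file="$1"   # path to the slave error log (assumption: caller knows it)
    grep -E 'Could not execute (Update|Delete)_rows event.*Error_code: 1032' "$log_file"
}
```

It could be called as find_rbr_key_not_found /path/to/slave.err, where the path is a placeholder for wherever the test run writes the slave's error log.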
[22 Oct 2008 14:34] Valeriy Kravchuk
The bug is NOT repeatable with main mysql-5.1 tree.
[22 Oct 2008 14:36] Philip Stoev
Valeriy, can you please try 5.1-rpl as well?
[22 Oct 2008 16:17] Valeriy Kravchuk
With 5.1-rpl I've got:

Invalid option "--start-and-exit". I remember some bug report about that, but is there any workaround for your random query generator?
[22 Oct 2008 16:19] Philip Stoev
Oh, sorry about that -- please copy the entire mysql-test tree from 5.1-bzr to 5.1-rpl in order to make this work.
[23 Nov 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[23 Nov 2008 8:43] Philip Stoev
The -rpl tree now has a working --start-and-exit. Please try the original case as described in the bug. If it does not work, please copy the entire mysql-test tree from the -main tree and try again.

Thank you.
[5 Dec 2008 7:53] Sveta Smirnova
backtrace

Attachment: bug40251.txt (text/plain), 25.16 KiB.

[5 Dec 2008 7:53] Sveta Smirnova
Thank you for the feedback.

In my case, both the 5.1 and 5.1-rpl trees crash with the attached backtrace. Please let me know whether this happens in your case too, and whether it should be a separate bug report.
[6 Jan 2009 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[6 Jan 2009 7:48] Philip Stoev
The crash is bug 41543. This bug is about the replication failure.
[20 Apr 2009 17:14] Alfranio Tavares Correia Junior
Philip and I could not repeat it.

Closing as "cannot repeat".