Bug #28430 Failure in replication of innodb partitioned tables on row/mixed format
Submitted: 15 May 2007 1:56 Modified: 21 Jun 2010 0:42
Reporter: Omer Barnir (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.1.19 OS:Any
Assigned to: Alexey Botchkov CPU Architecture:Any

[15 May 2007 1:56] Omer Barnir
Description:
Missing updates observed on the slave when replicating innodb partoiopmed 
tables using rbr/mbr modes.
Details:
Tables defined as:
CREATE TABLE test.bykey_tbl(id MEDIUMINT NOT NULL AUTO_INCREMENT,
                           dt TIMESTAMP, user CHAR(255), uuidf LONGBLOB,
                           fkid MEDIUMINT, filler VARCHAR(255),
                           PRIMARY KEY(id)) ENGINE='innodb'
                                PARTITION BY KEY(id) partitions 5;

CREATE TABLE test.byrange_tbl(id MEDIUMINT NOT NULL AUTO_INCREMENT,
                           dt TIMESTAMP, user CHAR(255), uuidf LONGBLOB,
                           fkid MEDIUMINT, filler VARCHAR(255),
                           PRIMARY KEY(id)) ENGINE='innodb'
                                PARTITION BY RANGE(id)
                                SUBPARTITION BY hash(id) subpartitions 2
                                (PARTITION pa1 values less than (10),
                                 PARTITION pa2 values less than (20),
                                 PARTITION pa3 values less than (30),
                                 PARTITION pa4 values less than (40),
                                 PARTITION pa5 values less than (50),
                                 PARTITION pa6 values less than (60),
                                 PARTITION pa7 values less than (70),
                                 PARTITION pa8 values less than (80),
                                 PARTITION pa9 values less than (90),
                                 PARTITION pa10 values less than (100),
                                 PARTITION pa11 values less than MAXVALUE);
The tables are populated using stored procedures that populate 1000 rows and then remove every second line (see attached test case) to the end result is 
that each table has 500 lines after the call to the storered procedure.

However on the slave only SOME of the rows are deleted from each of the tables
so te end result is that the slave tables have MORE rows than than are expected.

Note:
 - No replication errors are reported and replication is reported as completed
   successfuly
 - The problem is not observed with non partitioned tables (see test case)
 - the problem is not observed if using myisam
 - The problem is not observed if the slave database has 'innodb' disabled
   i.e. the tables are created on the slave as myisam.
 - To see the differences download the attached rpl_test_db.tar.gz file

How to repeat:
download the attached test and result file and copy them to '/t' and '/r'.

To see the problem as described run:
perl ./mysql-test-run.pl --mysqld="--innodb" rpl_part_bug       

To see the problem does not exist if innodb is disabled on the slave run 
perl ./mysql-test-run.pl rpl_part_bug       

To see that the problem is not observed with statement based replication run
perl ./mysql-test-run.pl --mysqld="--binlog_format=statement" \
                         --mysqld="--innodb" rpl_part_bug 

To see the problem is not observed with MyISAM change the value of the $engine_type variable in the test case

Suggested fix:
Replicated tables should be identincal
[15 May 2007 1:59] Omer Barnir
example of the diffrences between master and slave

Attachment: rpl_test_db.tar.gz (application/x-gunzip, text), 38.62 KiB.

[15 May 2007 2:00] Omer Barnir
Test case for the above

Attachment: rpl_part_bug.test (application/octet-stream, text), 4.88 KiB.

[15 May 2007 2:02] Omer Barnir
Expected results for the above test case

Attachment: rpl_part_bug.result (application/octet-stream, text), 5.40 KiB.

[15 May 2007 13:31] Omer Barnir
Note, the number of rows can be controlled by changing the value of 'ins_count' in the stored procedures. 
For the 'byrange' table the problem shows with a number as small as 10
For the 'bykey' table the number needs to be at least 200 for the problem to show.
I have also noticed that all the 'deleted' entries exist in the relay logs as expected.
[16 May 2007 1:21] Omer Barnir
Correcxtion/clarification to the above:
show status on the slave reports:
Error#: 1032
Error in Delete_rows event: error during transaction execution on table test.pinsdel1_tbl
[16 May 2007 1:24] Omer Barnir
*** Please Ignore Last comment from  [16 May 3:21] ***
show status on the slave reports:
Error#: 1032
Error in Delete_rows event: error during transaction execution on table
test.bykey_tbl
[18 Jun 2007 11:33] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/28982

ChangeSet@1.2554, 2007-06-18 15:33:46+05:00, holyfoot@mysql.com +1 -0
  Bug #28430 Failiure in replication of innodb partitiitioned tables on row/mixed format.
  
  In the ha_partition::position() we didn't calculate the number
  of the partition of the record. We used m_last_part value
  instead, relying on that it is set in other place like
  previous call of ::write_row() or something.
  In replication, we do neither of these calls before ::position()
  Delete_rows_log_event::do_exec_row calls find_and_fetch_row
  there in the case of InnoDB partitions have HA_PRIMARY_KEY_REQUIRED_FOR_POSITION
  so use position() / rnd_pos() calls to fetch the record.
  Fixed by adding partition_id calculation to the ha_partition::position()
[25 Jun 2007 6:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29475

ChangeSet@1.2554, 2007-06-25 10:15:05+05:00, holyfoot@mysql.com +1 -0
  Bug #28430 Failiure in replication of innodb partitiitioned tables on row/mixed format.
  
  In the ha_partition::position() we didn't calculate the number
  of the partition of the record. We used m_last_part value
  instead, relying on that it is set in other place like
  previous call of ::write_row() or something.
  In replication, we do neither of these calls before ::position()
  Delete_rows_log_event::do_exec_row calls find_and_fetch_row
  there in the case of InnoDB partitions have HA_PRIMARY_KEY_REQUIRED_FOR_POSITION
  so use position() / rnd_pos() calls to fetch the record.
  Fixed by adding partition_id calculation to the ha_partition::position()
[26 Jun 2007 12:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29589

ChangeSet@1.2514, 2007-06-26 15:33:57+05:00, holyfoot@mysql.com +3 -0
  Bug #28430 Failure in replication of innodb partitioned tables on row/mixed format.
  
  In the ha_partition::position we don't calculate the number of
  the partition of the record. We use m_last_part_value instead relying on
  that it is set in other place like previous calls of ::write_row().
  In replication we do neither of these calls before ::position().
  Delete_row_log_event::do_exec_row calls find_and_fetch_row() where
  we used position() & rnd_pos() calls to find the record for the
  PARTITION/INNODB table as it posesses InnoDB table flags.
  Fixed by removing HA_PRIMARY_KEY_REQUIRED_FOR_POSITION flag from PARTITION
[26 Jun 2007 12:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29590

ChangeSet@1.2514, 2007-06-26 15:36:38+05:00, holyfoot@mysql.com +3 -0
  Bug #28430 Failure in replication of innodb partitioned tables on row/mixed format.
  
  In the ha_partition::position we don't calculate the number of
  the partition of the record. We use m_last_part_value instead relying on
  that it is set in other place like previous calls of ::write_row().
  In replication we do neither of these calls before ::position().
  Delete_row_log_event::do_exec_row calls find_and_fetch_row() where
  we used position() & rnd_pos() calls to find the record for the
  PARTITION/INNODB table as it posesses InnoDB table flags.
  Fixed by removing HA_PRIMARY_KEY_REQUIRED_FOR_POSITION flag from PARTITION
[28 Jun 2007 12:33] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29851

ChangeSet@1.2518, 2007-06-28 16:32:43+05:00, holyfoot@mysql.com +3 -0
  Bug #28430 Failure in replication of innodb partitioned tables on row/mixed format.
  
  In the ha_partition::position() we didn't calculate the number
  of the partition of the record. We used m_last_part value instead,
  relying on that it is set in other place like previous call of a method
  like ::write_row(). In replication we don't call any of these befor
  position(). Delete_rows_log_event::do_exec_row calls find_and_fetch_row.
  In case of InnoDB-based PARTITION table, we have HA_PRIMARY_KEY_REQUIRED_FOR_POSITION
  enabled, so use position() / rnd_pos() calls to fetch the record.
  
  Fixed by adding partition_id calculation to the ha_partition::position()
[28 Jun 2007 12:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29852

ChangeSet@1.2518, 2007-06-28 16:35:21+05:00, holyfoot@mysql.com +3 -0
  Bug #28430 Failure in replication of innodb partitioned tables on row/mixed format.
  
  In the ha_partition::position() we didn't calculate the number
  of the partition of the record. We used m_last_part value instead,
  relying on that it is set in other place like previous call of a method
  like ::write_row(). In replication we don't call any of these befor
  position(). Delete_rows_log_event::do_exec_row calls find_and_fetch_row.
  In case of InnoDB-based PARTITION table, we have HA_PRIMARY_KEY_REQUIRED_FOR_POSITION
  enabled, so use position() / rnd_pos() calls to fetch the record.
  
  Fixed by adding partition_id calculation to the ha_partition::position()
[1 Jul 2007 19:57] Bugs System
Pushed into 5.1.21-beta
[5 Jul 2007 13:46] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented bugfix in 5.1.21 changelog.
[14 Aug 2007 14:04] Omer Barnir
The problem is still observed with 5.1.21. Re opening the bug
[23 Aug 2007 10:10] Mikael Ronström
Given that this one comes back as a regression I would prefer if a new solution is tried
where the position call actually can trust that a read was done before and that therefore
the partition id is set already.

The only exception to the semantics of the position call is this call from replication. So I
would prefer if the replication code uses a separate handler call that is normally implemented
as position but for the partition handler translates into the new position call as the patch used.
I don't really like the idea of changing the semantics of the position call.
[23 Aug 2007 19:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/32992

ChangeSet@1.2571, 2007-08-23 23:34:48+05:00, holyfoot@mysql.com +3 -0
  Bug #28430 Failure in replication of innodb partitioned tables on row/mixed format.
  
  In the ha_partition::position() we didn't calculate the number
  of the partition of the record. We used m_last_part value instead,
  relying on that it is set in other place like previous call of a method
  like ::write_row(). In replication we don't call any of these befor
  position(). Delete_rows_log_event::do_exec_row calls find_and_fetch_row.
  In case of InnoDB-based PARTITION table, we have HA_PRIMARY_KEY_REQUIRED_FOR_POSITION
  enabled, so use position() / rnd_pos() calls to fetch the record.
  
  Fixed by adding partition_id calculation to the ha_partition::position()
[23 Aug 2007 20:36] Mikael Ronström
The patch is from what I can see exactly the same patch as before.
Also I don't really think it's a good idea to touch column_bitmap_signal,
this is called many times in a query.

Did you read my last comment?
[24 Aug 2007 7:20] Bugs System
Pushed into 5.1.22-beta
[24 Aug 2007 9:32] Jon Stephens
Hello, is this bug actually fixed or not? Thanks!
[24 Aug 2007 10:09] Mikael Ronström
Jon,
Since it is pushed it is fixed, but it is also reopened to fix it in another way.
Much like bug#18198 which was reopened 3 times :)
[5 Sep 2007 0:50] Omer Barnir
The bug is still not fixed and now the scenario described in the bug report causes the slave to crash at the point the where the test case is waiting for the slave to be sync'd with the master.

error displayed on the screen is:
mysqltest: At line 151: failed in 'select master_pos_wait('master-bin.000001', 945649)': 2006: MySQL server has gone away

Note line 151 is the executuion of "-sync_slave_with_master" command

The slave mysqld process is grashing after getting a signal 6 follwing InnoDB getting an "Assertion failure in thread 1106172848 in file row/row0sel.c line 4599"

With backtrace:
0x8222ab3
0xffffe410
0x400eedbb
0x846a415
0x83de537
0x83de77d
0x83de8d4
0x830ec51
0x82dc38d
0x834fa6f
0x40032297
0x4018437e

(see attached slave.err file for more information)

Resolving the stack trace includes the following information:
0x8222ab3 handle_segfault + 819
0xffffe410 _end + -142709184
0x400eedbb _end + 932018155
0x846a415 row_search_max_autoinc + 917
0x83de537 _ZN11ha_innobase31innobase_read_and_init_auto_incEPx + 247
0x83de77d _ZN11ha_innobase27innobase_get_auto_incrementEPy + 237
0x83de8d4 _ZN11ha_innobase10delete_rowEPKh + 260
0x830ec51 _ZN7handler13ha_delete_rowEPKh + 33
0x82dc38d _ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info + 701
0x834fa6f handle_slave_sql + 1727
0x40032297 _end + 931245255
0x4018437e _end + 932629934

Resolving the stack trace
[5 Sep 2007 0:52] Omer Barnir
mysqld slave.err file including crash information

Attachment: slave.err (application/octet-stream, text), 3.81 KiB.

[7 Sep 2007 14:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/33912

ChangeSet@1.2586, 2007-09-07 18:41:49+05:00, holyfoot@mysql.com +4 -0
  Bug #28430 Failure in replication of innodb partitioned tables on row/mixed format.
  
  In the ha_partition::position() we don't calculate the number
  of the partition of the record, but use m_last_part value instead,
  relying on that it's previously set by some other call like ::write_row().
  Delete_rows_log_event::do_exec_row() calls find_and_fetch_row(),
  where we used position() + rnd_pos() call for the InnoDB-based PARTITION-ed
  table as there HA_PRIMARY_KEY_REQUIRED_FOR_POSITION enabled.
  
  fixed by introducing new handler::rnd_pos_by_record() method to be
  used for random record-based positioning
[7 Sep 2007 17:35] Omer Barnir
hf has isolated the crash problem as un related to partitions but to deleting rows (see bug=30888). Also while investigating the issue found 30907 relating to inserts into innodb partitioned tables when using --innodb_autoinc_lock_mode=0
[8 Sep 2007 9:18] Mikael Ronström
I like this second solution much better
[12 Sep 2007 17:18] Paul DuBois
Noted in 5.1.22 changelog.

Replication of InnoDB partitioned tables could lose updates with
row-based or mixed replication format.

Report remains in Patch queued status.
[14 Sep 2007 7:45] Bugs System
Pushed into 5.1.23-beta
[20 Sep 2007 7:16] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Already documented; closed after discussing with Paul.
[5 Oct 2007 17:57] Bugs System
Pushed into 5.1.23-beta
[20 Nov 2007 7:46] Jon Stephens
Moved bugfix report to 5.1.23 changelog, given failure was still noted in 5.1.22 with subsequent pushes to 5.1.23.
[5 May 2010 15:09] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 2:58] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[28 May 2010 6:10] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:38] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 7:06] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 2:58] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[17 Jun 2010 12:15] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:02] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:43] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)