Bug #29549 | Endians: rpl_ndb_myisam2ndb,rpl_ndb_innodb2ndb and rpl_ndb_mix_innodb failed on | ||
---|---|---|---|
Submitted: | 4 Jul 2007 12:44 | Modified: | 28 Nov 2007 10:01 |
Reporter: | justin he | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Row Based Replication ( RBR ) | Severity: | S1 (Critical) |
Version: | 5.1.21-beta | OS: | Solaris |
Assigned to: | Mats Kindahl | CPU Architecture: | Any |
Tags: | mysql-5.1-new-rpl |
[4 Jul 2007 12:44]
justin he
[7 Jul 2007 10:55]
Sveta Smirnova
Thank you for the report. Verified as described.
[10 Jul 2007 7:21]
Guangbao Ni
Bug#28123 is a duplicate of this.
[10 Jul 2007 8:39]
Andrei Elkin
The problem stems from the fact that field::unpack deals with both "local" (read from the table) and "remote" (read from a replication event) instances. In the latter case endianness matters, and combining storage engines that make different assumptions about endianness leads to the bug.
[10 Jul 2007 11:41]
Lars Thalmann
Perhaps related to WL#3228.
[24 Aug 2007 8:42]
Tomas Ulin
proposed patch
Attachment: tmp2.patch (text/x-patch), 822 bytes.
[27 Aug 2007 0:39]
Guangbao Ni
After patching, rpl_ndb_mix_innodb still fails:
./mysql-test-run.pl --do-test=rpl_ndb_mix_innodb
=====================================
Attachment: rpl_ndb_mix_innodb.tar.gz (application/x-gzip, text), 23.68 KiB.
[27 Aug 2007 0:39]
Guangbao Ni
mysql-test-run: WARNING: Forcing kill of process 11629
rpl_ndb.rpl_ndb_mix_innodb [ fail ]
Errors are (from /export/home/ngb/target-5.1.22/mysql-test/var/log/mysqltest-time) :
mysqltest: In included file "./extra/rpl_tests/rpl_ndb_apply_status.test": At line 282: command "diff_files" failed with error 1
(the last lines may be the most important ones)
Aborting: rpl_ndb.rpl_ndb_mix_innodb failed in default mode. To continue, re-run with '--force'.
Stopping All Servers
mysql-test-run: WARNING: Forcing kill of process 11657
[27 Aug 2007 0:45]
Guangbao Ni
after patching, rpl_ndb_innodb2ndb fails, log attached
Attachment: rpl_ndb_innodb2ndb.tar.gz (application/x-gzip, text), 2.68 KiB.
[27 Aug 2007 13:33]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/33151 ChangeSet@1.2577, 2007-08-27 15:40:49+02:00, tomas@whalegate.ndb.mysql.com +2 -0 Bug#29549 rpl_ndb_myisam2ndb,rpl_ndb_innodb2ndb failed on Solaris for pack_length issue
[28 Aug 2007 1:15]
Guangbao Ni
=======================================================
TEST RESULT TIME (ms)
-------------------------------------------------------
mysql-test-run: WARNING: Forcing kill of process 22526
mysql-test-run: WARNING: Forcing kill of process 22534
rpl_ndb.rpl_ndb_mix_innodb [ fail ]
Errors are (from /export/home/ngb/target_solaris/mysql-test/var/log/mysqltest-time) :
mysqltest: In included file "./extra/rpl_tests/rpl_ndb_apply_status.test": At line 282: command "diff_files" failed with error 1
(the last lines may be the most important ones)
Aborting: rpl_ndb.rpl_ndb_mix_innodb failed in default mode. To continue, re-run with '--force'.
Stopping All Servers
mysql-test-run: WARNING: Forcing kill of process 22550

From the diff_files results, it seems that DATETIME and some other types do not match, so I think this is related to Bug#30024, Bug#30133 and Bug#30134.
[28 Aug 2007 1:20]
Guangbao Ni
the two different result files
Attachment: diff_files.tar.gz (application/x-gzip, text), 6.27 KiB.
[28 Aug 2007 5:35]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/33192 ChangeSet@1.2582, 2007-08-28 07:42:43+02:00, tomas@whalegate.ndb.mysql.com +2 -0 Bug#29549 rpl_ndb_myisam2ndb,rpl_ndb_innodb2ndb failed on Solaris for pack_length issue
[28 Aug 2007 10:08]
Tomas Ulin
it seems endian is an even bigger problem than first discovered...
[29 Aug 2007 3:14]
Guangbao Ni
The following is the smallest test case for rpl_ndb_mix_innodb.test:

connection master;
--disable_warnings
DROP DATABASE IF EXISTS tpcb;
--enable_warnings
CREATE DATABASE tpcb;
CREATE TABLE tpcb.history (id MEDIUMINT NOT NULL AUTO_INCREMENT,aid INT, tid INT, bid INT, amount DECIMAL(10,2), tdate DATETIME, teller CHAR(20), uuidf LONGBLOB, filler CHAR(80),PRIMARY KEY (id));
# Switch tables on slave to use NDB
--sync_slave_with_master
USE tpcb;
ALTER TABLE history ENGINE NDB;
# Load DB tpcb and run some transactions
connection master;
USE tpcb;
INSERT INTO tpcb.history VALUES(NULL,77,6,10,'1.00', NOW(), USER(), UUID(),'completed trans');
--sync_slave_with_master
--echo *** DUMP MASTER & SLAVE FOR COMPARE ********
--exec $MYSQL_DUMP -n -t --compact --order-by-primary --skip-extended-insert tpcb history > $MYSQLTEST_VARDIR/tmp/master_apply_status.sql
--exec $MYSQL_DUMP_SLAVE -n -t --compact --order-by-primary --skip-extended-insert tpcb history > $MYSQLTEST_VARDIR/tmp/slave_apply_status.sql
--echo ****** Do dumps compare ************
diff_files $MYSQLTEST_VARDIR/tmp/master_apply_status.sql $MYSQLTEST_VARDIR/tmp/slave_apply_status.sql;
## Note: The files should only get removed, if the above diff succeeds.
--exec rm $MYSQLTEST_VARDIR/tmp/master_apply_status.sql
--exec rm $MYSQLTEST_VARDIR/tmp/slave_apply_status.sql
# End of 5.1 Test

The conclusions are as follows:
1. If the master and the slave use the same engine, there is no problem.
2. If NOW(), USER() and UUID() are replaced by concrete values, there is no problem.
So the NOW(), USER() and UUID() functions affect the result between master (engine=innodb) and slave (engine=ndb).
[29 Aug 2007 7:38]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/33272 ChangeSet@1.2589, 2007-08-29 09:44:37+02:00, tomas@whalegate.ndb.mysql.com +2 -0 Bug#29549 rpl_ndb_myisam2ndb,rpl_ndb_innodb2ndb failed on Solaris for pack_length issue - reverting patch as there were unknown side effects that we do not have time to follow up on just now
[29 Aug 2007 11:25]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/33294 ChangeSet@1.2556, 2007-08-29 13:24:24+02:00, mats@kindahl-laptop.dnsalias.net +1 -0 BUG#29549 (rpl_ndb_circular.test and rpl_ndb_log.test fail): The fields slave_running, io_thd, and sql_thread are guarded by an associated run_lock. A read of these fields was not guarded inside terminate_slave_threads(), which caused an assertion to fire and potentially can cause a segmentation fault in some rare cases. This patch moves the reading of the above variables to occur inside terminate_slave_thread() guarded by the associated run_lock.
[4 Sep 2007 11:27]
Lars Thalmann
See also BUG#24231.
[4 Sep 2007 12:29]
Rafal Somla
Note that the last patch belongs to BUG#29968 and landed here by mistake.
[4 Sep 2007 12:45]
Rafal Somla
As noted by Tomas [29 Aug 9:50] this problem is not specific to blobs. Here is a simple test which shows problems in replicating other types of fields:
-------------------------------------------------------------------------
--source include/have_ndb.inc
--source include/have_innodb.inc
--source include/have_binlog_format_mixed_or_row.inc
--source include/master-slave.inc
connection master;
SET storage_engine=ndb;
connection slave;
SET storage_engine=myisam;
connection master;
CREATE TABLE t1 (a INT, b FLOAT);
SHOW CREATE TABLE t1;
sync_slave_with_master;
SHOW CREATE TABLE t1;
connection master;
insert into t1 values (1,3.333);
select * from t1;
sync_slave_with_master;
select * from t1;
---------------------------------------------------------------------------
If this test is run on a big-endian machine (Solaris), wrong values are inserted into t1 on the slave, because the myisam driver assumes values are written little-endian but receives them saved big-endian. Here is a simpler version of the test which doesn't require a replication setup:
---------------------------------------------------------------------------
CREATE TABLE t1 (a INT, b FLOAT);
SHOW CREATE TABLE t1;
#794 11:12:25 server id 1 end_log_pos 394
# Position Timestamp Type Master ID Size Master Pos Flags
# 15f 79 21 dd 46 13 01 00 00 00 2b 00 00 00 8a 01 00 00 00 00
# 172 16 00 00 00 00 00 00 00 04 74 65 73 74 00 02 74 |.........test..t|
# 182 31 00 02 03 04 01 04 03 04 74 65 73 74 00 02 74 |1.......|
# Table_map: `test`.`t1` mapped to number 22
#794 11:12:25 server id 1 end_log_pos 453
# Position Timestamp Type Master ID Size Master Pos Flags
# 18a 79 21 dd 46 17 01 00 00 00 3b 00 00 00 c5 01 00 00 00 00
# 19d 10 00 00 00 00 00 00 00 05 1f e0 00 00 00 01 00 |................|
# 1ad 00 00 00 00 00 00 28 00 00 00 00 00 00 00 00 00 |................|
# 1bd 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |........|
# Write_rows: table id 16
#794 11:12:25 server id 1 end_log_pos 491
# Position Timestamp Type Master ID Size Master Pos Flags
# 1c5 79 21 dd 46 17 01 00 00 00 26 00 00 00 eb 01 00 00 10 00
# 1d8 16 00 00 00 00 00 01 00 02 03 fc 00 00 00 01 40 |................|
# 1e8 55 4f df |UO.|
# Write_rows: table id 22 flags: STMT_END_F
BINLOG '
eSHdRhMBAAAAKwAAAIoBAAAAABYAAAAAAAAABHRlc3QAAnQxAAIDBAEEAw==
eSHdRhcBAAAAOwAAAMUBAAAAABAAAAAAAAAABR/gAAAAAQAAAAAAAAAoAAAAAAAAAAAAAAAAAAAA
AAA=
eSHdRhcBAAAAJgAAAOsBAAAQABYAAAAAAAEAAgP8AAAAAUBVT98=
';
select * from t1;
---------------------------------------------------------------------------
The binlog entry is taken from the first test, which was run on Solaris. One can see that the values in the last Write_rows event are written big-endian (e.g. the bytes "00 00 00 01" starting at pos 1e3, encoding the integer value 1). This test can be run on any machine, as myisam is endian-independent and will always misinterpret these binlog entries.

For the same reasons, the new test rpl_ndb_2other (introduced in 5.1.22) fails with wrong values if run on a big-endian architecture.
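To make the misinterpretation concrete, here is a small Python sketch that decodes the two column values from the last Write_rows event quoted above, once as big-endian (how they were written) and once as little-endian (how a reader expecting low-byte-first data would take them). The byte strings are copied from the dump; the variable names are only illustrative, and this is not how the server itself decodes rows.

```python
import struct

# Bytes copied from the last Write_rows event in the dump above.
int_bytes = bytes.fromhex("00000001")    # column a, inserted as 1
float_bytes = bytes.fromhex("40554fdf")  # column b, inserted as 3.333

# Big-endian decoding recovers what was inserted on the master.
a_big = struct.unpack(">i", int_bytes)[0]
b_big = struct.unpack(">f", float_bytes)[0]

# Little-endian decoding is what a low-byte-first reader would see.
a_little = struct.unpack("<i", int_bytes)[0]
b_little = struct.unpack("<f", float_bytes)[0]

print("big-endian read:   ", a_big, round(b_big, 3))
print("little-endian read:", a_little, b_little)
```

Read big-endian, the bytes give back 1 and 3.333; read little-endian, the integer becomes 16777216 (0x01000000) and the float becomes an unrelated value.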
[4 Sep 2007 17:12]
Bugs System
Pushed into 5.1.23-beta
[6 Sep 2007 20:06]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/33863 ChangeSet@1.2571, 2007-09-06 22:05:03+02:00, mats@kindahl-laptop.dnsalias.net +4 -0 BUG#29549 (Endians: test failures on Solaris): Refactoring code to add parameter to pack() and unpack() functions with purpose of indicating if data should be packed in little-endian or native order. Using new functions to always pack data for binary log in little-endian order. Eliminating several versions of virtual pack() and unpack() functions in favor for one single virtual function which is overridden in subclasses.
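The idea in this commit message can be sketched roughly as follows. This is Python, not the server's C++ code, and pack_int32, unpack_int32 and the low_byte_first parameter are hypothetical stand-ins for the refactored pack()/unpack() interface: one function per direction, with a flag that selects little-endian or native order, and the binary-log path always passing the little-endian choice.

```python
import struct

def pack_int32(value: int, low_byte_first: bool) -> bytes:
    """Hypothetical stand-in for a 4-byte integer field's pack()."""
    # "<" forces little-endian; "=" uses the host's native byte order.
    fmt = "<i" if low_byte_first else "=i"
    return struct.pack(fmt, value)

def unpack_int32(data: bytes, low_byte_first: bool) -> int:
    """Hypothetical stand-in for the matching unpack()."""
    fmt = "<i" if low_byte_first else "=i"
    return struct.unpack(fmt, data)[0]

# Binary-log path: always little-endian, whatever the host is.
wire = pack_int32(1, low_byte_first=True)
print(wire.hex(), unpack_int32(wire, low_byte_first=True))

# Native path: the byte order depends on the host, but a round trip
# on the same host is still lossless.
native = pack_int32(-5, low_byte_first=False)
print(unpack_int32(native, low_byte_first=False))
```

With the flag set, the bytes written are the same on every host, which is what lets a slave on a different architecture or engine read them back correctly.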
[7 Sep 2007 9:58]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/33894 ChangeSet@1.2571, 2007-09-07 11:54:08+02:00, mats@kindahl-laptop.dnsalias.net +12 -0 BUG#29549 (Endians: test failures on Solaris): Refactoring code to add parameter to pack() and unpack() functions with purpose of indicating if data should be packed in little-endian or native order. Using new functions to always pack data for binary log in little-endian order. Eliminating several versions of virtual pack() and unpack() functions in favor for one single virtual function which is overridden in subclasses.
[14 Sep 2007 16:52]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/34287 ChangeSet@1.2571, 2007-09-14 18:51:59+02:00, mats@kindahl-laptop.dnsalias.net +20 -0 BUG#29549 (Endians: test failures on Solaris): Refactoring code to add parameter to pack() and unpack() functions with purpose of indicating if data should be packed in little-endian or native order. Using new functions to always pack data for binary log in little-endian order. The purpose of this refactoring is to allow proper implementation of endian-agnostic pack() and unpack() functions. Eliminating several versions of virtual pack() and unpack() functions in favor for one single virtual function which is overridden in subclasses. Implementing pack() and unpack() functions for some field types that packed data in native format regardless of the value of the st_table_share::db_low_byte_first flag.
[2 Oct 2007 15:37]
Chuck Bell
Patch approved.
[4 Oct 2007 19:41]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/34928 ChangeSet@1.2571, 2007-10-04 21:41:18+02:00, mats@kindahl-laptop.dnsalias.net +20 -0 BUG#29549 (Endians: test failures on Solaris): Refactoring code to add parameter to pack() and unpack() functions with purpose of indicating if data should be packed in little-endian or native order. Using new functions to always pack data for binary log in little-endian order. The purpose of this refactoring is to allow proper implementation of endian-agnostic pack() and unpack() functions. Eliminating several versions of virtual pack() and unpack() functions in favor for one single virtual function which is overridden in subclasses. Implementing pack() and unpack() functions for some field types that packed data in native format regardless of the value of the st_table_share::db_low_byte_first flag. The field types that were packed in native format regardless are: Field_real, Field_decimal, Field_tiny, Field_short, Field_medium, Field_long, and Field_longlong. Before the patch, row-based logging wrote the rows incorrectly on big-endian machines where the storage engine defined its own low_byte_first() to be FALSE on big-endian machines (the default is TRUE), while little-endian machines wrote the fields in correct order. The only known storage engine that does this is NDB. In effect, this means that row-based replication from or to a big-endian machine where the table was using NDB as storage engine failed if the other engine was either non-NDB or on a little-endian machine. With this patch, row-based logging is now always done in little-endian order, while ORDER BY uses the native order if the storage engine defines low_byte_first() to return FALSE for big-endian machines.
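The failure condition described in this commit message can be checked with a deliberately simplified model. The sketch below is Python and assumes only what the message states: before the patch, NDB on a big-endian machine logged rows in native (big-endian) order while every other combination logged little-endian, and replication works only when both ends agree. The function names and the "ndb"/"other" labels are illustrative, not server identifiers.

```python
from itertools import product

def log_order_before_patch(engine: str, machine: str) -> str:
    # Simplified model: only NDB on a big-endian host used native order.
    return "big" if engine == "ndb" and machine == "big" else "little"

def log_order_after_patch(engine: str, machine: str) -> str:
    # With the patch, row-based logging is always little-endian.
    return "little"

def replication_ok(order_fn, master, slave) -> bool:
    # Rows replicate correctly only if both ends assume the same order.
    return order_fn(*master) == order_fn(*slave)

ends = list(product(["ndb", "other"], ["big", "little"]))

broken_before = [(m, s) for m, s in product(ends, ends)
                 if not replication_ok(log_order_before_patch, m, s)]
broken_after = [(m, s) for m, s in product(ends, ends)
                if not replication_ok(log_order_after_patch, m, s)]

for master, slave in broken_before:
    print("broken before patch:", master, "->", slave)
print("broken after patch:", broken_after)
```

In this model the broken pairs are exactly those where one end is NDB on a big-endian machine and the other end is anything else, which matches the "from or to" wording of the message.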
[5 Oct 2007 16:16]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/34993 ChangeSet@1.2571, 2007-10-05 18:16:11+02:00, mats@kindahl-laptop.dnsalias.net +20 -0 BUG#29549 (Endians: test failures on Solaris): Refactoring code to add parameter to pack() and unpack() functions with purpose of indicating if data should be packed in little-endian or native order. Using new functions to always pack data for binary log in little-endian order. The purpose of this refactoring is to allow proper implementation of endian-agnostic pack() and unpack() functions. Eliminating several versions of virtual pack() and unpack() functions in favor for one single virtual function which is overridden in subclasses. Implementing pack() and unpack() functions for some field types that packed data in native format regardless of the value of the st_table_share::db_low_byte_first flag. The field types that were packed in native format regardless are: Field_real, Field_decimal, Field_tiny, Field_short, Field_medium, Field_long, and Field_longlong. Before the patch, row-based logging wrote the rows incorrectly on big-endian machines where the storage engine defined its own low_byte_first() to be FALSE on big-endian machines (the default is TRUE), while little-endian machines wrote the fields in correct order. The only known storage engine that does this is NDB. In effect, this means that row-based replication from or to a big-endian machine where the table was using NDB as storage engine failed if the other engine was either non-NDB or on a little-endian machine. With this patch, row-based logging is now always done in little-endian order, while ORDER BY uses the native order if the storage engine defines low_byte_first() to return FALSE for big-endian machines.
[9 Oct 2007 14:08]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/35199 ChangeSet@1.2572, 2007-10-09 16:08:00+02:00, mats@kindahl-laptop.dnsalias.net +3 -0 BUG#29549 (Endians: rpl_ndb_myisam2ndb,rpl_ndb_innodb2ndb and rpl_ndb_mix_innodb failed on): Adding Field::max_data_length() to give the maximum number of bytes that Field::pack() will write.
[11 Oct 2007 16:18]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/35386 ChangeSet@1.2571, 2007-10-11 18:18:05+02:00, mats@kindahl-laptop.dnsalias.net +20 -0 BUG#29549 (Endians: test failures on Solaris): Refactoring code to add parameter to pack() and unpack() functions with purpose of indicating if data should be packed in little-endian or native order. Using new functions to always pack data for binary log in little-endian order. The purpose of this refactoring is to allow proper implementation of endian-agnostic pack() and unpack() functions. Eliminating several versions of virtual pack() and unpack() functions in favor for one single virtual function which is overridden in subclasses. Implementing pack() and unpack() functions for some field types that packed data in native format regardless of the value of the st_table_share::db_low_byte_first flag. The field types that were packed in native format regardless are: Field_real, Field_decimal, Field_tiny, Field_short, Field_medium, Field_long, Field_longlong, and Field_blob. Before the patch, row-based logging wrote the rows incorrectly on big-endian machines where the storage engine defined its own low_byte_first() to be FALSE on big-endian machines (the default is TRUE), while little-endian machines wrote the fields in correct order. The only known storage engine that does this is NDB. In effect, this means that row-based replication from or to a big-endian machine where the table was using NDB as storage engine failed if the other engine was either non-NDB or on a little-endian machine. With this patch, row-based logging is now always done in little-endian order, while ORDER BY uses the native order if the storage engine defines low_byte_first() to return FALSE for big-endian machines. 
In addition, the max_data_length() function available in Field_blob was generalized to the entire Field hierarchy to give the maximum number of bytes that Field::pack() will write.
[12 Oct 2007 16:22]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/35486 ChangeSet@1.2573, 2007-10-12 18:22:31+02:00, mats@kindahl-laptop.dnsalias.net +1 -0 BUG#29549 (Endians: rpl_ndb_myisam2ndb,rpl_ndb_innodb2ndb and rpl_ndb_mix_innodb failed on): Post-merge fixes. Setting write bit before calling Field::store() since the function asserts that the write bit has been set.
[27 Nov 2007 10:51]
Bugs System
Pushed into 5.1.23-rc
[27 Nov 2007 10:54]
Bugs System
Pushed into 6.0.4-alpha
[28 Nov 2007 10:01]
Jon Stephens
Documented bugfix in 5.1.23 and 6.0.4 changelogs. Mats, thank you for good comments/summary of issue.