| Bug #38170 | rpl_variables failed on pushbuild: could not sync with master | ||
|---|---|---|---|
| Submitted: | 16 Jul 2008 13:34 | Modified: | 1 Oct 2008 14:23 |
| Reporter: | Sven Sandberg | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | Tests: Replication | Severity: | S7 (Test Cases) |
| Version: | 5.1-rpl | OS: | Any |
| Assigned to: | Sven Sandberg | CPU Architecture: | Any |
| Tags: | could not sync slave with master, pushbuild, rpl_variables, test failure | ||
[17 Jul 2008 14:34]
Sven Sandberg
The reason for the failure is that the test sets @@global.init_slave to some garbage value. The test executes START SLAVE before it modifies @@global.init_slave. This causes a race between the client thread that modifies @@global.init_slave and the slave SQL thread. With high probability, the old value of @@global.init_slave was used, which is ok. But the slave stops if the new (garbage) value of @@global.init_slave is used. In the test, the actual value of @@global.init_slave is insignificant, the only thing that matters is that it is different from the value on master. Therefore, we can fix the test case by setting @@global.init_slave to valid SQL.
[17 Jul 2008 16:22]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/49950 2703 Sven Sandberg 2008-07-17 BUG#38170: rpl_variables failed on pushbuild: could not sync with master Problem: the test set @@global.init_slave to garbage at a time which was not guaranteed to be after the time when the slave's SQL thread used it. That would cause the slave's SQL thread to stop in rare cases. Fix: The test does not care about the value of @@global.init_slave, except that it should be different on master and slave. Hence, we set @@global.init_slave to something that is valid SQL.
[18 Jul 2008 11:31]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/50023 2704 Sven Sandberg 2008-07-18 BUG#38178: rpl_loaddata_map fails sporadically in pushbuild Problem: master binlog has 'create table t1'. Master binlog was removed before slave could replicate it. In test's cleanup code, master did 'drop table t1', which caused slave sql thread to stop with an error since slave sql thread did not know about t1. Fix: t1 is just an auxiliary construction, only needed on master. Hence, we turn off binlogging before t1 is created, drop t1 as soon as we don't need it anymore, and then turn on binlogging again.
[18 Jul 2008 11:31]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/50024 2704 Sven Sandberg 2008-07-18 BUG#38178: rpl_loaddata_map fails sporadically in pushbuild Problem: master binlog has 'create table t1'. Master binlog was removed before slave could replicate it. In test's cleanup code, master did 'drop table t1', which caused slave sql thread to stop with an error since slave sql thread did not know about t1. Fix: t1 is just an auxiliary construction, only needed on master. Hence, we turn off binlogging before t1 is created, drop t1 as soon as we don't need it anymore, and then turn on binlogging again.
[14 Aug 2008 7:17]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/51600 2677 He Zhenxing 2008-08-14 [merge] Merge with mysql-5.1-rpl-testfixes
[15 Aug 2008 2:02]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/51688 2769 He Zhenxing 2008-08-15 [merge] Merge 6.0-rpl-testfixes -> 6.0
[19 Aug 2008 19:56]
Bugs System
Pushed into 6.0.7-alpha (revid:sven@mysql.com-20080717162659-b7p6joq1zu1255ov) (version source revid:sven@mysql.com-20080818175803-c1nutd5773r6b4gd) (pib:3)
[19 Aug 2008 20:30]
Paul DuBois
Test case changes. No changelog entry needed. Setting report to Patch Queued pending push of fix into 5.1.x.
[25 Aug 2008 21:05]
Chuck Bell
Released in 6.0.7
[27 Aug 2008 1:17]
Paul DuBois
Setting report to NDI pending push into 5.1.x.
[28 Aug 2008 20:17]
Bugs System
Pushed into 6.0.7-alpha (revid:cbell@mysql.com-20080822132131-uveo6wiuecy6m2b8) (version source revid:cbell@mysql.com-20080822132131-uveo6wiuecy6m2b8) (pib:3)
[29 Aug 2008 0:43]
Paul DuBois
Setting report to NDI pending push into 5.1.x.
[14 Sep 2008 1:24]
Bugs System
Pushed into 6.0.7-alpha (revid:sven@mysql.com-20080717162659-b7p6joq1zu1255ov) (version source revid:vvaintroub@mysql.com-20080804094710-jb2qpqxpf2ir2gf3) (pib:3)
[16 Sep 2008 4:25]
Paul DuBois
Setting report to NDI pending push into 5.1.x.

Description: rpl_variables failed with the following message: rpl.rpl_variables [ fail ] CURRENT_TEST: rpl.rpl_variables === SHOW MASTER STATUS === ---- 1. ---- File slave-bin.000001 Position 106 Binlog_Do_DB Binlog_Ignore_DB ========================== === SHOW SLAVE STATUS === ---- 1. ---- Slave_IO_State Waiting for master to send event Master_Host 127.0.0.1 Master_User root Master_Port 12040 Connect_Retry 1 Master_Log_File master-bin.000001 Read_Master_Log_Pos 34525 Relay_Log_File slave-relay-bin.000001 Relay_Log_Pos 4 Relay_Master_Log_File Slave_IO_Running Yes Slave_SQL_Running No Replicate_Do_DB Replicate_Ignore_DB Replicate_Do_Table Replicate_Ignore_Table Replicate_Wild_Do_Table Replicate_Wild_Ignore_Table Last_Errno 0 Last_Error Skip_Counter 0 Exec_Master_Log_Pos 0 Relay_Log_Space 34951 Until_Condition None Until_Log_File Until_Log_Pos 0 Master_SSL_Allowed No Master_SSL_CA_File Master_SSL_CA_Path Master_SSL_Cert Master_SSL_Cipher Master_SSL_Key Seconds_Behind_Master Master_SSL_Verify_Server_Cert No Last_IO_Errno 0 Last_IO_Error Last_SQL_Errno 0 Last_SQL_Error ========================= mysqltest: At line 696: could not sync with master ('select master_pos_wait('master-bin.000001', 34525)' returned NULL) The result from queries just before the failure was: < snip > 17 NULL 3 NULL 18 NULL 4 NULL 19 NULL 5 NULL 20 NULL 6 NULL 21 NULL NULL sv_SE 22 NULL NULL sv_FI 23 NULL NULL ar_TN 24 NULL NULL ar_IQ 25 NULL NULL 26 NULL NULL IGNORE_SPACE,NO_AUTO_CREATE_USER 27 NULL NULL NO_DIR_IN_CREATE,NO_AUTO_VALUE_ON_ZERO,NO_BACKSLASH_ESCAPES,NO_ENGINE_SUBSTITUTION 28 NULL NULL NO_KEY_OPTIONS,NO_TABLE_OPTIONS,NO_FIELD_OPTIONS 29 NULL 20 NULL 30 NULL 30 NULL 31 NULL NULL Bergsbrunna 32 NULL NULL Centrum Comparing tables master:test.tstmt and master:test.tproc Comparing tables master:test.tstmt and master:test.tfunc Comparing tables master:test.tstmt and master:test.ttrig Comparing tables master:test.tstmt and master:test.tprep More results from queries before failure can be found in /dev/shm/var-ps_row-121/6/log/rpl_variables.log Warnings from just before the error: Note 1051 Unknown table 't1' Note 1051 Unknown table 't2' Note 1051 Unknown table 't3' Note 1051 Unknown table 't4' Note 1051 Unknown table 't5' Note 1051 Unknown table 't6' Note 1051 Unknown table 't7' Note 1051 Unknown table 't8' - saving '/dev/shm/var-ps_row-121/6/log/rpl.rpl_variables/' to '/dev/shm/var-ps_row-121/log/rpl.rpl_variables/' Retrying test, attempt(2/3)... How to repeat: https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-5.1-rpl&order=21 , 'sles10-ia64-a' vanilla/ps_row and 'sles10-ia64-a' no innodb/n_mix Funny, it has not failed in any other push, see xref here: http://tinyurl.com/6dmlos However, the source diff for that push does not look like it could affect this test case. Note that the slave IO thread is running but slave SQL is not.