Bug #44830 | SLAVE START no longer results in error if RESTORE is running on master | ||
---|---|---|---|
Submitted: | 12 May 2009 17:04 | Modified: | 16 Sep 2009 15:48 |
Reporter: | Chuck Bell | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
Version: | 6.0.11 | OS: | Any |
Assigned to: | Libing Song | CPU Architecture: | Any |
[12 May 2009 17:04]
Chuck Bell
[12 May 2009 17:36]
Chuck Bell
Test is in mysql-6.0-backup tree. An alternative way to reproduce the problem is:
* Set up a master and a slave using --console but do not connect the slave.
* Run a backup (of any database).
* Use a debugger and set a breakpoint in the master server in kernel.cc @225 : res= context.do_restore(overwrite);
* Connect the slave while the master is paused at the breakpoint.
* Observe the START SLAVE completes without errors.
* Resume the restore on the master.
* Do anything to prompt the slave to fetch from the master.
* Observe error appears in slave's console but not in the slave client.
* Attempt SHOW ERRORS, etc. on the slave.
* Observe error does not show anywhere.
[12 May 2009 17:40]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/73852

2809 Chuck Bell 2009-05-12

BUG#44830 : SLAVE START no longer results in error if RESTORE is running on master

Disabled portions of rpl_backup test because the slave no longer returns an error when a restore is in progress. This is a change in the way the slaves connect to the master and must be fixed. It has broken the ability to block slave connections (via START SLAVE) while a restore is in progress.

modified:
* mysql-test/suite/rpl/r/rpl_backup_block.result
* mysql-test/suite/rpl/t/rpl_backup_block.cnf
* mysql-test/suite/rpl/t/rpl_backup_block.test
[12 May 2009 17:42]
Chuck Bell
Previous patch is to disable affected portions of the test.
[12 May 2009 17:51]
Chuck Bell
CORRECTION

An alternative way to reproduce the problem is:
* Set up a master and a slave using --console but do not connect the slave.
* Run a backup (of any database).
* Use a debugger and set a breakpoint in the master server in kernel.cc @225 : res= context.do_restore(overwrite);
* On the master, run the restore of the database backed up previously (use OVERWRITE).
* Allow code to break at the breakpoint.
* Connect the slave while the master is paused at the breakpoint.
* Observe the START SLAVE completes without errors.
* Resume the restore on the master.
* Do anything to prompt the slave to fetch from the master.
* Observe error appears in slave's console but not in the slave client.
* Attempt SHOW ERRORS, etc. on the slave.
* Observe error does not show anywhere.
[12 May 2009 18:01]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/73857

2809 Chuck Bell 2009-05-12

BUG#44830 : SLAVE START no longer results in error if RESTORE is running on master

Disabled portions of rpl_backup test because the slave no longer returns an error when a restore is in progress. This is a change in the way the slaves connect to the master and must be fixed. It has broken the ability to block slave connections (via START SLAVE) while a restore is in progress.

Previously, when a slave attempted to connect to a master that had a restore in progress, the START SLAVE command would fail and an error would be sent to the client. Now, the command succeeds and no error is sent to the client. Since this test relies on detecting the error, it fails when run against the latest code.

It is likely the mechanism for how the slave connects and/or the sequence of events for detecting errors has changed thereby causing the slave to delay the detection of being blocked by a restore run on the master.

modified:
* mysql-test/suite/rpl/r/rpl_backup_block.result
* mysql-test/suite/rpl/t/rpl_backup_block.cnf
* mysql-test/suite/rpl/t/rpl_backup_block.test
[20 Aug 2009 10:36]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/81150

2859 Li-Bing.Song@sun.com 2009-08-20

BUG#44830 SLAVE START no longer results in error if RESTORE is running on master

In fact, the START SLAVE command cannot return an error in this case. START SLAVE exits successfully as soon as the I/O thread and the SQL thread are started. It does not wait for the I/O thread to connect to the master. Until the I/O thread has connected to the master, the slave does not know the master's status, including whether a RESTORE command is running on it.

The I/O thread sends a binlog request to the master and then waits to receive something. The master receives the request and sends an ER_MASTER_BLOCKING_SLAVES error to the slave if a RESTORE command is running. I wrote code to report the error and exit when the slave receives an ER_MASTER_BLOCKING_SLAVES error.
[20 Aug 2009 10:36]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/81151

2859 Li-Bing.Song@sun.com 2009-08-19

BUG#44830 SLAVE START no longer results in error if RESTORE is running on master

In fact, the START SLAVE command cannot return an error in this case. START SLAVE exits successfully as soon as the I/O thread and the SQL thread are started. It does not wait for the I/O thread to connect to the master. Until the I/O thread has connected to the master, the slave does not know the master's status, including whether a RESTORE command is running on it.

The I/O thread sends a binlog request to the master and then waits to receive something. The master receives the request and sends an ER_MASTER_BLOCKING_SLAVES error to the slave if a RESTORE command is running. I wrote code to report the error and exit when the slave receives an ER_MASTER_BLOCKING_SLAVES error.
[3 Sep 2009 7:47]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/82281

2859 Li-Bing.Song@sun.com 2009-09-03

BUG#44830 SLAVE START no longer results in error if RESTORE is running on master

In fact, the START SLAVE command cannot return an error in this case. START SLAVE exits successfully as soon as the I/O thread and the SQL thread are started. It does not wait for the I/O thread to connect to the master. Until the I/O thread has connected to the master, the slave does not know the master's status, including whether a RESTORE command is running on it.

When a slave requests a binlog dump from a master on which a RESTORE command is running, the master sends an ER_MASTER_BLOCKING_SLAVES error to the slave and then closes the connection. The slave must report an error and then stop the I/O thread after it receives the error from the master. This patch adds code to report the error and then stop the I/O thread when the slave receives an ER_MASTER_BLOCKING_SLAVES error.
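The ordering described in this commit message can be hard to picture, so here is a rough Python simulation of it. This is only a sketch, not the server code: `MasterStub`, `SlaveStub`, `connect_gate`, `last_io_error` and `io_running` are hypothetical names invented for illustration, and the only identifier taken from the report is the error name ER_MASTER_BLOCKING_SLAVES. It assumes the I/O thread is a background thread that contacts the master only after START SLAVE has already returned.

```python
import threading


class MasterStub:
    """Hypothetical stand-in for a master; not a MySQL API."""

    def __init__(self, restore_running):
        self.restore_running = restore_running

    def request_binlog_dump(self):
        # The master rejects the dump request while RESTORE is running.
        if self.restore_running:
            return "ER_MASTER_BLOCKING_SLAVES"
        return "OK"


class SlaveStub:
    """Hypothetical stand-in for a slave with a background I/O thread."""

    def __init__(self, master):
        self.master = master
        self.last_io_error = None
        self.io_running = False
        # Gate that models the delay before the I/O thread reaches the master.
        self.connect_gate = threading.Event()
        self._io_thread = None

    def start_slave(self):
        # Start the I/O thread and return at once, without waiting for it
        # to connect, so no master-side error can be reported here.
        self.io_running = True
        self._io_thread = threading.Thread(target=self._io_loop)
        self._io_thread.start()
        return "OK"

    def _io_loop(self):
        self.connect_gate.wait()
        reply = self.master.request_binlog_dump()
        if reply == "ER_MASTER_BLOCKING_SLAVES":
            # Behaviour described by the patch: report the error, then stop.
            self.last_io_error = reply
            self.io_running = False

    def wait_for_io_thread(self):
        self._io_thread.join()


slave = SlaveStub(MasterStub(restore_running=True))
result = slave.start_slave()
error_at_return = slave.last_io_error  # nothing has been reported yet
slave.connect_gate.set()               # let the I/O thread reach the master
slave.wait_for_io_thread()
print(result, error_at_return, slave.last_io_error, slave.io_running)
```

In this model `start_slave()` has no way to fail because of the restore, which is the point made above: the error can only surface later, through whatever state the I/O thread records before it stops.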
[5 Sep 2009 9:11]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/82523

2859 Li-Bing.Song@sun.com 2009-09-05

BUG#44830 SLAVE START no longer results in error if RESTORE is running on master

In fact, the START SLAVE command cannot return an error in this case. START SLAVE exits successfully as soon as the I/O thread and the SQL thread are started. It does not wait for the I/O thread to connect to the master. Until the I/O thread has connected to the master, the slave does not know the master's status, including whether a RESTORE command is running on it.

When a slave requests a binlog dump from a master on which a RESTORE command is running, the master sends an ER_MASTER_BLOCKING_SLAVES error to the slave and then closes the connection. The slave must report an error and then stop the I/O thread after it receives the error from the master. This patch adds code to report the error and then stop the I/O thread when the slave receives an ER_MASTER_BLOCKING_SLAVES error.
[5 Sep 2009 9:23]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/82524

2810 Li-Bing.Song@sun.com 2009-09-05

BUG#44830 SLAVE START no longer results in error if RESTORE is running on master

In fact, the START SLAVE command cannot return an error in this case. START SLAVE exits successfully as soon as the I/O thread and the SQL thread are started. It does not wait for the I/O thread to connect to the master. Until the I/O thread has connected to the master, the slave does not know the master's status, including whether a RESTORE command is running on it.

When a slave requests a binlog dump from a master on which a RESTORE command is running, the master sends an ER_MASTER_BLOCKING_SLAVES error to the slave and then closes the connection. The slave must report an error and then stop the I/O thread after it receives the error from the master. This patch adds code to report the error and then stop the I/O thread when the slave receives an ER_MASTER_BLOCKING_SLAVES error.
[15 Sep 2009 13:52]
Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090915134838-5nj3ycjfsqc2vr2f) (version source revid:li-bing.song@sun.com-20090905091947-qvhff5qgqugzr1tx) (merge vers: 5.4.4-alpha) (pib:11)
[16 Sep 2009 15:48]
Jon Stephens
Documented in the 5.4.4 changelog as follows:

START SLAVE succeeded even if the IO thread did not connect to the master.

Closed.