Bug #56442 Slave executes delayed statements when STOP SLAVE is issued
Submitted: 1 Sep 2010 9:23 Modified: 30 Nov 2010 22:20
Reporter: Giuseppe Maxia Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:5.6.0 OS:MacOS
Assigned to: Sven Sandberg CPU Architecture:Any
Tags: DELAYED, replication, slave, stop

[1 Sep 2010 9:23] Giuseppe Maxia
Description:
My understanding is that delayed replication can be used to recover from a human mistake. If such mistake happen before the delayed statement reaches the slave, we can use the slave data to recover from the disaster in the master.
However, it doesn't work this way:

1) set MASTER_DELAY=3600
2) START SLAVE
3) the master drops a table
4) check that the slave has not replicated the statement.
5) Realize that dropping the table was a mistake.
6) run STOP SLAVE before the 3600 seconds have elapsed, with the idea of using the slave data for recovering from the mistake.
7) however, STOP SLAVE causes the execution of the pending statement. 
 
To be clear, in step #7, the DROP TABLE statement was more than 3000 seconds away. Yet, when issuing STOP SLAVE, which is supposed to stop the SQL thread, the DROP statement runs IMMEDIATELY on the slave.

Notice that the manual warns about STOP SLAVE waiting for statements affecting non-transactional tables to be executed.
http://dev.mysql.com/doc/refman/5.6/en/stop-slave.html
I observe that a DROP TABLE issued on a InnoDB table should not qualify as an event to wait for, i.e. the STOP SLAVE statement should just stop.

Something is definitely wrong here. 
If this behavior is intended, then it defies the purpose of delayed replication. Consider the case where you realize that there was a mistake, and you have 1 minute before the delayed statement runs. How do you recover a 50 GB table that was dropped on the master? The natural action, running STOP SLAVE, will rush into the abyss instead of braking.

How to repeat:
Using MySQL Sandbox, create a normal replication system.
make_replication_sandbox /path/to/mysql.5.6.0-binary.tar.gz

then, run this shell script:

./s2 -e 'stop slave'
./s2 -e 'change master to master_delay=10'
./s2 -e 'start slave'
./use_all 'show slave status\G' | grep 'server\|Running:\|Delay:'

./m test -v -e 'create table test.t1(i int) engine=innodb'
echo "slave 1: expect 1 table"
./s1 -e 'show tables from test'
echo "slave 2: expect 0 tables"
./s2  -e 'show tables from test'

echo 'sleeping 10 seconds'
sleep 10

echo "slave 1: expect 1 table"
./s1 -e 'show tables from test'
echo "slave 2: expect 1 table"
./s2 -e 'show tables from test'

./m -v -e 'drop table test.t1'
echo "slave 1: expect 0 tables"
./s1 -e 'show tables from test'
echo "slave 2: expect 1 table"
./s2 -e 'show tables from test'

./use_all 'show slave status\G' | grep 'server\|Running:\|Delay:'
./s2 -v -e 'stop slave SQL_THREAD'
echo "slave 2: expect 1 table"
./s2 -vv -e 'show tables from test'
./s2 -v -e 'start slave'

Suggested fix:
make sure that STOP SLAVE effectively and immediately stops the SQL thread.
[1 Sep 2010 9:32] Giuseppe Maxia
Here's a sample run of the test:

# server: 1: 
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
# server: 2: 
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                    SQL_Delay: 10
          SQL_Remaining_Delay: NULL

### slave 2 is delayed

--------------
create table test.t1(i int) engine=innodb
--------------

slave 1: expect 1 table
+----------------+
| Tables_in_test |
+----------------+
| t1             |
+----------------+
slave 2: expect 0 tables

### no table creation has been replicated to slave 2 yet.

sleeping 10 seconds
slave 1: expect 1 table
+----------------+
| Tables_in_test |
+----------------+
| t1             |
+----------------+
slave 2: expect 1 table
+----------------+
| Tables_in_test |
+----------------+
| t1             |
+----------------+

### after the delay, the table is replicated
### So far, so good.

--------------
drop table test.t1
--------------

slave 1: expect 0 tables

### the table is dropped on the first (non delayed) slave.

slave 2: expect 1 table
+----------------+
| Tables_in_test |
+----------------+
| t1             |
+----------------+

### the table is still alive in the delayed slave

# server: 1: 
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
# server: 2: 
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                    SQL_Delay: 10
          SQL_Remaining_Delay: 10

### The DROP TABLE statement is still 10 seconds away

--------------
stop slave SQL_THREAD
--------------
### we want to freeze the slave.

slave 2: expect 1 table
--------------
show tables from test
--------------

Empty set (0.00 sec)

### The STOP SLAVE statement had the unpleasant side effect of rushing
### DROP TABLE to execution.

Bye
--------------
start slave
--------------
[2 Sep 2010 8:18] Sven Sandberg
Thanks, I can verify this:

--source include/master-slave.inc
--connection slave
STOP SLAVE;
CHANGE MASTER TO MASTER_DELAY = 3600;
START SLAVE;
--echo [on master]
--connection master
CREATE TABLE t1 (a INT);
SHOW TABLES IN test;
--echo [on slave]
--connection slave
--sleep 10
SHOW TABLES IN test;
STOP SLAVE;
--sleep 2
SHOW TABLES IN test;

At STOP SLAVE, the SQL thread executes the binlog event that the SQL thread waited for. The test case for WL#344 (rpl_delayed_slave.test) did not find this because it only tests DML, which in the binlog is surrounded by two binlog events containing BEGIN and COMMIT. So the BEGIN was executed, but that was not noticed because the test case only checked if the statement was executed.
[10 Sep 2010 16:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118017

3285 Sven Sandberg	2010-09-10
      BUG#56442: Slave executes delayed statements when STOP SLAVE is issued
      Problem:
      When using the delayed slave feature, and the SQL thread is delaying,
      and the user issues STOP SLAVE, the event we wait for was executed.
      It should not be executed.
      Fix:
      Check the return value from the delay function,
      rpl_slave.cc:safe_sleep(). If the return value is 1, it means the thread
      has been stopped, in this case we don't execute the statement.
      
      Also, refactored the test case for delayed slave a little: added the
      test script include/rpl_assert.inc, which asserts that a condition holds
      and prints a message if not. Made rpl_delayed_slave.test use this. The
      advantage is that the test file is much easier to read and maintain,
      because it is clear what is an assertion and what is not, and also the
      expected result can be found in the test file, you don't have to compare
      it to the result file.
     @ mysql-test/extra/rpl_tests/delayed_slave_wait_on_query.inc
        Made test use assertions instead of print to the result file.
     @ mysql-test/include/rpl_assert.inc
        Added file to check test assertions.
     @ mysql-test/include/wait_for_slave_param.inc
        Made the parameter $slave_timeout be seconds, not tenths of seconds,
        because nobody uses tenths of seconds.
     @ mysql-test/suite/rpl/r/rpl_delayed_slave.result
        updated result file
     @ mysql-test/suite/rpl/t/rpl_delayed_slave.test
         - Added test for BUG#56442
         - Made test use assertions instead of print to result file
         - Removed the line SELECT STATE FROM I_S.PROCESSLIST... because it
           did not seem to test anything useful.
     @ sql/rpl_slave.cc
         - Avoid executing the waited-for event when user does STOP SLAVE,
           by checking the return value from safe_sleep.
         - Documented safe_sleep.
[13 Sep 2010 13:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118082

3285 Sven Sandberg	2010-09-13
      BUG#56442: Slave executes delayed statements when STOP SLAVE is issued
      Problem:
      When using the delayed slave feature, and the SQL thread is delaying,
      and the user issues STOP SLAVE, the event we wait for was executed.
      It should not be executed.
      Fix:
      Check the return value from the delay function,
      rpl_slave.cc:safe_sleep(). If the return value is 1, it means the thread
      has been stopped, in this case we don't execute the statement.
      
      Also, refactored the test case for delayed slave a little: added the
      test script include/rpl_assert.inc, which asserts that a condition holds
      and prints a message if not. Made rpl_delayed_slave.test use this. The
      advantage is that the test file is much easier to read and maintain,
      because it is clear what is an assertion and what is not, and also the
      expected result can be found in the test file, you don't have to compare
      it to the result file.
     @ mysql-test/extra/rpl_tests/delayed_slave_wait_on_query.inc
        Made test use assertions instead of print to the result file.
     @ mysql-test/include/rpl_assert.inc
        Added file to check test assertions.
     @ mysql-test/include/wait_for_slave_param.inc
        Made the parameter $slave_timeout be seconds, not tenths of seconds,
        because nobody uses tenths of seconds.
     @ mysql-test/suite/rpl/r/rpl_delayed_slave.result
        updated result file
     @ mysql-test/suite/rpl/t/rpl_delayed_slave.test
         - Added test for BUG#56442
         - Made test use assertions instead of print to result file
         - Removed the line SELECT STATE FROM I_S.PROCESSLIST... because it
           did not seem to test anything useful.
     @ sql/rpl_slave.cc
         - Avoid executing the waited-for event when user does STOP SLAVE,
           by checking the return value from safe_sleep.
         - Documented safe_sleep.
[13 Sep 2010 15:04] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118104

3285 Sven Sandberg	2010-09-13
      BUG#56442: Slave executes delayed statements when STOP SLAVE is issued
      Problem:
      When using the delayed slave feature, and the SQL thread is delaying,
      and the user issues STOP SLAVE, the event we wait for was executed.
      It should not be executed.
      Fix:
      Check the return value from the delay function,
      rpl_slave.cc:safe_sleep(). If the return value is 1, it means the thread
      has been stopped, in this case we don't execute the statement.
      
      Also, refactored the test case for delayed slave a little: added the
      test script include/rpl_assert.inc, which asserts that a condition holds
      and prints a message if not. Made rpl_delayed_slave.test use this. The
      advantage is that the test file is much easier to read and maintain,
      because it is clear what is an assertion and what is not, and also the
      expected result can be found in the test file, you don't have to compare
      it to the result file.
     @ mysql-test/extra/rpl_tests/delayed_slave_wait_on_query.inc
        Made test use assertions instead of print to the result file.
     @ mysql-test/include/rpl_assert.inc
        Added file to check test assertions.
     @ mysql-test/include/wait_for_slave_param.inc
        Made the parameter $slave_timeout be seconds, not tenths of seconds,
        because nobody uses tenths of seconds.
     @ mysql-test/suite/rpl/r/rpl_delayed_slave.result
        updated result file
     @ mysql-test/suite/rpl/t/rpl_delayed_slave.test
         - Added test for BUG#56442
         - Made test use assertions instead of print to result file
         - Removed the line SELECT STATE FROM I_S.PROCESSLIST... because it
           did not seem to test anything useful.
     @ sql/rpl_slave.cc
         - Avoid executing the waited-for event when user does STOP SLAVE,
           by checking the return value from safe_sleep.
         - Documented safe_sleep.
[14 Sep 2010 11:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118192

3287 Sven Sandberg	2010-09-14
      BUG#56442: Slave executes delayed statements when STOP SLAVE is issued
      Problem:
      When using the delayed slave feature, and the SQL thread is delaying,
      and the user issues STOP SLAVE, the event we wait for was executed.
      It should not be executed.
      Fix:
      Check the return value from the delay function,
      rpl_slave.cc:safe_sleep(). If the return value is 1, it means the thread
      has been stopped, in this case we don't execute the statement.
      
      Also, refactored the test case for delayed slave a little: added the
      test script include/rpl_assert.inc, which asserts that a condition holds
      and prints a message if not. Made rpl_delayed_slave.test use this. The
      advantage is that the test file is much easier to read and maintain,
      because it is clear what is an assertion and what is not, and also the
      expected result can be found in the test file, you don't have to compare
      it to the result file.
     @ mysql-test/extra/rpl_tests/delayed_slave_wait_on_query.inc
        Made test use assertions instead of print to the result file.
     @ mysql-test/include/rpl_assert.inc
        Added file to check test assertions.
     @ mysql-test/include/wait_for_slave_param.inc
        Made the parameter $slave_timeout be seconds, not tenths of seconds,
        because nobody uses tenths of seconds.
     @ mysql-test/suite/rpl/r/rpl_delayed_slave.result
        updated result file
     @ mysql-test/suite/rpl/t/rpl_delayed_slave.test
         - Added test for BUG#56442
         - Made test use assertions instead of print to result file
         - Removed the line SELECT STATE FROM I_S.PROCESSLIST... because it
           did not seem to test anything useful.
     @ sql/rpl_slave.cc
         - Avoid executing the waited-for event when user does STOP SLAVE,
           by checking the return value from safe_sleep.
         - Documented safe_sleep.
[16 Sep 2010 11:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118381

3290 Sven Sandberg	2010-09-16
      BUG#56442: Slave executes delayed statements when STOP SLAVE is issued
      Problem:
      When using the delayed slave feature, and the SQL thread is delaying,
      and the user issues STOP SLAVE, the event we wait for was executed.
      It should not be executed.
      Fix:
      Check the return value from the delay function,
      rpl_slave.cc:safe_sleep(). If the return value is 1, it means the thread
      has been stopped, in this case we don't execute the statement.
      
      Also, refactored the test case for delayed slave a little: added the
      test script include/rpl_assert.inc, which asserts that a condition holds
      and prints a message if not. Made rpl_delayed_slave.test use this. The
      advantage is that the test file is much easier to read and maintain,
      because it is clear what is an assertion and what is not, and also the
      expected result can be found in the test file, you don't have to compare
      it to the result file.
     @ mysql-test/extra/rpl_tests/delayed_slave_wait_on_query.inc
        Made test use assertions instead of print to the result file.
     @ mysql-test/include/rpl_assert.inc
        Added file to check test assertions.
     @ mysql-test/include/wait_for_slave_param.inc
        Made the parameter $slave_timeout be seconds, not tenths of seconds,
        because nobody uses tenths of seconds.
     @ mysql-test/suite/rpl/r/rpl_delayed_slave.result
        updated result file
     @ mysql-test/suite/rpl/t/rpl_delayed_slave.test
         - Added test for BUG#56442
         - Made test use assertions instead of print to result file
         - Removed the line SELECT STATE FROM I_S.PROCESSLIST... because it
           did not seem to test anything useful.
     @ sql/rpl_slave.cc
         - Avoid executing the waited-for event when user does STOP SLAVE,
           by checking the return value from safe_sleep.
         - Documented safe_sleep.
[16 Sep 2010 13:06] Sven Sandberg
pushed to next-mr-bugfixing
[2 Oct 2010 18:15] Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101002181053-6iotvl26uurcoryp) (version source revid:alexander.nozdrin@oracle.com-20101002180917-h0n62akupm3z20nt) (pib:21)
[13 Nov 2010 16:14] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[30 Nov 2010 22:20] Jon Stephens
Does not appear in any release; no changelog entry required. Closed.