Bug #55263 | assert in check_binlog_magic | ||
---|---|---|---|
Submitted: | 14 Jul 2010 14:54 | Modified: | 15 Nov 2010 13:19 |
Reporter: | Matthias Leich | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
Version: | 5.1.49-debug, 5.1.50-bzr | OS: | Any |
Assigned to: | Luis Soares | CPU Architecture: | Any |
Tags: | assert |
[14 Jul 2010 14:54]
Matthias Leich
[14 Jul 2010 15:11]
Valeriy Kravchuk
Verified on Mac OS X: valeriy-kravchuks-macbook-pro:mysql-test openxs$ ./mysql-test-run.pl --mem --mysqld=--binlog-format=statement --repeat=2 rpl_cross_version --skip-ndb --testcase-timeout=20 --suite=rpl rpl_cross_version Logging: ./mysql-test-run.pl --mem --mysqld=--binlog-format=statement --repeat=2 rpl_cross_version --skip-ndb --testcase-timeout=20 --suite=rpl rpl_cross_version ... Installing system database... Using server port 51143 ============================================================================== TEST RESULT TIME (ms) ------------------------------------------------------------ worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009 rpl.rpl_cross_version [ pass ] 892 rpl.rpl_cross_version [ fail ] Test ended at 2010-07-14 18:09:06 CURRENT_TEST: rpl.rpl_cross_version mysqltest: In included file "./include/wait_for_slave_param.inc": At line 68: Error running query 'SHOW SLAVE STATUS': 2013 Lost connection to MySQL server during query The result from queries just before the failure was: ==== Initialize ==== Setting up fake replication from MYSQL_TEST_DIR/suite/binlog/std_data/binlog_old_version_4_1.000001 ==== Test ==== start slave sql_thread; Server [mysqld.1 - pid: 74238, winpid: 74238, exit: 256] failed during test run Server log from this test: 100714 18:08:57 [Note] 'CHANGE MASTER TO executed'. Previous state master_host='dummy.localdomain', master_port='3306', master_log_file='', master_log_pos='4'. New state master_host='dummy.localdomain', master_port='3306', master_log_file='', master_log_pos='4'. 100714 18:08:57 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log 'slave-relay-bin-fake.000001' position: 4 100714 18:08:57 [Warning] Statement may not be safe to log in statement format. Statement: INSERT INTO t1 VALUES (RAND() * 1000000, 'random') 100714 18:08:57 [Note] next log './slave-relay-bin.000001' is currently active Assertion failed: (((log)->pos_in_file + (size_t) (*(log)->current_pos - (log)->request_pos)) == 0), function unknown function, file log.cc, line 1744. 100714 18:08:57 - mysqld got signal 6 ; ...
[22 Jul 2010 10:50]
Luis Soares
The following test case seems to trigger the same assertion: -- source include/master-slave.inc -- source include/have_binlog_format_statement.inc -- connection slave -- source include/stop_slave.inc FLUSH LOGS; FLUSH LOGS; FLUSH LOGS; SHOW MASTER LOGS; --query_vertical SHOW SLAVE STATUS --eval change master to relay_log_file='./slave-relay-bin.000006', relay_log_pos=0; show variables like 'relay%'; START SLAVE SQL_THREAD; ## Ooopsss DBA was wrong, he meant slave-relay-bin.000004 instead of ## slave-relay-bin.000006 ## So he reverts changes introduced by slave-relay-bin.000006 and ## changes back to slave-relay-bin.000004 ## Some DELETEs and DROPs are manually done by the DBA STOP SLAVE SQL_THREAD; --eval change master to relay_log_file='./slave-relay-bin.000004', relay_log_pos=0; show variables like 'relay%'; START SLAVE SQL_THREAD; -- exit
[23 Jul 2010 14:38]
Valeriy Kravchuk
Verified that new test case from Luis fails in 5.1.50 from bzr: macbook-pro:mysql-test openxs$ ./mtr bug55263 Logging: ./mtr bug55263 100723 17:35:39 [Warning] Setting lower_case_table_names=2 because file system for /var/folders/dX/dXCzvuSlHX4Op1g-o1jIWk+++TI/-Tmp-/wpzEbOYj9Y/ is case insensitive 100723 17:35:39 [Note] Plugin 'FEDERATED' is disabled. 100723 17:35:39 [Note] Plugin 'ndbcluster' is disabled. MySQL Version 5.1.50 Checking supported features... - using ndbcluster when necessary, mysqld supports it - SSL connections supported - binaries are debug compiled Collecting tests... vardir: /Users/openxs/dbs/5.1/mysql-test/var Checking leftover processes... Removing old var directory... - WARNING: Using the 'mysql-test/var' symlink Creating var directory '/Users/openxs/dbs/5.1/mysql-test/var'... Installing system database... Using server port 59732 ============================================================================== TEST RESULT TIME (ms) ------------------------------------------------------------ worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009 main.bug55263 [ fail ] Test ended at 2010-07-23 17:35:55 CURRENT_TEST: main.bug55263 --- /Users/openxs/dbs/5.1/mysql-test/r/bug55263.result 2010-07-23 17:35:32.000000000 +0300 +++ /Users/openxs/dbs/5.1/mysql-test/r/bug55263.reject 2010-07-23 17:35:50.000000000 +0300 @@ -0,0 +1,74 @@ +stop slave; +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9; +reset master; +reset slave; +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9; +start slave; +include/stop_slave.inc +FLUSH LOGS; +FLUSH LOGS; +FLUSH LOGS; +SHOW MASTER LOGS; +Log_name File_size +slave-bin.000001 149 +slave-bin.000002 149 +slave-bin.000003 149 +slave-bin.000004 106 +SHOW SLAVE STATUS; +Slave_IO_State +Master_Host 127.0.0.1 +Master_User root +Master_Port 13000 +Connect_Retry 1 +Master_Log_File master-bin.000001 +Read_Master_Log_Pos 106 +Relay_Log_File slave-relay-bin.000003 +Relay_Log_Pos 252 +Relay_Master_Log_File master-bin.000001 +Slave_IO_Running No +Slave_SQL_Running No +Replicate_Do_DB +Replicate_Ignore_DB +Replicate_Do_Table +Replicate_Ignore_Table +Replicate_Wild_Do_Table +Replicate_Wild_Ignore_Table +Last_Errno 0 +Last_Error +Skip_Counter 0 +Exec_Master_Log_Pos 106 +Relay_Log_Space 872 +Until_Condition None +Until_Log_File +Until_Log_Pos 0 +Master_SSL_Allowed No +Master_SSL_CA_File +Master_SSL_CA_Path +Master_SSL_Cert +Master_SSL_Cipher +Master_SSL_Key +Seconds_Behind_Master NULL +Master_SSL_Verify_Server_Cert No +Last_IO_Errno 0 +Last_IO_Error +Last_SQL_Errno 0 +Last_SQL_Error +change master to relay_log_file='./slave-relay-bin.000006', relay_log_pos=0; ; +show variables like 'relay%'; +Variable_name Value +relay_log slave-relay-bin +relay_log_index +relay_log_info_file relay-log.info +relay_log_purge OFF +relay_log_space_limit 0 +START SLAVE SQL_THREAD; +STOP SLAVE SQL_THREAD; +change master to relay_log_file='./slave-relay-bin.000004', relay_log_pos=0; ; +show variables like 'relay%'; +Variable_name Value +relay_log slave-relay-bin +relay_log_index +relay_log_info_file relay-log.info +relay_log_purge OFF +relay_log_space_limit 0 +START SLAVE SQL_THREAD; mysqltest: Result length mismatch Server [mysqld.2 - pid: 93824, winpid: 93824, exit: 256] failed during test run Server log from this test: 100723 17:35:43 [Warning] '--default-character-set' is deprecated and will be removed in a future release. Please use '--character-set-server' instead. 100723 17:35:43 [Warning] 'for replication startup options' is deprecated and will be removed in a future release. Please use ''CHANGE MASTER'' instead. 100723 17:35:43 [Warning] Setting lower_case_table_names=2 because file system for /Users/openxs/dbs/5.1/mysql-test/var/mysqld.2/data/ is case insensitive 100723 17:35:43 [Note] Plugin 'FEDERATED' is disabled. 100723 17:35:43 [Note] Plugin 'InnoDB' is disabled. 100723 17:35:43 [Note] Plugin 'ndbcluster' is disabled. 100723 17:35:44 [Note] Event Scheduler: Loaded 0 events 100723 17:35:44 [Note] /Users/openxs/dbs/5.1/libexec/mysqld: ready for connections. Version: '5.1.50-debug-log' socket: '/Users/openxs/dbs/5.1/mysql-test/var/tmp/mysqld.2.sock' port: 13001 Source distribution 100723 17:35:46 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log './slave-relay-bin.000001' position: 4 100723 17:35:46 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:13000',replication started in log 'FIRST' at position 4 100723 17:35:46 [Note] next log './slave-relay-bin.000002' is currently active 100723 17:35:47 [Note] next log './slave-relay-bin.000003' is currently active 100723 17:35:47 [Note] Slave I/O thread killed while reading event 100723 17:35:47 [Note] Slave I/O thread exiting, read up to log 'master-bin.000001', position 106 100723 17:35:47 [Note] Error reading relay log event: slave SQL thread was killed 100723 17:35:50 [Note] 'CHANGE MASTER TO executed'. Previous state master_host='127.0.0.1', master_port='13000', master_log_file='master-bin.000001', master_log_pos='106'. New state master_host='127.0.0.1', master_port='13000', master_log_file='master-bin.000001', master_log_pos='106'. 100723 17:35:50 [Note] Slave SQL thread initialized, starting replication in log 'master-bin.000001' at position 106, relay log './slave-relay-bin.000006' position: 4 100723 17:35:50 [Note] Slave SQL thread exiting, replication stopped in log 'master-bin.000001' at position 106 100723 17:35:50 [Note] 'CHANGE MASTER TO executed'. Previous state master_host='127.0.0.1', master_port='13000', master_log_file='master-bin.000001', master_log_pos='106'. New state master_host='127.0.0.1', master_port='13000', master_log_file='master-bin.000001', master_log_pos='106'. 100723 17:35:50 [Note] Slave SQL thread initialized, starting replication in log 'master-bin.000001' at position 106, relay log './slave-relay-bin.000004' position: 4 100723 17:35:50 [Note] next log './slave-relay-bin.000005' is not active 100723 17:35:50 [Note] next log './slave-relay-bin.000006' is currently active Assertion failed: (((log)->pos_in_file + (size_t) (*(log)->current_pos - (log)->request_pos)) == 0), function unknown function, file log.cc, line 1739. 100723 17:35:50 - mysqld got signal 6 ; ...
[23 Jul 2010 16:33]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/114246 3489 Luis Soares 2010-07-23 BUG#55263: assert in check_binlog_magic The procedure for setting up a fake binary log, by changing the relay log files manually, is run twice when we issue mtr with --repeat=2. However, when setting it up for the second time, neither the sql thread is reset nor the server is restarted. This means that previous stale relay log IO_CACHE - from first run - is left around. As a consequence, when the test is run for the second time, the IO_CACHE for the relay log has position in file different than zero, triggering the assertion. We fix this by changing the call to my_b_seek at the very beginning of init_master_info. Instead of doing a seek on the cur_log we now do a seek on the currently active relay log. This makes sure that even if the SQL thread had read the log while it was hot in the past, the position will be set to 0 if it continues hot when the SQL thread is restarted and switches back to it. If, on the other hand, the log became cold while the SQL was stopped and restarted, then when the thread switches back to it the log will have to be reopened, thus the position will start from the 0, meaning that there is no need to do a seek.
[24 Sep 2010 9:46]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/119015 3515 Luis Soares 2010-09-24 BUG#55263: assert in check_binlog_magic The procedure for setting up a fake binary log, by changing the relay log files manually, is run twice when we issue mtr with --repeat=2. However, when setting it up for the second time, neither the sql thread is reset nor the server is restarted. This means that previous stale relay log IO_CACHE metadata - from first run - is left around. As a consequence, when the test is run for the second time, the IO_CACHE for the relay log has position in file different than zero, triggering the assertion. We fix this by deploying a call to my_b_seek before calling check_binlog_magic in next_event. This prevents the server from asserting, in the cases that the SQL thread was reads from a hot log (relay.NNNNN), then is stopped, then is restarted from a previous cold log (relay.MMMMM), and then it reaches the hot log relay.NNNNN again, in which case, the read coordinates are not set to zero, but to the old values. Additionally, some comments to the source code were added.
[13 Oct 2010 10:34]
Luis Soares
Queued in mysql-5.1-bugteam, mysql-5.5-bugteam and mysql-trunk-merge: - http://pb2.norway.sun.com/web.py?template=push_details&push=1647867 - http://pb2.norway.sun.com/web.py?template=push_details&push=1647840 - http://pb2.norway.sun.com/web.py?template=push_details&push=1647834
[29 Oct 2010 15:05]
Jon Stephens
Documented in the 5.1.53 and 5.5.8 changelogs as follows: When making changes to relay log settings using CHANGE MASTER TO, the IO cache was not cleared. This could result in replication failure when the slave attempted to read stale data from the cache and stopped with an assertion. Set Need Merge, waiting for push to trunk.
[4 Nov 2010 17:28]
Jon Stephens
Already documented in 5.5; waiting for merge to trunk.
[13 Nov 2010 16:24]
Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113152450-2zzcm50e7i4j35v7) (merge vers: 5.6.1-m4) (pib:21)
[13 Nov 2010 16:35]
Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (pib:21)
[15 Nov 2010 13:19]
Jon Stephens
Does not appear in a 5.6 release; no additional changelog entries required. Closed.
[18 Nov 2010 15:53]
Bugs System
Pushed into mysql-5.1 5.1.54 (revid:build@mysql.com-20101118153531-693taxtxyxpt037i) (version source revid:build@mysql.com-20101118153531-693taxtxyxpt037i) (merge vers: 5.1.54) (pib:21)
[16 Dec 2010 22:29]
Bugs System
Pushed into mysql-5.5 5.5.9 (revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (version source revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (merge vers: 5.5.9) (pib:24)