| Bug #58424 | mtr ignores failing mysqltest in the presence of expect file for mysqld | ||
|---|---|---|---|
| Submitted: | 23 Nov 2010 18:20 | Modified: | 20 Dec 2010 3:08 |
| Reporter: | Luis Soares | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | Tools: MTR / mysql-test-run | Severity: | S3 (Non-critical) |
| Version: | 5.5 | OS: | Any |
| Assigned to: | Bjørn Munch | CPU Architecture: | Any |
[23 Nov 2010 19:00]
Bjørn Munch
I was able to reproduce the problem running binlog.binlog_index. What actually happens in this test is that one of the "purge" commands succeeds in stead of failing with error 2013. And then mtr picks up the crashed mysqltest *before* it sees mysqld. With verbose logging, we see: --- worker[1] > Started [mysqld.1 - pid: 11319, winpid: 11319] worker[1] > Got [mysqltest - pid: 11303, winpid: 11303, exit: 256] worker[1] > Crash was expected, file '/export/home/tmp/bm/mysql-5.5.8-solaris10-i386/mysql-test/var/tmp/mysqld.1.expect' exists --- Oops! It restarts mysqld and that seems to work since the previous mysqld has now also died. But then mtr has probably forgotten that mysqltest has died and waits for it until timeout. So the test *should* have failed: --- mysqltest: At line 130: query 'purge binary logs TO 'master-bin.000006'' succeeded - should have failed with errno 2013... --- (It varies from run to run which one failes)
[24 Nov 2010 9:24]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/124817 2952 Bjorn Munch 2010-11-24 Bug #58424 mtr ignores failing mysqltest in the presence of expect file for mysqld If mysqltest dies, mtr waits to see if mysqld dies too within 100ms But in that case, it should not care about expected crash Fix: jump past the code that checks the expect file
[27 Nov 2010 16:51]
Paul DuBois
Changes to test suite. No changelog entry needed.
[5 Dec 2010 12:38]
Bugs System
Pushed into mysql-trunk 5.6.1 (revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (version source revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (merge vers: 5.6.1) (pib:23)
[16 Dec 2010 22:26]
Bugs System
Pushed into mysql-5.5 5.5.9 (revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (version source revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (merge vers: 5.5.9) (pib:24)
[17 Dec 2010 12:49]
Bugs System
Pushed into mysql-5.1 5.1.55 (revid:georgi.kodinov@oracle.com-20101217124435-9imm43geck5u55qw) (version source revid:mats.kindahl@oracle.com-20101201193331-1c07sjno2g7m46ix) (merge vers: 5.1.55) (pib:24)

Description: Some tests that crash and restart the server can end up timing out in PB2 (solaris10 x86[_64] debug max). The most noticeable example is the test case binlog_index.test which makes heavy use of DBUG_SUICIDE + mtr ability to bring the server back up. The failure symptoms are the following: binlog.binlog_index 'mix' [ fail ] timeout after 900 seconds Test ended at 2010-11-22 18:03:16 Test case timeout after 900 seconds == /export/home/pb2/test/sb_2-2566369-1290432150.42/mysql-5.5.8-solaris10-x86_64/mysql-test/var/log/binlog_index.log == show warnings; Level Code Message Warning 1377 a problem with deleting master-bin.000001; consider examining correspondence of your binlog index file to the actual binlog files Error 1377 Fatal error during log purge reset master; # crash_purge_before_update_index flush logs; SET SESSION debug="+d,crash_purge_before_update_index"; purge binary logs TO 'master-bin.000002'; ERROR HY000: Lost connection to MySQL server during query SET @index=LOAD_FILE('MYSQLTEST_VARDIR/mysqld.1/data//master-bin.index'); SELECT @index; @index master-bin.000001 master-bin.000002 master-bin.000003 # crash_purge_non_critical_after_update_index flush logs; SET SESSION debug="+d,crash_purge_non_critical_after_update_index"; == /export/home/pb2/test/sb_2-2566369-1290432150.42/mysql-5.5.8-solaris10-x86_64/mysql-test/var/tmp/analyze-timeout-mysqld.1.err == mysqltest: Could not open connection 'default' after 500 attempts: 2002 Can't connect to local MySQL server through socket '/export/home/pb2/test/sb_2-2566369-1290432150.42/tmp/zhGPHhyrmh/mysqld.1.sock' (146) - saving '/export/home/pb2/test/sb_2-2566369-1290432150.42/mysql-5.5.8-solaris10-x86_64/mysql-test/var/log/binlog.binlog_index-mix/' to '/export/home/pb2/test/sb_2-2566369-1290432150.42/mysql-5.5.8-solaris10-x86_64/mysql-test/var/log/binlog.binlog_index-mix/' - found 'core' (0/5) Trying 'dbx' to get a backtrace sh: dbx: not found Trying 'gdb' to get a backtrace sh: gdb: not found Retrying test binlog.binlog_index, attempt(2/3)... worker[1] Trying to dump core for [mysqltest - pid: 1327, winpid: 1327, exit: 256] worker[1] Trying to dump core for [mysqld.1 - pid: 1333, winpid: 1333] Looking at the logs one finds that: 1. the server is restarted 2. mysqld has executed the offending statement (the one that causes the crash) 3. mysqltest/mtr have not logged the offending event nor the error in the test case log Execution control seems not to be returned to the test driver once the connection to mysqld is broken, suggesting that there is something not allowing mysqltest to acknowledge that the connection is no more. Finally, this happens very sporadically for main.crash_commit_before and not that sporadically for binlog.binlog_index . How to repeat: See PB2 logs: http://pb2.norway.sun.com/?action=archive_download&archive_id=2568354&pretty=please OR 1. ssh skade12 2. <debug build mysql> 3. cd mysql-test 4. while [ $? == 0 ]; do perl mtr main.crash_commit_before; done Wait until the execution hangs. Suggested fix: .