Bug #52528 | Protocol desynchronisation between client and server on a DDL workload | ||
---|---|---|---|
Submitted: | 1 Apr 2010 15:17 | Modified: | 12 Jul 2010 8:04 |
Reporter: | Philip Stoev | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | mysql-trunk,mysql-trunk-runtime-exp, 5.1, next-mr | OS: | Any |
Assigned to: | Assigned Account | CPU Architecture: | Any |
[1 Apr 2010 15:17]
Philip Stoev
[1 Apr 2010 15:46]
Philip Stoev
Last queries from the thread that handged: 152 Query SET SESSION SQL_MODE='' 152 Query SAVEPOINT A 152 Query SAVEPOINT B 152 Query FLUSH TABLE testdb_S . t1_temp1_S 152 Query COMMIT WORK 152 Query UNLOCK TABLES 152 Query FLUSH TABLE testdb_S . t1_view1_N 152 Query ROLLBACK WORK 152 Query COMMIT WORK NO RELEASE 152 Query CREATE TRIGGER testdb_N . tr1_1_N BEFORE DELETE ON t1_base2_N FOR EACH ROW BEGIN REPLACE testdb_N . t1_temp2_N ( `col_int` ) VALUES ( 6 ) ; END 152 Query SELECT * FROM test.table1_int_autoinc INTO OUTFILE '/tmp/gentest18505.tmp' 152 Query LOAD DATA INFILE '/tmp/gentest18505.tmp' INTO TABLE testdb_S . t1_temp1_S 152 Query START TRANSACTION 152 Query SELECT * FROM test.table1_int_autoinc INTO OUTFILE '/tmp/gentest18505.tmp' 152 Query LOAD DATA INFILE '/tmp/gentest18505.tmp' INTO TABLE testdb_N . t1_base1_N 152 Query /* Sequence start */ CREATE FUNCTION testdb_S . f1_2_S () RETURNS INTEGER RETURN ( SELECT MOD( COUNT( DISTINCT `pk` ) , 10 ) FROM testdb_S . t1_view2_S ) ; COMMIT ; SELECT SLEEP( 0.5 * 0.94 * 1 ) ; DROP FUNCTION testdb_S . f1_2_S /* Sequence end */ core,binary and logs: http://mysql-systemqa.s3.amazonaws.com/var-bug52528.zip source: revision-id: kostja@sun.com-20100327120632-xvhcrtay402f8x52 date: 2010-03-27 15:06:32 +0300 build-date: 2010-04-01 18:41:13 +0300 revno: 3138 branch-nick: mysql-trunk-runtime-exp
[13 Apr 2010 10:02]
Philip Stoev
Unfortunately this issue does not lend itself to automatic simplification, even if the RQG command line above reproduces it easily. It is possible that KILL and/or COMMIT|ROLLBACK RELEASE, which cause the RQG to establish a fresh connection, are required to reproduce.
[13 Apr 2010 15:40]
Sveta Smirnova
Bug #48515 can be duplicate of this one.
[21 Apr 2010 8:56]
Sveta Smirnova
See also bug #50399 Bug #19610 was marked as duplicate of this one.
[11 Jul 2010 20:56]
Elena Stepanova
After running some additional tests, I am quite certain that Philip´s initial RQG test and my MTR test used to hit the same problem: when an idle connection got killed, it did not disappear but stayed in 'Killed' forever. The RQG grammar contains KILL QUERY and KILL statements. In a number of failures that I checked, a hanging connection <ID> always appeared after some other connection(s) ran KILL QUERY <ID> followed by KILL <ID>. The race condition apparently is that at the moment of 'KILL' the previous query has to be already killed, but a new one should not start yet. The MTR test was initially designed to kill an idle thread, so it failed deterministically. I have modified it a little to avoid confusion about pass/fail, and to remove explicit sleeps. It can be found below as test case 'Kill Idle'. Both tests fail on 5.5 cmake builds before the change http://lists.mysql.com/commits/104251, and do not fail on builds which contain this change (mid April, depending on the tree). Bug#19610, which is marked as a duplicate of this one, initially reports two part of the problem: first, idle connection do not disappear after wait/interactive_timeout, and second, these connections, while killed manually get stuck in 'Killed'. The second part is exactly the same as we observed. However, while running tests on a SUSE 10.2 box, I could not get the failure on *normal* 5.1 builds, neither on source code from March-April, nor on the current trees. Both the life-after-timeout and the stay-in-killed problems are easily reproducible if I compile 5.1 explicitly with NO_ALARM (and disappear again if I add SIGNAL_WITH_VIO_CLOSE); I'm not sure however it is a valid experiment. Just in case, the test which shows both issues on a 5.1 "NO_ALARM" build can be found below as 'Kill Timed Out' test case. #--------------------- # Kill Idle test case # Should fail on a 5.5 cmake build before http://lists.mysql.com/commits/104251 # and pass after the change connect(con1,localhost,root,,test,,); let $con1_id = `SELECT CONNECTION_ID()`; connection default; let $wait_condition = SELECT command = 'Sleep' FROM information_schema.processlist WHERE id = $con1_id; --source include/wait_condition.inc eval KILL $con1_id; let $wait_timeout = 5; let $wait_condition = SELECT command = 'Killed' AND time > 3 FROM information_schema.processlist WHERE id = $con1_id; --source include/wait_condition.inc if ($success) { SHOW PROCESSLIST; --die # Hanging connection detected } --echo # No hanging connections detected --exit # EOF (Kill Idle test case) #--------------------- # Kill Timed Out test case connect(con1,localhost,root,,test,,); SET @@wait_timeout=3; SET @@interactive_timeout=3; let $con1_id = `SELECT CONNECTION_ID()`; connection default; let $wait_timeout=6; let $wait_condition = SELECT time > 4 FROM information_schema.processlist WHERE id = $con1_id; --source include/wait_condition.inc if (!$success) { --echo # No timed out connections detected, nothing to kill --exit } --echo # Timed out connection detected: SHOW PROCESSLIST; eval KILL $con1_id; let $wait_timeout = 3; let $wait_condition = SELECT time > 6 AND command = 'Killed' FROM information_schema.processlist WHERE id = $con1_id; --source include/wait_condition.inc if (!$success) { --echo # No hanging connections detected --die # Timed out connection, but no hang up } SHOW PROCESSLIST; --die # Timed out and hanging connection detected # EOF (Kill Timed Out test case)
[12 Jul 2010 8:04]
Dmitry Shulga
This is a duplicate of Bug#37780