Bug #42893 main.information_schema times out sporadically
Submitted: 16 Feb 2009 16:56 Modified: 12 Nov 2009 19:13
Reporter: Alexander Nozdrin Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: General Severity:S3 (Non-critical)
Version:5.1, 5.4, 6.0 OS:Any
Assigned to: Guilhem Bichot CPU Architecture:Any
Tags: disabled, pushbuild, sporadic, test failure, timeout

[16 Feb 2009 16:56] Alexander Nozdrin
Description:
Symptoms:
--------------------------------------------------------------------
CURRENT_TEST: main.information_schema
--- C:/cygwin/home/pushbuild/pb1/pb/bzr_mysql-6.0-bugteam/551/mysql-6.0.10-alpha-pb551/mysql-test/r/information_schema.result	2009-02-16 14:18:44.000000000 +0300
+++ C:\cygwin\home\pushbuild\pb1\pb\bzr_mysql-6.0-bugteam\551\mysql-6.0.10-alpha-pb551\mysql-test\r\information_schema.reject	2009-02-16 15:18:38.343750000 +0300
@@ -1599,6 +1599,9 @@
 drop function f1;
 select * from information_schema.tables where 1=sleep(100000);
 select * from information_schema.columns where 1=sleep(100000);
+Timeout in wait_condition.inc for select count(*)=0 from information_schema.processlist
+where state='User sleep' and
+info='select * from information_schema.columns where 1=sleep(100000)'
 explain select count(*) from information_schema.tables;
 id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
 1	SIMPLE	tables	ALL	NULL	NULL	NULL	NULL	NULL	Skip_open_table; Scanned all databases
@@ -1814,3 +1817,9 @@
 routine_name	routine_type
 f1	FUNCTION
 drop function f1;
+# Timeout in wait_until_count_sessions.inc
+# Number of sessions   expected: 1   found: 2
+SHOW PROCESSLIST;
+Id	User	Host	db	Command	Time	State	Info
+2	root	localhost:4978	test	Query	0	NULL	SHOW PROCESSLIST
+20	root	localhost:1027	test	Killed	25	User sleep	select * from information_schema.columns where 1=sleep(100000)
--------------------------------------------------------------------

Reported after a failure on 2009-02-16 in 6.0-bugteam
(http://tinyurl.com/crcryt)

How to repeat:
XRef: http://tinyurl.com/bd47x9
[26 Feb 2009 9:12] Alexander Nozdrin
It's not Windows-specific, although on Windows it fails much more frequently:
http://tinyurl.com/b39bo2

XRef: http://tinyurl.com/d7jxlw
[27 Feb 2009 8:17] Matthias Leich
1. There is nothing wrong within the testscript.
2. The final problem is that the killed session did
   not disappear within the given timespan (10 seconds)
   from the processlist.
3. Tests on my box with heavy parallel load showed that the
   subtest for Bug#34166 never needed more than 2.5 seconds
   to finish.
   This means the 10 seconds per wait routine within the
   subtest (we run the core test twice) should be sufficient.
4. The bug report shows in addition a difference
   +# Timeout in wait_until_count_sessions.inc
   +# Number of sessions   expected: 1   found: 2
   +SHOW PROCESSLIST;
   +Id ... Command Time	State       Info
   +2  ... Query   0	NULL        SHOW PROCESSLIST
   +20 ... Killed  25	User sleep  select  * from 
                                    information_schema.columns
                                    where 1=sleep(100000)
   This is produced by a routine at the "END" of the test.
   So there are ~ 25 seconds between the "kill <session_id>"
   and the SHOW PROCESSLIST.

Either
- the "kill <session_id>" does not work in the situation
  met at all = The killed session will never disappear from
  the process list
- the "kill <session_id>" does not get enough "attention"
  inside the server
  IMHO it should have priority because it reduces the amount
  of resources used.

IMHO this is a server bug.
[23 Jun 2009 20:33] Guilhem Bichot
re-enabled test (but marked it experimental) in mysql-azalea to see if problem
exists with pushbuild2 like it did with pushbuild1.
[26 Jun 2009 8:49] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/77282

2805 Guilhem Bichot	2009-06-23
      re-enabling test main.information_schema, to see if
      Bug#42893 "main.information_schema times out sporadically" still happens even with pushbuild2.
      Bug is tagged experimental as it's under observation, quarantine...
     @ mysql-test/collections/default.experimental
        tag test as experimental
     @ mysql-test/r/information_schema.result
        test had been disabled for three months, so result has to be updated: proc.comment is now TEXT,
        backup.progress.*date columns were removed, 4-byte UTF8 is not in Azala.
     @ mysql-test/t/disabled.def
        enable test
[27 Jul 2009 9:32] Guilhem Bichot
So, what I see in pushbuild2 is that:
- the test is enabled in 5.1 and further versions (I had enabled it only in 5.4, but it looks like it is also enabled in 5.1)
- it times out, in 5.1 and further versions.
pushbuild2 URL is in a private post.
Either the test should be disabled again, or it should be debugged, but it is not a 5.4-specific problem.
[27 Jul 2009 10:39] Guilhem Bichot
Triage team: this test times out in 5.1 and 5.4 (see my previous posts of today, with links to pushbuild2 test failures). So far this bug was always considered as occuring only in 6.0/5.4, but we see it also happens in 5.1. I think this new fact calls for a re-triage.
[29 Jul 2009 20:44] Guilhem Bichot
I was fooled when I looked at pushbuild2 results in the previous recent posts; all timeouts there are in 5.1 and "server trunk", and "server trunk" == 5.1 (got me, I thought it was azalea).
So all timeouts are in 5.1; so the problem is fixed in azalea, where it had to be fixed according to triage, so I'm closing this bug report.
If we want to fix the problem in 5.1:
the timeouts seem to all originate in the portion of information_schema.test which tests the fix for Bug#18925; this portion accounts for 80% of the total test's time in 5.1 (disk-based run). This just seems to be two slow queries, which make disk-based testsuites time out on slow machines (all Solaris).
The reason why Azalea does not have this problem is: it does not have this test portion, which was already moved by konstantin@mysql.com-20080627154042-923m6lzk7z77lrgj into information_schema-big.test, because Konstantin had noticed the slow portion (feels like I'm doing double work here); he did this to fix a test timeout on OS X.
[29 Jul 2009 20:45] Guilhem Bichot
I will thus remove the "experimental" status of this test, next time I have a reason to push into Azalea.
[5 Aug 2009 9:16] Guilhem Bichot
I removed the "experimental" status
[5 Oct 2009 20:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85796

2886 Guilhem Bichot	2009-10-05
      Port of fix for BUG#42893 "main.information_schema times out sporadically"
      (from revision konstantin@mysql.com-20080627154042-923m6lzk7z77lrgj ). This moves the slow part
      (10 seconds over 13) into a separate big test.
[9 Oct 2009 8:48] Bugs System
Pushed into 6.0.14-alpha (revid:alik@ibmvm-20091009083208-0o0f0i9w1sq3c1kn) (version source revid:guilhem@mysql.com-20091005210234-vjlm0uj8y41x4jh8) (merge vers: 6.0.14-alpha) (pib:12)
[13 Oct 2009 13:22] Paul DuBois
Test case changes. No changelog entry needed.
[12 Nov 2009 8:22] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091110093229-0bh5hix780cyeicl) (version source revid:alik@ibmvm-20091009130916-0ijstmmz3efzx20g) (merge vers: 5.5.0-beta) (pib:13)
[12 Nov 2009 19:13] Paul DuBois
Test case changes. No changelog entry needed.