Bug #45299 Test "partition_alter4_innodb" is taking too long, timeout
Submitted: 3 Jun 2009 11:21 Modified: 23 Sep 2010 15:17
Reporter: Joerg Bruehe Email Updates:
Status: Duplicate Impact on me:
None 
Category:Tests: Server Severity:S3 (Non-critical)
Version:5.1.33, mysql-5.1-telco-7.0 OS:Solaris (Sparc)
Assigned to: CPU Architecture:Any
Tags: experimental, pb2, sporadic, test failure

[3 Jun 2009 11:21] Joerg Bruehe
Description:
See these results for test "partition_alter4_innodb" during the build of cluster-7.0.6:

linux-x86                     cluster-com-pro   [ pass ]  134798
linux-x86                     cluster-gpl       [ pass ]  137969
solaris9-x86                  cluster-gpl       [ pass ]  138519
linux-x86_64                  cluster-gpl       [ pass ]  139931
solaris9-x86-pkgadd           cluster-gpl       [ pass ]  141058
linux-x86_64                  cluster-com-pro   [ pass ]  141731
solaris9-x86-pkgadd           cluster-com-pro   [ pass ]  142952
solaris9-x86                  cluster-com-pro   [ pass ]  145480
solaris10-x86_64-pkgadd       cluster-gpl       [ pass ]  151134
solaris10-x86                 cluster-com-pro   [ pass ]  156032
solaris10-x86-pkgadd          cluster-gpl       [ pass ]  157017
solaris10-x86-pkgadd          cluster-com-pro   [ pass ]  159696
solaris10-x86_64-pkgadd       cluster-com-pro   [ pass ]  165567
solaris10-x86                 cluster-gpl       [ pass ]  166632
solaris10-x86_64              cluster-com-pro   [ pass ]  195942
solaris10-x86_64              cluster-gpl       [ pass ]  202609
osx10.5-x86_64                cluster-com-pro   [ pass ]  471552
osx10.5-x86_64                cluster-gpl       [ pass ]  475494
osx10.5-x86                   cluster-com-pro   [ pass ]  521645
osx10.5-x86                   cluster-gpl       [ pass ]  564001
winx64                        cluster-gpl       [ pass ]  675312
win32                         cluster-com-pro   [ pass ]  678560
winx64                        cluster-com-pro   [ pass ]  686960
solaris10-sparc-64bit         cluster-com-pro   [ pass ]  764508
solaris9-sparc-32bit-pkgadd   cluster-gpl       [ pass ]  820507
solaris10-sparc-64bit-pkgadd  cluster-com-pro   [ pass ]  824452
solaris9-sparc-32bit          cluster-gpl       [ pass ]  826500
solaris10-sparc-64bit         cluster-gpl       [ pass ]  843598
solaris10-sparc-64bit-pkgadd  cluster-gpl       [ pass ]  857785
solaris9-sparc-64bit          cluster-gpl       [ pass ]  866625
solaris9-sparc-64bit-pkgadd   cluster-gpl       [ pass ]  869951
solaris9-sparc-64bit          cluster-com-pro   [ fail ]  timeout after 900 seconds
solaris9-sparc-64bit-pkgadd   cluster-com-pro   [ fail ]  timeout after 900 seconds
solaris9-sparc-32bit          cluster-com-pro   [ fail ]  timeout after 900 seconds
solaris9-sparc-32bit-pkgadd   cluster-com-pro   [ fail ]  timeout after 900 seconds
solaris10-sparc-32bit         cluster-com-pro   [ fail ]  timeout after 900 seconds
solaris10-sparc-32bit         cluster-gpl       [ fail ]  timeout after 900 seconds
solaris10-sparc-32bit-pkgadd  cluster-com-pro   [ fail ]  timeout after 900 seconds
solaris10-sparc-32bit-pkgadd  cluster-gpl       [ fail ]  timeout after 900 seconds

As the times are in milliseconds, it is quite obvious that the test is in high danger of timeouts on the Solaris-Sparc platforms (and on these only).
It is not a general Solaris issue, on x86 and x86_64 the test is six times faster.

In the general server, this test typically took (on Solaris Sparc)
- between 470 and 600 seconds in 5.1.30,
- between 660 and 750 seconds in 5.1.31,
- between 660 and 850 seconds in 5.1.31sp1,
- between 700 and 850 seconds in 5.1.32,
- upwards of 700 seconds, and timeouts, in 5.1.33,
- upwards of 700 seconds, and timeouts, in 5.1.34,
so there is a clear tendency of rising times.

Before 5.1.33, timeouts are very rare;
from 5.1.33, they happen in roughly 1 of 2 runs on any Solaris-Sparc.

I have not analyzed whether this is because of changes in the test or in the code;
I am not aware of changes in the build options (or on the hosts) that would explain it.

How to repeat:
Just try it ...

Suggested fix:
Check whether the test can be shortened to avoid these timeouts.
[8 Jun 2009 10:07] Sveta Smirnova
Thank you for the report.

Verified as described on 64-bit Solaris 10 SPARC box
[12 Aug 2009 12:45] Joerg Bruehe
Runtime statistics 5.1.29-rc .. 5.1.34 on Linux + Solaris (PDF)

Attachment: times-45299.pdf (application/pdf, text), 62.86 KiB.

[12 Aug 2009 12:46] Joerg Bruehe
raw data for runtime statistics (OOc Calc)

Attachment: times-45299.ods (application/vnd.oasis.opendocument.spreadsheet, text), 26.60 KiB.

[12 Aug 2009 12:54] Joerg Bruehe
See the statistics attached, this is all I can provide for analysis.

The runtime increase from 5.1.30 to 5.1.31 is (at least partly) due to a build tool change, with 5.1.31 we started to have two (2) parallel builds on Solaris:
one for the tarballs, the other for PKG format.

Note there is no increase on Linux, a visible but moderate on Solaris-Intel, and a big one (50 %) on Solaris-Sparc.

The choice of 910 s (910,000 ms) for a run that hit the timeout (900 s) is of course arbitrary.

I unassign myself, I cannot proceed from here.
For any further analysis, development should define some reference test, and I can then help run it on Solaris and other platforms.
[13 Aug 2009 6:21] Sveta Smirnova
Re-set it as "Verified" as all feedback was provided.
[19 Jul 2010 10:48] Alfranio Tavares Correia Junior
The test also fails in other platforms. See http://tinyurl.com/2vyyqzj
[23 Sep 2010 15:17] Mattias Jonsson
Since removing test is against our policy, I'm added --big_test flag as requirement to run this test, and also added a line in mysql-test/collations/default.weekly that run with --big-test flag and increased timeout, so that this test is run on every weekly run (together with some other big tests). Also see bug#56659.

Closing as duplicate of bug#56659.
[9 Nov 2010 19:44] Bugs System
Pushed into mysql-5.5 5.5.7-rc (revid:sunanda.menon@sun.com-20101109182959-otkxq8vo2dcd13la) (version source revid:sunanda.menon@sun.com-20101109182959-otkxq8vo2dcd13la) (merge vers: 5.5.7-rc) (pib:21)
[13 Nov 2010 16:18] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113152450-2zzcm50e7i4j35v7) (merge vers: 5.6.1-m4) (pib:21)
[13 Nov 2010 16:32] Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (pib:21)