Bug #57720 InnoDB crashes at startup in Windows Vista/7 in os0sync.c; os_cond_wait_timed()
Submitted: 25 Oct 2010 20:32
Modified: 14 Dec 2010 19:17
Reporter: Kevin Lewis
Status: Closed
Category: MySQL Server: InnoDB storage engine
Severity: S3 (Non-critical)
Version: mysql-5.5-innodb
OS: Microsoft Windows (Vista & 7)
Assigned to: Kevin Lewis
CPU Architecture: Any

[25 Oct 2010 20:32] Kevin Lewis
Description:
A recent change added a 1 second timeout to the lower level use of CONDITION_VARIABLE on Windows, and an assert is now being hit.  The assert fires when SleepConditionVariableCS() fails with any error other than WAIT_TIMEOUT on Windows Vista.  Note that I reproduced this within a Vista VM running on Mac OS X 10.5.

The error returned from get_last_error() is ERROR_TIMEOUT.  From http://msdn.microsoft.com/en-us/library/ms686301%28VS.85%29.aspx, "Condition variables are subject to spurious wakeups (those not associated with an explicit wake) and stolen wakeups (another thread manages to run before the woken thread)." This ERROR_TIMEOUT must be a result of a 'spurious' timeout.  As such, it should be handled just like a normal timeout.
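
For illustration only (this is not part of the patch): treating an unexplained wakeup as a timeout is safe when the caller re-tests its own condition after every return from the wait, because an early return then costs just one more pass through the loop. A minimal sketch in portable C with no Windows calls; sketch_state, sketch_fake_wait() and sketch_wait_until_ready() are hypothetical names invented here, not InnoDB code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state shared between the waiter and the fake wait. */
struct sketch_state {
	int	calls;		/* number of times the wait has returned */
	int	ready_after;	/* call on which the condition becomes true */
	bool	ready;		/* the condition the caller is waiting for */
};

/* Stand-in for a timed wait.  Returns true ("timed out") on every call
before the condition becomes true, and false on the call that sets it. */
static bool sketch_fake_wait(struct sketch_state* s)
{
	assert(s != NULL);

	s->calls++;

	if (s->calls >= s->ready_after) {
		s->ready = true;
		return(false);
	}

	return(true);
}

/* Caller-side loop: the result of the wait is ignored and the condition
itself decides when to stop.  Returns the number of waits performed. */
static int sketch_wait_until_ready(struct sketch_state* s)
{
	assert(s != NULL);

	while (!s->ready) {
		(void) sketch_fake_wait(s);
	}

	return(s->calls);
}
```

With ready_after set to 3, the loop goes around three times regardless of what each individual wait reported.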

How to repeat:
Run mysql-test-run --suite=innodb on Windows Vista with mysql-5.5-innodb.

Suggested fix:
In os0sync.c; os_cond_wait_timed()

	ret = sleep_condition_variable(cond, mutex, time_in_ms);

+	if (!ret) {
+		last_error = GetLastError();
+		if (last_error == WAIT_TIMEOUT) {
+			return(TRUE);
+		}
+		/* From http://msdn.microsoft.com/en-us/library/ms686301%28VS.85%29.aspx,
+		"Condition variables are subject to spurious wakeups
+		(those not associated with an explicit wake) and stolen wakeups
+		(another thread manages to run before the woken thread)."
+		Act like it was a real timeout.  Conditions are checked by the caller. */
+		if (last_error == ERROR_TIMEOUT) {
+			return(TRUE);
+		}
+	}

Or the long explanation could be skipped and ERROR_TIMEOUT could be checked on the same line as WAIT_TIMEOUT.
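
A sketch of what that shorter form might look like, kept separate from the real function so it builds without <windows.h>. The SKETCH_ constants repeat the numeric values that winerror.h gives WAIT_TIMEOUT (258) and ERROR_TIMEOUT (1460), and sketch_wait_timed_out() is a hypothetical helper name, not something in os0sync.c. The assert for any other error code is an assumption about how the surrounding code behaves.

```c
#include <assert.h>
#include <stdbool.h>

/* Numeric values of the Win32 codes from winerror.h, redeclared under
SKETCH_ names so that this example does not need <windows.h>. */
#define SKETCH_WAIT_TIMEOUT	258UL
#define SKETCH_ERROR_TIMEOUT	1460UL

/* Hypothetical helper.  ret is the BOOL-style result of the sleep call
and last_error is the code read right after it failed.  Returns true if
the wait should be reported as timed out. */
static bool sketch_wait_timed_out(int ret, unsigned long last_error)
{
	if (!ret) {
		/* Both codes are checked on the same line. */
		if (last_error == SKETCH_WAIT_TIMEOUT
		    || last_error == SKETCH_ERROR_TIMEOUT) {
			return(true);
		}

		/* Assumed behavior: any other failure is unexpected. */
		assert(0 && "unexpected error from timed condition wait");
	}

	return(false);
}
```

In the real function, ret would come from sleep_condition_variable() and last_error from GetLastError().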
[26 Oct 2010 19:28] Kevin Lewis
patch for mysql-5.5-innodb

Attachment: Bug57720.patch (application/octet-stream, text), 2.30 KiB.

[26 Oct 2010 21:23] Kevin Lewis
pushed to mysql-5.5-innodb
[13 Nov 2010 16:05] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113152450-2zzcm50e7i4j35v7) (merge vers: 5.6.1-m4) (pib:21)
[13 Nov 2010 16:30] Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:alexander.nozdrin@oracle.com-20101113152540-gxro4g0v29l27f5x) (pib:21)
[1 Dec 2010 2:11] Kevin Lewis
I am not sure what conditions would cause ERROR_TIMEOUT instead of WAIT_TIMEOUT.  In all the testing I did on Vista, ERROR_TIMEOUT was returned consistently instead of WAIT_TIMEOUT after the newly introduced sleep_condition_variable() call failed.  This new code originally tested only for WAIT_TIMEOUT and hit an assert, or crashed, on any other error.  So the problem probably was not limited to debug code.

Note that this was a fix for a patch in which Vlad introduced the use of CONDITION_VARIABLE on Windows Vista and Windows 7 into the 5.5 codebase.  I think Calvin checked the patch into InnoDB.  I do not believe that this bug was ever released though.  It probably only existed in the 5.5 branch for a few weeks.
[2 Dec 2010 19:19] Calvin Sun
John - yes, it is a crash at startup; and not debug binary only.
[14 Dec 2010 19:17] John Russell
Added to changelog:

The server could stop with an assertion error on Windows Vista and Windows 7 systems.
[16 Dec 2010 22:26] Bugs System
Pushed into mysql-5.5 5.5.9 (revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (version source revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (merge vers: 5.5.9) (pib:24)