Bug #53187 RQG sometimes reports false deadlocks on windows
Submitted: 27 Apr 2010 8:01 Modified: 9 Jan 2015 16:21
Reporter: John Embretsen Email Updates:
Status: Closed Impact on me:
None 
Category:Tools: Random Query Generator Severity:S2 (Serious)
Version: OS:Windows
Assigned to: Bernt Marius Johnsen CPU Architecture:Any
Tags: rqg_pb2

[27 Apr 2010 8:01] John Embretsen
Description:
On Windows some multi-thread workloads sometimes result in the RQG reporting deadlocks, although there does not seem to be a real deadlock in effect.

RQG relevant output:

# 2010-04-27T01:22:02 Entire-server deadlock detected.
# 2010-04-27T01:23:22 Entire-server deadlock detected.
# 2010-04-27T01:23:22 gentest.pl exited with exit status 9
2010-04-27T01:23:22 [2272] runall.pl will exit with exit status 9

RQG may be declaring a deadlock situation because the CPU is saturated and no connection could be made before timing out (20s).

PB2 test: rqg_myisam_stress

Grammar: randgen/conf/engines/maria/maria_stress.yy

See http://bazaar.launchpad.net/~randgen/randgen/rqg2/annotate/347.1.3/conf/engines/maria/mari...

Philip noted about this issue:

"CPU usage is at 100% and I do not know if the abnormal slowness is a OS bug or a manifestation of a hyper-efficient OS that is capable of saturating all available CPUs in their entirety. It is also possible that it is because it is two non-native-windows applications (perl and mysql) that are battling it out, each one with its own home-grown threading mechanism."

Also, a test using the engines/tiny_inserts.yy grammar with MyISAM as storage engine shows similar symptoms on Windows. This grammar is simply:

query:
	INSERT INTO _table VALUES ( _bit );

How to repeat:
Referring to a MySQL binary basedir as %CODE%. 
"\" at end-of-line means that the line continues below.

bzr branch lp:randgen
cd randgen

perl runall.pl \ 
--mysqld=--loose-skip-safemalloc \ 
--grammar=conf/engines/maria/maria_stress.yy \ 
--reporters=Deadlock,ErrorLog,Backtrace,Shutdown \ 
--duration=600 \ 
--basedir=%CODE% \ 
--mysqld=--log-output=file \ 
--queries=100000 \ 
--engine=myisam \
--threads=10
[27 Apr 2010 8:03] John Embretsen
Output from test run where this issue was seen.

Attachment: bug53187_testlog.txt (text/plain), 53.69 KiB.

[27 Apr 2010 8:52] Philip Stoev
What happens here is that perl and mysqld between them consume 100% CPU time, leaving nothing for any other processes. (Whether this is a bug in the OS or an example of a hyper-efficient OS is not relevant.)

In particular, the windows console and the mysql client become extremely slow, meaning that it is possible that a bogus deadlock failure is reported if a new client can not be connected in time.

The problem goes away if either perl or mysqld is set to a lower-priority using the Task Manager. Maybe MySQLd.pm can be modified to start mysqld with a lower priority using the Windows API functions provided for the purpose.

Also present on Windows XP, so very likely affects all Windows versions.
[28 Apr 2010 11:54] John Embretsen
I tried running the myisam_stress test a few times, using the new RQG DBServer module for starting the server (with improved process handling on Windows) instead of using MTR.

It did not seem to help much. I have seen the same kind of deadlock result, and CPU usage is about the same.

Also, sometimes the Perl process seems to crash with no error message whatsoever. I have also seen this message:

"Free to wrong pool 9abdb00 not 96424a8 at lib/GenTest/Executor/MySQL.pm line 528."
[5 Oct 2010 10:04] Bernt Marius Johnsen
We might have lost a Thread due to some Perl problems here:

panic: corrupt saved stack index at C:/strawberry/perl/lib/threads.pm line 101.

Do you see the same on Pushbuild machines?
[21 Oct 2010 14:15] Bernt Marius Johnsen
Yes. I need that information.
[6 Nov 2010 0:06] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[9 Nov 2010 7:38] John Embretsen
Test output from possibly fake deadlock on Win2003.

Attachment: bug53187_log.txt (text/plain), 13.54 KiB.

[9 Nov 2010 7:41] John Embretsen
After 3 attempts I was able to repeat on loki06 (Windows 2003, ActiveState Perl 5.10.0), using mysql-5.5-bugteam as of Nov 08 and the RQG as of Nov 08. No Perl panic or other Perl issues observed this time (may be arbitrary). The test output has been attached. 

I will try some more while trying to get rid of the unrelated messages about not finding various binaries for debugging (cdb, tail).