Bug #51691 RQG backtraces from corefiles no longer working on unix
Submitted: 3 Mar 2010 12:06
Modified: 5 Mar 2010 7:09
Reporter: John Embretsen
Status: Closed
Category: Tools: Random Query Generator
Severity: S2 (Serious)
Version:
OS: Any (non-windows)
Assigned to: John Embretsen
CPU Architecture: Any
Tags: rqg_pb2

[3 Mar 2010 12:06] John Embretsen
Description:
From reviewing RQG logs in Pushbuild it seems that backtracing functionality based on core dumps (e.g. upon server crashes or deadlocks) no longer works on unix platforms. It works well on Windows.

Solaris:
------------------------------------------------
# 12:28:01 Server crash reported, initiating post-crash analysis...
(...)
Writing a core file
# 12:28:01 datadir is /export/home/pb2/test/sb_1-1502248-1267614267.39/mysql-5.5.99-m3-solaris10-sparc-64bit-test/vardirs/master-data/
# 12:28:01 binary is 
# 12:28:01 bindir is 
# 12:28:01 core is /export/home/pb2/test/sb_1-1502248-1267614267.39/mysql-5.5.99-m3-solaris10-sparc-64bit-test/vardirs/master-data//core
dbx: File `/export/home/pb2/test/sb_1-1502248-1267614267.39/mysql-5.5.99-m3-solaris10-sparc-64bit-test/vardirs/master-data//core' is not executable
------------------------------------------------

Linux:
------------------------------------------------
# 18:47:00 Server deadlock reported, initiating analysis...
# 18:47:00 Killing mysqld with pid 23248 with SIGHUP in order to force debug output.
# 18:47:02 Killing mysqld with pid 23248 with SIGSEGV in order to capture core.
(...)
Writing a core file
# 18:47:22 datadir is /export/home/pb2/test/sb_1-1495034-1267551763.51/mysql-5.5.99-m3-linux-i686-test/vardirs/master-data/
# 18:47:22 binary is 
# 18:47:22 bindir is 
# 18:47:22 core is /export/home/pb2/test/sb_1-1495034-1267551763.51/mysql-5.5.99-m3-linux-i686-test/vardirs/master-data//core.23248
: No such file or directory.
# 18:47:23 (no debugging symbols found)
# 18:47:23 Using host libthread_db library "/lib/libthread_db.so.1".
# 18:47:23 Core was generated by `/export/home/pb2/test/sb_1-1495034-1267551763.51/mysql-5.5.99-m3-linux-i686-tes'.
# 18:47:23 Program terminated with signal 11, Segmentation fault.
# 18:47:23 #0  0x0040a402 in __kernel_vsyscall ()
# 18:47:23 #0  0x0040a402 in __kernel_vsyscall ()
# 18:47:23 #1  0x0089f067 in ?? ()
# 18:47:23 #2  0x0000002f in ?? ()
# 18:47:23 #3  0xbfb5d958 in ?? ()
# 18:47:23 #4  0x0863d449 in ?? ()
(...)
------------------------------------------------
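
In both excerpts above, the "binary is" and "bindir is" lines are empty, which is the visible symptom that mysqld was not located. A minimal shell sketch for spotting that symptom in a saved log follows. The function name binary_unresolved is hypothetical, and the pattern assumes the log line layout shown above (a "# HH:MM:SS" prefix, with the path following "is" when the lookup succeeds).

```shell
# Hypothetical helper: succeed (exit status 0) if the given RQG log contains
# a "binary is" or "bindir is" line with nothing after it, i.e. the
# Backtrace Reporter did not resolve the mysqld location.
# Assumption: log lines look like "# 12:28:01 binary is <path>".
binary_unresolved() {
    logfile=$1
    grep -Eq '^# [0-9:]+ (binary|bindir) is[[:space:]]*$' "$logfile"
}
```

For example, `binary_unresolved rqg.log && echo "mysqld was not located"`, where rqg.log is a placeholder for wherever the RQG output was saved.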

How to repeat:
Run the RQG with a setup that is likely to crash or deadlock the MySQL server. Currently, such a situation is likely with e.g.

a) The test rqg_info_schema against mysql-6.0-codebase-bugfixing (crash, Bug#50381)

 export CODE=/path/to/binaries
 bzr branch lp:randgen
 cd randgen
 perl ./pb2gentest.pl \
   $CODE \
   $CODE/mysql-test/vardir \
   6.0-cb-bugfix \
   rqg_info_schema

b) The test rqg_mdl_deadlock against mysql-next-4284 or mysql-next-mr (crash, Bug#51377)

 export CODE=/path/to/binaries
 bzr branch lp:randgen
 cd randgen
 perl ./pb2gentest.pl \
   $CODE \
   $CODE/mysql-test/vardir \
   6.0-cb-bugfix \
   rqg_mdl_deadlock

Suggested fix:
It looks like the RQG is not finding the correct location of the binaries. Perhaps something changed with the introduction of the cmake build system (WL#5161), which is not used in Pushbuild. Further analysis required.
[3 Mar 2010 12:11] Philip Stoev
Bernt, can you please fix this? I did not realize that the binary location has changed, and the Backtrace Reporter does employ some logic to find where mysqld is.

I hope the server management API would make this a no-brainer. It may also be a good idea to add a Hudson test for it, e.g. kill the mysqld with some signal and then grep the RQG error log, as produced by the Backtrace Validator (without the ErrorLog Validator), for handle_connections_sockets or some other basic mysqld function that is always present in the backtrace when you kill the server.
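
The grep step of such a test could be sketched roughly as below. This is only an illustration: the function name backtrace_has_symbol is hypothetical, and how the server gets killed and where the RQG log ends up are left to the test setup.

```shell
# Hypothetical check for a regression test: succeed only if the RQG log
# contains the name of a function expected in every mysqld backtrace.
# The symbol defaults to handle_connections_sockets, as suggested above;
# a different one can be passed as the second argument.
backtrace_has_symbol() {
    logfile=$1
    symbol=${2:-handle_connections_sockets}
    # -F: match the symbol as a fixed string; -q: report via exit status only
    grep -qF -- "$symbol" "$logfile"
}
```

A log that only shows unresolved frames such as "in ?? ()" would fail this check, which is the situation reported in this bug.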
[3 Mar 2010 12:19] John Embretsen
Typo: In the "Suggested fix" comment, replace "is not used" with "is now used" :)
[3 Mar 2010 12:25] Bernt Marius Johnsen
Easy to fix if someone could deliver a spec on where the mysqld binary is. It is now searched for like this (relative to basedir):

	foreach my $path (
			'bin', 'sql', 'libexec',
			'../bin', '../sql', '../libexec',
			'../sql/RelWithDebInfo', '../sql/Debug',
		) {
[3 Mar 2010 12:36] John Embretsen
I just remembered that after the initial cmake merge, the location of mysqld in Pushbuild builds had changed on unix platforms from libexec/ to sbin/.

Bernt showed me where to make the change in the RQG, so I'll apply a fix and see if it helps.
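
The idea behind the fix, searching sbin/ in addition to the existing candidate directories, can be illustrated with a small shell sketch. This is not the RQG's actual code (that is Perl, quoted earlier in this report). The function name find_mysqld is hypothetical, and the exact placement of sbin and ../sbin in the list is an assumption.

```shell
# Hypothetical helper: print the path of the first executable mysqld found
# in a list of candidate directories relative to a base directory, or
# return 1 if there is none. The candidates mirror the Perl list quoted
# earlier, with sbin and ../sbin added (assumed) for cmake-based builds.
find_mysqld() {
    basedir=$1
    for dir in bin sbin sql libexec \
               ../bin ../sbin ../sql ../libexec \
               ../sql/RelWithDebInfo ../sql/Debug; do
        candidate="$basedir/$dir/mysqld"
        # Require a regular file, since -x alone is also true for directories
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

With the setup from "How to repeat", `find_mysqld "$CODE"` would print the binary's path if it sits in one of these directories, and print nothing otherwise.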
[5 Mar 2010 7:09] John Embretsen
I have just verified that stacktraces from coredumps in Pushbuild are back in business (mysql-next-4284 branch). Closing this issue.