Bug #25285 mysqld got signal 6 when try runing the ./mysql-test-run script
Submitted: 26 Dec 2006 16:05 Modified: 18 May 2007 16:55
Reporter: Joerg Behrens
Status: Won't fix
Category:MySQL Server: Tests Severity:S7 (Test Cases)
Version:5.0.27 OS:Other (IRIX 6.5.30)
Assigned to: Magnus Blåudd CPU Architecture:Any
Tags: IRIX, Tests

[26 Dec 2006 16:05] Joerg Behrens
Description:
Running the "mysql-test-run" script against a recent MySQL build from source crashes mysqld, which then restarts.

It doesn't matter whether "./mysql-test-run alias" or "./mysql-test-run --extern alias" is used, and it can be easily reproduced on this machine.
With the --extern option, the test ran against an already started mysqld that uses a configuration based on my-small.cnf.

 
[o2k]:/tmp/mysql5/mysql-test $ ./mysql-test-run alias
Logging: ./mysql-test-run alias
Installing Test Databases
Removing Stale Files
Installing Master Databases
running  ../libexec/mysqld --no-defaults --bootstrap --skip-grant-tables     --basedir=.. --datadir=mysql-test/var/master-data --skip-innodb --skip-ndbcluster --skip-bdb     
Installing Slave Databases
running  ../libexec/mysqld --no-defaults --bootstrap --skip-grant-tables     --basedir=.. --datadir=mysql-test/var/slave-data --skip-innodb --skip-ndbcluster --skip-bdb     
Manager disabled, skipping manager start.
Loading Standard Test Databases
Starting Tests

TEST                            RESULT
-------------------------------------------------------
alias                          [ fail ]

Errors are (from /tmp/mysql5/mysql-test/var/log/mysqltest-time) :
mysqltest: At line 64: query 'SELECT ELT(FIELD(kundentyp,'PP','PPA','PG','PGA','FK','FKA','FP','FPA','K','KA','V','VA',''), 'Privat (Private Nutzung)','Privat (Private Nutzung) Sitz im Ausland','Privat (geschaeftliche Nutzung)','Privat (geschaeftliche Nutzung) Sitz im Ausland','Firma (Kapitalgesellschaft)','Firma (Kapitalgesellschaft) Sitz im Ausland','Firma (Personengesellschaft)','Firma (Personengesellschaft) Sitz im Ausland','oeff. rechtl. Koerperschaft','oeff. rechtl. Koerperschaft Sitz im Ausland','Eingetragener Verein','Eingetragener Verein Sitz im Ausland','Typ unbekannt') AS Kundentyp ,kategorie FROM t1 WHERE hdl_nr < 2000000 AND kategorie IN ('Prepaid','Mobilfunk') AND st_klasse = 'Workflow' GROUP BY kundentyp ORDER BY kategorie' failed: 2013: Lost connection to MySQL server during query
(the last lines may be the most important ones)

Aborting: alias failed in default mode. To continue, re-run with '--force'.

Ending Tests
Shutting-down MySQL daemon

master not cooperating with mysqladmin, will try manual kill
kill: 1222016: no such process
master refused to die. Sending SIGKILL
kill: 1222016: no such process
Master shutdown finished
Slave shutdown finished

The file "var/log/master.err" shows the following entry:
[o2k]:/tmp/mysql5/mysql-test $ cat var/log/master.err 
CURRENT_TEST: alias
InnoDB: The first specified data file ./ibdata1 did not exist:
InnoDB: a new database to be created!
061226 12:29:00  InnoDB: Setting file ./ibdata1 size to 128 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Progress in MB: 100
061226 12:29:05  InnoDB: Log file ./ib_logfile0 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile0 size to 5 MB
InnoDB: Database physically writes the file full: wait...
061226 12:29:05  InnoDB: Log file ./ib_logfile1 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile1 size to 5 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
061226 12:29:06  InnoDB: Started; log sequence number 0 0
061226 12:29:06 [Note] /tmp/mysql5/libexec/mysqld: ready for connections.
Version: '5.0.27-log'  socket: '/tmp/mysql5/mysql-test/var/tmp/master.sock'  port: 9306  Source distribution
mysqld got signal 6;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=1
max_connections=100
threads_connected=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 39423 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Writing a core file
/bin/sh: 1222016 Abort(coredump)

I have attached a backtrace from a core file which was created under "var/master-data/core"

Testing an older 5.0.24 version shows the same behaviour. An old MySQL 4.1.15 version also reports errors, but when running all of the tests it crashed much later, with a slightly different message.

From 4.1.15:
[...]
func_compress                  [ pass ]   
func_concat                    [ pass ]   
func_crypt                     [ fail ]

Errors are (from /usr/nekoware/mysql4/mysql-test/var/log/mysqltest-time) :
mysqltest: At line 7: query 'select length(encrypt('foo', 'ff')) <> 0' failed: 2013: Lost connection to MySQL server during query
(the last lines may be the most important ones)

Re-running the script with "./mysql-test-run func_crypt" reports a similar error. The master.err now shows a "signal 11" message!?

How to repeat:
Compile the latest MySQL 5.0.27 under IRIX 6.5.30 with the MIPSpro 7.4.4m compiler. The environment was based on BUILD/compile-irix-mips64-mipspro, except that a 32-bit version of MySQL was built. The configure options were also reduced to a minimum.

[o2k]:/ $ uname -Ra
IRIX64 o2k 6.5 6.5.30m 07202013 IP27
[o2k]:/ $ cc -version
MIPSpro Compilers: Version 7.4.4m

export CC=c99
export CFLAGS='-O3 -mips4  -I/usr/nekoware/include -OPT:Olimit=0:roundoff=3  -TARG:platform=IP27:proc=r10000'
export CXXFLAGS="$CFLAGS -LANG:exceptions=OFF -LANG:std=OFF -LANG:libc_in_namespace_std=OFF"
export CPPFLAGS='-I/usr/nekoware/include'
export CXX=CC
export F77=f77
export LDFLAGS='-L/usr/nekoware/lib -Wl,-rpath -Wl,/usr/nekoware/lib'
export PKG_CONFIG=/usr/nekoware/bin/pkg-config
export PKG_CONFIG_PATH='/usr/nekoware/lib/pkgconfig'
export PKG_CONFIG_LIBDIR='/usr/nekoware/lib'
export LD_LIBRARYN32_PATH='/usr/nekoware/lib'
export GNUMAKE='/usr/nekoware/bin/make'
export PATH=/usr/nekoware/bin:$PATH

./configure --with-extra-charsets=complex --enable-thread-safe-client --with-unix-socket-path=/usr/nekoware/var/run/mysql5/mysql.sock --without-extra-tools --disable-dependency-tracking --without-readline --prefix=/tmp/mysql5
gmake -j30 && gmake install
chcap "CAP_SCHED_MGT+epi" /tmp/mysql5/libexec/mysqld
/tmp/mysql5/bin/mysql_install_db
[26 Dec 2006 16:06] Joerg Behrens
Backtrace

Attachment: mysql.backtrace (application/octet-stream, text), 4.21 KiB.

[26 Dec 2006 23:33] Sveta Smirnova
Thank you for the report.

Verified as described on 64-bit IRIX
[7 Feb 2007 9:06] Magnus Blåudd
Hmm, the output from that debugger was confusing.

After compiling with debug I could trace it, and the trace shows that the abort occurs when doing an array delete in TMP_TABLE_PARAM::cleanup, so we need to perform a select that creates a tmp table for this to occur.

T@65545: | | | | | | <change_to_use_tmp_fields
T@65545: | | | | | | >JOIN::join_free
T@65545: | | | | | | | >JOIN::cleanup
T@65545: | | | | | | | | >free_io_cache
T@65545: | | | | | | | | <free_io_cache
T@65545: | | | | | | | | info: 1
T@65545: | | | | | | | | info: 2
T@65545: | | | | | | | | >close_cached_file
T@65545: | | | | | | | | <close_cached_file
T@65545: | | | | | | | | >my_free
T@65545: | | | | | | | | | my: ptr: 0x0
T@65545: | | | | | | | | <my_free
T@65545: | | | | | | | | >ha_index_end
T@65545: | | | | | | | | <ha_index_end
T@65545: | | | | | | | | >mi_extra
T@65545: | | | | | | | | | enter: function: 4
T@65545: | | | | | | | | <mi_extra
T@65545: | | | | | | | | info: 4
T@65545: | | | | | | | | info: 5
T@65545: | | | | | | | | info: 6
T@65545: | | | | | | | | info: 7
T@65545: | | | | | | | | info: 9
T@65545: | | | | | | | | >cleanup
T@65545: | | | | | | | | | info: copy_field: 12bb54d0
T@65545: | | | | | | | | | info: deleting copy_field

The function looks like this:
sql_class.h >>
  inline void cleanup(void)
  {
    if (copy_field)				/* Fix for Intel compiler */
    {
      delete [] copy_field; << Crash here
      save_copy_field= copy_field= 0;
    }
  }
<<
[7 Feb 2007 9:10] Magnus Blåudd
The copy_field array is created in at least two different places: one that creates it in thd->mem_root (with a comment about that) and one that does not.

Occurrence 1:
  /* Copy_field belongs to TMP_TABLE_PARAM, allocate it in THD mem_root */
  if (!(param->copy_field= copy= new (thd->mem_root) Copy_field[field_count]))
  {

2:
  if (param->field_count && 
      !(copy=param->copy_field= new Copy_field[param->field_count]))
    goto err2;
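
Note: the difference between the two allocation sites can be illustrated with a minimal sketch, assuming a class that provides its own array new/delete operators. All names here (Pool, PoolAllocated, Field) are hypothetical stand-ins and are not taken from the MySQL source. On a conforming compiler, a plain "delete []" on a pool-allocated array should resolve to the class-scope operator delete[], which has nothing to release.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a per-query memory pool (not MySQL's MEM_ROOT).
struct Pool {
  alignas(std::max_align_t) unsigned char buf[1024];
  std::size_t used = 0;

  void *alloc(std::size_t n) {
    const std::size_t a = alignof(std::max_align_t);
    std::size_t start = (used + a - 1) / a * a;  // round up to the strictest alignment
    if (start + n > sizeof(buf)) return nullptr;
    used = start + n;
    return buf + start;
  }
};

// Hypothetical base class: objects live in a Pool, so "delete" releases nothing.
struct PoolAllocated {
  static int array_deletes;  // counts calls, for demonstration only

  static void *operator new[](std::size_t n, Pool &pool) noexcept {
    return pool.alloc(n);
  }
  // Matching placement form, used only if a constructor throws.
  static void operator delete[](void *, Pool &) noexcept {}
  // Usual form, selected by a plain "delete [] p" expression.
  static void operator delete[](void *, std::size_t) noexcept { ++array_deletes; }
};
int PoolAllocated::array_deletes = 0;

struct Field : PoolAllocated {
  int value = 0;
};

// Allocates an array in the pool, array-deletes it, and reports how many
// times the class-scope operator delete[] has been called so far.
int demo_pool_array_delete() {
  Pool pool;
  Field *fields = new (pool) Field[4];
  assert(fields != nullptr);
  delete[] fields;  // should reach PoolAllocated::operator delete[], not the heap
  return PoolAllocated::array_deletes;
}
```

Calling demo_pool_array_delete() once from main() is enough to exercise the path. If the runtime bypasses the class-scope operator and treats the pointer as a heap array, the kind of abort seen in this report would be the expected outcome.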
[7 Feb 2007 13:20] Magnus Blåudd
Removing the "delete [] copy_field" will "fix" the problem. Will see if we might actually have this on all platforms, would be strange otherwise.
[7 Feb 2007 14:31] Magnus Blåudd
Removing the "delete[]" and running with valgrind does not show a memory leak. That indicates that, since copy_field is allocated on thd->mem_root, it is freed automatically and the call to "delete[]" is not necessary.
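
Note: the reasoning above (pool memory is released in bulk, so no per-array delete is needed) can be sketched as follows. This is a simplified illustration under stated assumptions, not MySQL's allocator: Arena and CopySlot are hypothetical names, and the arena simply remembers each malloc'ed block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical arena: every block is remembered and released in one sweep.
class Arena {
 public:
  ~Arena() { free_all(); }

  void *alloc(std::size_t n) {
    void *p = std::malloc(n);  // malloc memory is suitably aligned for CopySlot
    if (p) blocks_.push_back(p);
    return p;
  }

  // Analogous to releasing the pool at the end of a query or statement.
  void free_all() {
    for (void *p : blocks_) std::free(p);
    blocks_.clear();
  }

  std::size_t live_blocks() const { return blocks_.size(); }

 private:
  std::vector<void *> blocks_;
};

struct CopySlot {  // hypothetical element type with a trivial destructor
  int from;
  int to;
};

// Builds an array inside the arena and returns the number of blocks still
// owned by the arena after the bulk release.
std::size_t demo_arena_bulk_free() {
  Arena arena;
  void *raw = arena.alloc(8 * sizeof(CopySlot));
  assert(raw != nullptr);
  CopySlot *slots = static_cast<CopySlot *>(raw);
  for (int i = 0; i < 8; ++i) new (slots + i) CopySlot{i, i + 1};
  assert(arena.live_blocks() == 1);
  arena.free_all();  // no "delete []" anywhere
  return arena.live_blocks();
}
```

The function returns 0 because the arena alone decides when the memory goes away, which matches the valgrind observation that dropping the "delete[]" leaks nothing for pool-allocated arrays.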
[7 Feb 2007 18:58] Pete Plank
Just a note that I've confirmed that removing 'delete [] copy_field' has resolved the crashing problems I've experienced under IRIX since 'mysql-5.0.12-beta'. I now have an IRIX build of mysql-5.0.33 up and running in a web environment without issue; this was not possible previously. My build system is running IRIX64 6.5.30/MIPSpro 7.4.4m.
[8 Feb 2007 10:22] Magnus Blåudd
Yeah! 

I ran our whole test suite under valgrind yesterday with that fix, and it shows some really complicated queries where we allocate "copy_field" a second time, so the first allocation is lost. I haven't yet figured out whether we can detect this case and only "delete[]" in that case. Need to investigate more.

But it's a minor and not very probable memory leak.
[23 Mar 2007 7:59] Pete Plank
Confirmed patch working correctly on mysql-5.0.37 under IRIX 6.5.30/MIPSPro 7.4.4m.
[7 May 2007 18:45] Magnus Blåudd
Starting to get pretty sure that we fail to invoke sql_alloc's operator delete[]

Thread 0x10002
<snip>
   10 abort(0x15, 0x4, 0xffff, 0x0, 0x11148b70, 0xa800000020196c38, 0xe0a1, 0x0) ["/xlv46/6.5.25m/work/irix/lib/libc/libc_64_M4/gen/abort.c":44, 0xda27db0]
   11 ::_array_pointer_not_from_vec_new(0x15, 0x4, 0xffff, 0x0, 0x11148b70, 0xa800000020196c38, 0xe0a1, 0x0) ["/j10/mtibuild/v741m/workarea/v7.4.1m/libC/lang_support/vec_newdel.cxx":862, 0x32eb68c]
   12 ::__array_delete_general2(0x111305d0, 0xffffffffffffffff, 0xffff, 0x103e7310, 0x11148b70, 0xa800000020196c38, 0xe0a1, 0x0) ["/j10/mtibuild/v741m/workarea/v7.4.1m/libC/lang_support/vec_newdel.cxx":342, 0x32eb3ac]
   13 ::__array_delete2(0x15, 0x4, 0xffff, 0x0, 0x11148b70, 0xa800000020196c38, 0xe0a1, 0x0) ["/j10/mtibuild/v741m/workarea/v7.4.1m/libC/lang_support/vec_newdel.cxx":762, 0x32eb558]
   14 TMP_TABLE_PARAM::cleanup(void)(this = 0x1112e320) ["/usr/people/mysqldev/my50-bug25285-octane2/sql/sql_class.h":2034, 0x103f18d0]
   15 JOIN::cleanup(bool)(this = 0x1112d238, full = true (1)) ["/usr/people/mysqldev/my50-bug25285-octane2/sql/sql_select.cc":6403, 0x103fb064]
   16 JOIN::join_free(void)(this = 0x1112d238) ["/usr/people/mysqldev/my50-bug25285-octane2/sql/sql_select.cc":6279, 0x103fa9c8]
   <snip>
[8 May 2007 15:02] Magnus Blåudd
Test case for reproducing segfault with placement array new

Attachment: test_new_delete.cc (text/x-c++src), 4.50 KiB.

[8 May 2007 15:36] Magnus Blåudd
Uploaded test case that shows the use of class scope new and delete operations. The program will crash in the same way as mysqld does when compiled with MIPSpro 7.41.

According to the release notes at http://sgi.tuwien.ac.at/relnotes/2nd.cgi/irix/MIPSpro_C++_Comp_7.4/relnotes/c++_fe there is a bug, number "862202 Delete of array allocated with new(nothrow)[].", that shows the exact same symptoms. That bug appears not to have been fixed; instead a workaround is available that involves removing the destructor of the class that defines the "operator new/delete".
[8 May 2007 15:37] Magnus Blåudd
If anyone has a newer compiler please test and file a bug report with SGI using the above test program.
[8 May 2007 15:46] Joerg Behrens
[o3k]:~ $ CC -o mysqltest test_new_delete.cc
cc-3970 CC: WARNING File = test_new_delete.cc, Line = 21
  conversion from pointer to same-sized integral type (potential portability
          problem)

        printf("  allocated pool: 0x%lx\n", (unsigned long)m_data);
                                            ^

cc-3333 CC: WARNING File = test_new_delete.cc, Line = 77
  Support for placement delete is disabled.

    static void operator delete(void *ptr, void *root)
                ^
and so on... I don't know what "Support for placement delete is disabled" means. I have never seen this before.

[o3k]:~ $ ./mysqltest
Base::Base, this: 0x7ffeb994
[...]
Base::Base, this: 0x7ffeba59
Abort (core dumped)

[o3k]:~ $ dbx ./mysqltest core
dbx version 7.3.7 (96015_Nov16 MR) Nov 16 2004 07:34:16
Core from signal SIGABRT: Abort (see abort(3c))
(dbx) where
>  0 _kill(0x1afba8, 0x6, 0x85ef0, 0x0, 0x0, 0xfb56c34, 0x0, 0x0) ["/xlv41/6.5.30m/work/irix/lib/libc/libc_n32_M4/signal/kill.s":15, 0xfa54418]
   1 _raise(0x1afba8, 0x6, 0x85ef0, 0x0, 0x0, 0xfb56c34, 0x0, 0x0) ["/xlv41/6.5.30m/work/irix/lib/libc/libc_n32_M4/signal/raise.c":27, 0xfad1f6c]
   2 abort(0x1afba8, 0x6, 0x85ef0, 0x0, 0x0, 0xfb56c34, 0x0, 0x0) ["/xlv41/6.5.30m/work/irix/lib/libc/libc_n32_M4/gen/abort.c":52, 0xfa6f6b0]
   3 ::_array_pointer_not_from_vec_new(0x1afba8, 0x6, 0x85ef0, 0x0, 0x0, 0xfb56c34, 0x0, 0x0) ["/j7/mtibuild/v744/workarea/v7.4.4m/libC/lang_support/vec_newdel.cxx":862, 0xace729c]
   4 ::__array_delete_general2(0x7ffeba50, 0xffffffff, 0x85ef0, 0x100028e0, 0x0, 0xfb56c34, 0x0, 0x0) ["/j7/mtibuild/v744/workarea/v7.4.4m/libC/lang_support/vec_newdel.cxx":342, 0xace6fbc]
   5 ::__array_delete2(0x1afba8, 0x6, 0x85ef0, 0x0, 0x0, 0xfb56c34, 0x0, 0x0) ["/j7/mtibuild/v744/workarea/v7.4.4m/libC/lang_support/vec_newdel.cxx":762, 0xace7168]
   6 ::main(0x1afba8, 0x7ffeb994, 0x85ef0, 0x0, 0x0, 0xfb56c34, 0x0, 0x0) ["/usr/people/beh/test_new_delete.cc":197, 0x10001e5c]
   7 __start() ["/xlv55/kudzu-apr12/work/irix/lib/libc/libc_n32_M4/csu/crt1text.s":177, 0x100014b8]
(dbx)

[o3k]:~ $ CC -version
MIPSpro Compilers: Version 7.4.4m

which is the latest version of the mips pro compiler.

regards
Joerg
[8 May 2007 16:41] Magnus Blåudd
Forgot to add compile instructions. Apparently "support for placement delete" is disabled if you have exceptions turned on.

mysqldev@octane2:~/my50-bug25285-octane2/tests> CC   -no_exceptions -64 test_new_delete.cc -o bug25285
[8 May 2007 16:45] Magnus Blåudd
I was hoping to be able to patch this by not having an empty destructor on the base class. But it seems that not even the class you want to array delete can have a destructor. :(
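
Note: as general background on why a destructor can change array new/delete behaviour, many C++ implementations store extra bookkeeping (often an element count) in front of an array whose elements have a non-trivial destructor. The sketch below only measures how many bytes an array new-expression requests. It makes no claim about MIPSpro internals, and the names Recorder, Plain and WithDtor are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

static std::size_t last_request = 0;  // bytes most recently asked of operator new[]

// Hypothetical base class that records the size passed to operator new[].
struct Recorder {
  static void *operator new[](std::size_t n) {
    last_request = n;
    void *p = std::malloc(n);
    if (!p) throw std::bad_alloc();
    return p;
  }
  static void operator delete[](void *p) { std::free(p); }
};

struct Plain : Recorder {
  int v;  // trivial destructor
};

struct WithDtor : Recorder {
  int v;
  ~WithDtor() {}  // user-provided, therefore non-trivial
};

// Returns the bytes requested beyond count * sizeof(T) for "new T[count]".
template <class T>
std::size_t array_overhead(std::size_t count) {
  T *p = new T[count];
  std::size_t extra = last_request - count * sizeof(T);
  delete[] p;
  return extra;
}
```

On common ABIs array_overhead<Plain>(10) is typically zero and array_overhead<WithDtor>(10) is typically the size of one size_t, but the exact values are implementation-defined. A runtime that relies on such bookkeeping has to find it again at "delete []" time, which is one plausible reason the presence of a destructor matters here.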
[13 May 2007 9:28] Joerg Behrens
Now I got something different... A SIGBUS instead of the SIGABRT.

CC   -no_exceptions -64 test_new_delete.cc -o bug25285

./bug25285
Base::new[](26)
Bus error (core dumped)

[o3k]:~ $ dbx ./bug25285 core
dbx version 7.3.7 (96015_Nov16 MR) Nov 16 2004 07:34:16
Core from signal SIGBUS: Bus error
(dbx) where
>  0 ::__array_new_general2(0x100076f4, 0xa, 0x1, 0x10002d70, 0x1540, 0x0, 0xffffffffffffffff, 0xb30) ["/j7/mtibuild/v744/workarea/v7.4.4m/libC/lang_support/vec_newdel.cxx":285, 0x2b7b500]
   1 ::__array_new2(0x4a, 0x10, 0xfffffffffffffff5, 0x100096c8, 0x1540, 0x0, 0xffffffffffffffff, 0xb30) ["/j7/mtibuild/v744/workarea/v7.4.4m/libC/lang_support/vec_newdel.cxx":527, 0x2b7b60c]
   2 ::main(0x4a, 0xfffffff3788, 0xfffffffffffffff5, 0x100096c8, 0x1540, 0x0, 0xffffffffffffffff, 0xb30) ["/usr/people/beh/test_new_delete.cc":192, 0x100021d4]
   3 __start() ["/xlv55/kudzu-apr12/work/irix/lib/libc/libc_64_M4/csu/crt1text.s":177, 0x100018e8]

regards
Joerg
[14 May 2007 11:11] Magnus Blåudd
Modified test with memory aligned to 64 bit boundary

Attachment: test_new_delete.cc (text/x-c++src), 4.57 KiB.

[14 May 2007 11:14] Magnus Blåudd
Sorry about that, it's caused by memory not being properly aligned.
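
Note: a minimal sketch of the alignment fix, assuming a simple bump allocator over a fixed buffer. The attached test_new_delete.cc is not reproduced here; AlignedPool, align_up and is_aligned are hypothetical names. Each block's starting offset is rounded up to an 8-byte boundary before it is handed out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round n up to the next multiple of a; a must be a power of two.
constexpr std::size_t align_up(std::size_t n, std::size_t a) {
  return (n + a - 1) & ~(a - 1);
}

// Hypothetical bump allocator that hands out 8-byte aligned blocks.
struct AlignedPool {
  alignas(8) unsigned char data[256];
  std::size_t used = 0;

  void *alloc(std::size_t n) {
    std::size_t start = align_up(used, 8);
    if (start + n > sizeof(data)) return nullptr;
    used = start + n;
    return data + start;
  }
};

bool is_aligned(const void *p, std::size_t a) {
  return reinterpret_cast<std::uintptr_t>(p) % a == 0;
}

// Two allocations in a row: the first size is not a multiple of 8, so the
// second block would start at offset 26 without the rounding step.
bool demo_aligned_pool() {
  AlignedPool pool;
  void *first = pool.alloc(26);
  void *second = pool.alloc(16);
  assert(first && second);
  return is_aligned(first, 8) && is_aligned(second, 8);
}
```

Without the align_up call the second block would begin at a misaligned address, and on strict-alignment hardware a 64-bit load or store there can raise SIGBUS.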

There is a new version of the program that has been tested successfully on 64-bit Solaris, like this:
mysqldev@sol10-sparc-a:~/users/magnus> CC -xtarget=ultra -xarch=v9 /users/msvensson/test_new_delete.cc -o bug25285

When running it on IRIX I get the same crash as before.

Trying to file this as a compiler bug with IRIX, I need to send a fax... What is a fax? :=)
[14 May 2007 11:15] Magnus Blåudd
s/IRIX/sgi/
[14 May 2007 11:26] Joerg Behrens
> When running it on IRIX I get the same crash
> as before.

I can also confirm this with the latest MIPSpro.

> Trying to file this as a compiler bug with IRIX,
> I need to send a fax... What is a fax? :=)

Don't forget to put a link to http://en.wikipedia.org/wiki/IRIX on your fax, otherwise the SGI support person won't know what you're talking about. :)

I'll start investigating #862202 now.

regards
Joerg
[14 May 2007 11:37] Joerg Behrens
#862202 is also listed as a known problem in the release notes of MIPSpro 7.4.4m. http://www.irixworld.net/cgi-bin/infosrch.cgi?cmd=getdoc&db=localrelnotes&locale=C&coll=06...
[18 May 2007 16:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/27012

ChangeSet@1.2493, 2007-05-18 18:54:58+02:00, msvensson@pilot.blaudden +6 -0
  Bug#25285 mysqld got signal 6 when try runing the ./mysql-test-run script
   - The problem is with "array delete" of "class scope placement
     new array" allocated memory in code compiled by "MIPSpro Compilers: Version 7.4.4m".
     That functionality is fortunately used sparsely in MySQL and thus a
     simple "comment it out" will function as a workaround. 
  
     Since MySQL allocates memory in memory pools which will be freed at
     end of query or statement this will not lead to any major memory leak. 
     The memory will be freed, it's just that the destructor will not be run. Theoretically that shouldn't matter unless the class has allocated some resource outside of the memory pool.
  
     This patch will not be pushed to the source repo of MySQL, the users
     of MySQL on IRIX will need to apply it themselves.
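
Note: the caveat in the changeset comment (the memory is freed but the destructor is not run) can be illustrated with a small sketch. It assumes a hypothetical Holder class that acquires something outside the pool, represented here by a plain counter. None of these names come from the MySQL source.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

static int open_resources = 0;  // stands in for anything held outside the pool

// Hypothetical class whose constructor acquires an outside resource.
struct Holder {
  Holder() { ++open_resources; }
  ~Holder() { --open_resources; }
};

// Builds one Holder in pool-like storage, optionally runs its destructor,
// releases the storage, and returns how many resources remain open.
int demo_skipped_destructor(bool run_destructor) {
  open_resources = 0;
  void *pool = std::malloc(sizeof(Holder));  // hypothetical one-object "pool"
  assert(pool != nullptr);
  Holder *h = new (pool) Holder;
  if (run_destructor) h->~Holder();
  std::free(pool);  // the memory itself is always returned
  return open_resources;
}
```

With run_destructor set to false the storage is still released, but the counter stays at 1. That is the case the comment describes: harmless for classes that only use pool memory, a leak for classes that hold anything else.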