Bug #43001 maria.maria-no-logging fails sporadically on valgrind in PS mode
Submitted: 19 Feb 2009 7:34 Modified: 18 Mar 2009 12:49
Reporter: Alexander Nozdrin Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S3 (Non-critical)
Version:6.0-TRUNK OS:Any
Assigned to: Kevin Lewis CPU Architecture:Any
Tags: F_STARTUP, pushbuild, sporadic, test failure

[19 Feb 2009 7:34] Alexander Nozdrin
Description:
Symptoms:
-------------------------------------------------------------
CURRENT_TEST: maria.maria-no-logging
==22549== Memcheck, a memory error detector.
==22549== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==22549== Using LibVEX rev 1732, a library for dynamic binary translation.
==22549== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==22549== Using valgrind-3.2.3, a dynamic binary instrumentation framework.
==22549== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==22549== For more details, rerun with: -v
==22549== 
==22549== Invalid read of size 4
==22549==    at 0x53371C9: mysql_stmt_errno (libmysql.c:4740)
==22549==    by 0x4149E8: run_query_stmt(st_mysql*, st_command*, char*, int, st_dynamic_string*, st_dynamic_string*) (mysqltest.cc:6925)
==22549==    by 0x414E82: run_query(st_connection*, st_command*, int) (mysqltest.cc:7146)
==22549==    by 0x416556: main (mysqltest.cc:7777)
==22549==  Address 0x620B89C is 268 bytes inside a block of size 816 free'd
==22549==    at 0x4A1F66B: free (vg_replace_malloc.c:233)
==22549==    by 0x533EC42: my_no_flags_free (my_malloc.c:59)
==22549==    by 0x53391AC: mysql_stmt_close (libmysql.c:4708)
==22549==    by 0x4149C6: run_query_stmt(st_mysql*, st_command*, char*, int, st_dynamic_string*, st_dynamic_string*) (mysqltest.cc:6915)
==22549==    by 0x414E82: run_query(st_connection*, st_command*, int) (mysqltest.cc:7146)
==22549==    by 0x416556: main (mysqltest.cc:7777)
{
   <insert a suppression name here>
   Memcheck:Addr4
   fun:mysql_stmt_errno
   fun:_Z14run_query_stmtP8st_mysqlP10st_commandPciP17st_dynamic_stringS5_
   fun:_Z9run_queryP13st_connectionP10st_commandi
   fun:main
}
mysqltest: In included file "./include/wait_until_connected_again.inc": At line 30: Server failed to restart

The result from queries just before the failure was:
< snip >
insert into t2 select * from t2;
insert into t2 select * from t2;
insert into t2 select * from t2;
insert into t2 select * from t2;
* shut down mysqld, removed logs, restarted it
create table t1 (a varchar(100)) engine=maria transactional=1;
show create table t1;
Table	Create Table
t1	CREATE TABLE `t1` (
  `a` varchar(100) DEFAULT NULL
) ENGINE=MARIA DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1 TRANSACTIONAL=1
show engine maria logs;
Type	Name	Status
MARIA	Size        16384 maria_log.00000001	unknown
insert into t1 values('a');
insert into t1 select * from t2;
show engine maria logs;
Type	Name	Status
MARIA	Size        24576 maria_log.00000001	unknown
* shut down mysqld, removed logs, restarted it

More results from queries before failure can be found in /dev/shm/var-ps_stm-121/log/maria-no-logging.log
==22549== 
==22549== ERROR SUMMARY: 11 errors from 1 contexts (suppressed: 3 from 1)
==22549== malloc/free: in use at exit: 344 bytes in 2 blocks.
==22549== malloc/free: 28,017 allocs, 28,015 frees, 77,910,520 bytes allocated.
==22549== For counts of detected errors, rerun with: -v
==22549== searching for pointers to 2 not-freed blocks.
==22549== checked 9,848,576 bytes.
==22549== 
==22549== LEAK SUMMARY:
==22549==    definitely lost: 0 bytes in 0 blocks.
==22549==      possibly lost: 0 bytes in 0 blocks.
==22549==    still reachable: 0 bytes in 0 blocks.
==22549==         suppressed: 344 bytes in 2 blocks.
-------------------------------------------------------------

The bug reported after a failure on 2009-02-18 in 6.0-bugteam
(http://tinyurl.com/asyyqa).

How to repeat:
XRef: http://tinyurl.com/ce6zn3
[16 Mar 2009 9:17] Guilhem Bichot
Bug is probably in mysqltest.cc: according to Valgrind, when it's saving the error, it saves a wrong value. The test, which wants to detect that reconnection has happened, tests the error...
[16 Mar 2009 10:30] Guilhem Bichot
queued to 5.1-maria and 6.0-maria
[17 Mar 2009 13:05] Guilhem Bichot
So, even though the patch fixed the Valgrind error in mysqltest.cc, which could be a cause of wrong restart detection, the problem still happens after the push of this patch.
Looking at the log of the server during the failed run of maria-no-logging.test, extracted from pushbuild1's archives (see URL in next post), we see that, when the test shut down and restarted mysqld, Falcon failed to start. Here's the decomposition (remember that the test shuts down and restarts mysqld 4 times in order to verify working behaviour when Maria logs are manually deleted):

CURRENT_TEST: maria.maria-no-logging

# here a first shutdown

090317  1:29:15 [Note] /data0/pushbuild/pb3/pb/bzr_mysql-6.0-maria/84/mysql-6.0.11-alpha-pb84/sql/mysqld: Normal shutdown
...
090317  1:29:18 [Note] /data0/pushbuild/pb3/pb/bzr_mysql-6.0-maria/84/mysql-6.0.11-alpha-pb84/sql/mysqld: Shutdown complete
...

# here a first restart

==16080== Memcheck, a memory error detector.
==16080== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==16080== Using LibVEX rev 1732, a library for dynamic binary translation.
==16080== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==16080== Using valgrind-3.2.3, a dynamic binary instrumentation framework.
==16080== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==16080== For more details, rerun with: -v
==16080== 
...
090317  1:29:31 [Note] /data0/pushbuild/pb3/pb/bzr_mysql-6.0-maria/84/mysql-6.0.11-alpha-pb84/sql/mysqld: ready for connections.
Version: '6.0.11-alpha-pb84-debug-log'  socket: '/dev/shm/pbtmp-ps_stm-121/mysqld.1.sock'  port: 11210  MySQL Pushbuild Edition, build 84

# here a second shutdown

090317  1:29:35 [Note] /data0/pushbuild/pb3/pb/bzr_mysql-6.0-maria/84/mysql-6.0.11-alpha-pb84/sql/mysqld: Normal shutdown
...
090317  1:29:39 [Note] /data0/pushbuild/pb3/pb/bzr_mysql-6.0-maria/84/mysql-6.0.11-alpha-pb84/sql/mysqld: Shutdown complete
...

# here a second restart

...
090317  1:29:48 [Note] /data0/pushbuild/pb3/pb/bzr_mysql-6.0-maria/84/mysql-6.0.11-alpha-pb84/sql/mysqld: ready for connections.
Version: '6.0.11-alpha-pb84-debug-log'  socket: '/dev/shm/pbtmp-ps_stm-121/mysqld.1.sock'  port: 11210  MySQL Pushbuild Edition, build 84

# here a third shutdown

090317  1:29:50 [Note] /data0/pushbuild/pb3/pb/bzr_mysql-6.0-maria/84/mysql-6.0.11-alpha-pb84/sql/mysqld: Normal shutdown
...
090317  1:29:53 [Note] /data0/pushbuild/pb3/pb/bzr_mysql-6.0-maria/84/mysql-6.0.11-alpha-pb84/sql/mysqld: Shutdown complete
...

# here a third restart

...
090317  1:29:57 [Note] Plugin 'InnoDB' disabled by command line option

# and this restart fails with segfault in Falcon code:

==17535== Thread 6:
==17535== Invalid read of size 8
==17535==    at 0x94F3FB: Sync::lock(LockType) (Sync.cpp:58)
==17535==    by 0x95B599: TableSpaceManager::reportStatistics() (TableSpaceManager.cpp:305)
==17535==    by 0x982EAB: Database::scavenge(bool) (Database.cpp:1803)
==17535==    by 0x98510F: Database::scavengerThreadMain(void*) (Database.cpp:1955)
==17535==    by 0x95E7A3: Thread::thread() (Thread.cpp:166)
==17535==    by 0x95E988: Thread::thread(void*) (Thread.cpp:145)
==17535==    by 0x4C32142: start_thread (in /lib64/libpthread-2.4.so)
==17535==    by 0x5268B8C: clone (in /lib64/libc-2.4.so)
==17535==  Address 0x668 is not stack'd, malloc'd or (recently) free'd
{
   <insert a suppression name here>
   Memcheck:Addr8
   fun:_ZN4Sync4lockE8LockType
   fun:_ZN17TableSpaceManager16reportStatisticsEv
   fun:_ZN8Database8scavengeEb
   fun:_ZN8Database19scavengerThreadMainEPv
   fun:_ZN6Thread6threadEv
   fun:_ZN6Thread6threadEPv
   fun:start_thread
   fun:clone
   obj:*
   obj:*
   obj:*
   obj:*
}
090317  1:30:00 - mysqld got signal 11 ;

Thus server crashes during third restart and test normally reports it.
I have also looked at another occurence of a failure of maria-no-logging, in the previous push, and the .err file contains the same Falcon stack trace.
Maybe I should set those Maria tests to run with --skip-falcon?
[17 Mar 2009 14:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/69438

2728 Guilhem Bichot	2009-03-17
      BUG#43001 "maria.maria-no-logging fails sporadically on valgrind in PS mode" still happens and this
      time the cause seems to be Falcon failing to start during one of the restarts done by the test
     @ mysql-test/suite/maria/t/maria-autozerofill-master.opt
        sometimes Falcon fails to start during one of the restarts done by this test, work around
     @ mysql-test/suite/maria/t/maria-no-logging-master.opt
        sometimes Falcon fails to start during one of the restarts done by this test, work around
     @ mysql-test/suite/maria/t/maria-recovery-bitmap.test
        this flooded the error log with uninteresting debug info
     @ mysql-test/suite/maria/t/maria-recovery2.test
        this flooded the error log with uninteresting debug info
[18 Mar 2009 12:49] Kevin Lewis
This bug, if it is all about the failure of Falcon startup, is the same as Bug#43200 and was fixed therin.

  http://lists.mysql.com/commits/67594

3044 Kevin Lewis	2009-02-25
      Bug #43200  	Falcon crash at startup in Database::scavenge()
      Check the existance of tableSpaceManager before trying to 
      reportStatistics from it.
[3 Apr 2009 14:52] Bugs System
Pushed into 6.0.11-alpha (revid:guilhem@mysql.com-20090402210815-lu17n4kj8c73cfe8) (version source revid:guilhem@mysql.com-20090317145345-mp4u4xb3g3ym1sal) (merge vers: 6.0.11-alpha) (pib:6)