Bug #45630 rpl_trigger.test causes valgrind failures within nptl_pthread_exit_hack_handler
Submitted: 20 Jun 2009 7:26 Modified: 12 Aug 2009 5:13
Reporter: Sergey Petrunya Email Updates:
Status: Closed Impact on me:
None 
Category:Tests: Replication Severity:S3 (Non-critical)
Version:5.1 OS:Any (Ubuntu 9.04 x86)
Assigned to: Daogang Qu CPU Architecture:Any
Tags: Contribution
Triage: Triaged: D2 (Serious)

[20 Jun 2009 7:26] Sergey Petrunya
Description:
The test fails as follows: 

rpl.rpl_trigger 'stmt'                   [ fail ]  Found warnings/errors in server log file!
        Test ended at 2009-06-19 06:57:55
mysqltest: Logging to ''.
mysqltest: Results saved in ''.
mysqltest: Connecting to server localhost:11000 (socket /var/lib/buildbot/maria-slave/psergey-pylon-x86/build/mysql-test/var/tmp/mysqld.1.sock) as 'root', connection 'default', attempt 0 ...
mysqltest: ... Connected.
mysqltest: Start processing test commands from './include/check-warnings.test' ...
file_name	line
/var/lib/buildbot/maria-slave/psergey-pylon-x86/build/mysql-test/var/log/mysqld.1.err	==13722==    at 0x7BFBFDE: malloc (vg_replace_malloc.c:207)
/var/lib/buildbot/maria-slave/psergey-pylon-x86/build/mysql-test/var/log/mysqld.1.err	==13722==    at 0x7BFBFDE: malloc (vg_replace_malloc.c:207)
/var/lib/buildbot/maria-slave/psergey-pylon-x86/build/mysql-test/var/log/mysqld.1.err	==13722==    at 0x7BFBFDE: malloc (vg_replace_malloc.c:207)
/var/lib/buildbot/maria-slave/psergey-pylon-x86/build/mysql-test/var/log/mysqld.1.err	==13722==    at 0x7BFA092: calloc (vg_replace_malloc.c:397)
/var/lib/buildbot/maria-slave/psergey-pylon-x86/build/mysql-test/var/log/mysqld.1.err	==13722==    at 0x7BFA092: calloc (vg_replace_malloc.c:397)
^ Found warnings!!
mysqltest: ... Done processing test commands.

How to repeat:
The problem occurs on Ubuntu 9.04, x86 platform (but not amd64).

* Compile with ./BUILD/compile-valgrind-max
* Run mysql-test-run t/rpl_trigger.test
It will generate a load of failures, all similar to the following:

var/log/mysqld.1.err:==30768== 19 bytes in 1 blocks are still reachable in loss record 1 of 6
var/log/mysqld.1.err:==30768==    at 0x7BF3FDE: malloc (vg_replace_malloc.c:207)
var/log/mysqld.1.err:==30768==    by 0x440A251: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x4405CF2: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x440766C: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x4412216: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x440E035: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x4411C1D: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7E00901: (within /lib/tls/i686/cmov/libc-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x440E035: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7E00AC4: __libc_dlopen_mode (in /lib/tls/i686/cmov/libc-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7C21E16: pthread_cancel_init (in /lib/tls/i686/cmov/libpthread-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7C21F40: _Unwind_ForcedUnwind (in /lib/tls/i686/cmov/libpthread-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7C1F960: __pthread_unwind (in /lib/tls/i686/cmov/libpthread-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7C19A5F: pthread_exit (in /lib/tls/i686/cmov/libpthread-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x8720109: nptl_pthread_exit_hack_handler (my_thr_init.c:61)
var/log/mysqld.1.err:==30768==    by 0x7C194FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so)
var/log/mysqld.1.err:==30768== 
var/log/mysqld.1.err:==30768== 
var/log/mysqld.1.err:==30768== 19 bytes in 1 blocks are still reachable in loss record 2 of 6
var/log/mysqld.1.err:==30768==    at 0x7BF3FDE: malloc (vg_replace_malloc.c:207)
var/log/mysqld.1.err:==30768==    by 0x4404ABE: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x44078FB: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x4412216: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x440E035: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x4411C1D: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7E00901: (within /lib/tls/i686/cmov/libc-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x440E035: (within /lib/ld-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7E00AC4: __libc_dlopen_mode (in /lib/tls/i686/cmov/libc-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7C21E16: pthread_cancel_init (in /lib/tls/i686/cmov/libpthread-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7C21F40: _Unwind_ForcedUnwind (in /lib/tls/i686/cmov/libpthread-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7C1F960: __pthread_unwind (in /lib/tls/i686/cmov/libpthread-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7C19A5F: pthread_exit (in /lib/tls/i686/cmov/libpthread-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x8720109: nptl_pthread_exit_hack_handler (my_thr_init.c:61)
var/log/mysqld.1.err:==30768==    by 0x7C194FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so)
var/log/mysqld.1.err:==30768==    by 0x7DC449D: clone (in /lib/tls/i686/cmov/libc-2.9.so)

Suggested fix:
https://lists.launchpad.net/maria-developers/msg00406.html

(note that a shorter version of
https://lists.launchpad.net/maria-developers/msg00403.html
will not work with valgrind-3.3)
[20 Jun 2009 8:28] Sveta Smirnova
Thank you for the report.

Verified as described.
[22 Jun 2009 23:26] Davi Arnaut
Perhaps this one is really a libc bug? There is a somewhat related bug report at https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/367937
[23 Jun 2009 6:06] Jonas Oreland
FYI: I added the following suppression to the telco-trees
valgrind-3.4
GNU C Library stable release version 2.6.1, by Roland McGrath et al.
[23 Jun 2009 6:06] Jonas Oreland
patch

Attachment: valgrind.supp.patch (text/x-patch), 1.60 KiB.

[23 Jun 2009 7:13] Sveta Smirnova
Davi, I tested it on Red Hat
[24 Jun 2009 2:59] Davi Arnaut
Just for reference: https://bugzilla.redhat.com/show_bug.cgi?id=483821
[24 Jun 2009 3:28] Davi Arnaut
Another reference: http://bugs.kde.org/show_bug.cgi?id=151612

Support for frame-level wildcards in suppressions was added in valgrind 3.1.1, released in 3.4.0
[24 Jun 2009 6:49] Sergey Petrunya
Davi, 

I'm looking at the valgrind changelog here:
http://valgrind.org/docs/manual/dist.news.html and there is no mention of frame-level wildcards for version 3.1.1, while there is a mention of it for version 3.4.0 (released 2nd January 2009). 

I don't think it's reasonable to expect the build/test host owners to apply some patches that never went into the released version of valgrind. 

You have an option of not supporting valgrind on platforms that are older than 6 months, sure.
[24 Jun 2009 22:05] Davi Arnaut
Sergey,

It's just a reference so that someone later can understand why we didn't use frame-level wildcards. Also a reminder that once valgrind 3.3 and earlier aren't in mainstream use, we can resort back to using wildcards.

It's important to properly document technical decisions.

I took the version 3.1.1 from the referenced link, it says "Version:	3.1.1" right on top.
[24 Jun 2009 22:07] Davi Arnaut
And paying attention now, it's obviously not the version where it was fixed. Sorry.
[1 Jul 2009 20:25] Kristofer Pettersson
I was unable to repeat this bug on 5.1-bugteam on either macosx or ubuntu (libc version 2.6.1).

Sveta, how and where did you verify this bug?
[2 Jul 2009 8:40] Sveta Smirnova
Kristofer,

as discussed on Red Hat x86 Linux with glibc 2.3.6
[2 Jul 2009 18:31] Sveta Smirnova
Kristofer,

I am sorry to be not clear: mtr should be run with option --valgrind:

./mtr --suite=rpl rpl_trigger --valgrind

Problem is also repeatable on x64 box:

$uname -a
Linux host.mysql.com 2.6.18-8.1.1.el5 #1 SMP Mon Feb 26 20:37:57 EST 2007 x86_64 x86_64 x86_64 GNU/Linux

$getconf GNU_LIBC_VERSION
glibc 2.5
[2 Jul 2009 19:34] Kristofer Pettersson
Sveta, I'm sorry, I've must have blown a brain fuse or something. Completely repeatable using --valgrind option.
[22 Jul 2009 5:06] Daogang Qu
The following server shutdown and start code causes the valgrind failures in rpl_trigger.test.
# Stop master server
#--echo --> Stop master server
--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
wait
EOF
--shutdown_server 10
--source include/wait_until_disconnected.inc
# Replace binlog
remove_file $MYSQLD_DATADIR/master-bin.000001;
copy_file $MYSQL_TEST_DIR/std_data/bug16266.000001 $MYSQLD_DATADIR/master-bin.000001;

#--echo --> Start master server
--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
restart
EOF
--enable_reconnect
--source include/wait_until_connected_again.inc

I think we should suppress the valgrind failures. Because the server shutdown and stop code was heavily used in mysql test set. Though the "rpl_trigger" test work fine even if the server shutdown and start code has been removed.
[22 Jul 2009 7:51] Daogang Qu
I don't think we should add a test file or append the "./mtr rpl_trigger --valgrind" test command to existing test files for the bug. what do you think?

We can see that the valgrind failures are suppressed from the following test info:
daogangq@daogangq-laptop:~/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test$ ./mtr rpl_trigger --valgrind
Logging: ./mtr  rpl_trigger --valgrind
090722 15:34:39 [Note] Plugin 'FEDERATED' is disabled.
090722 15:34:39 [Note] Plugin 'ndbcluster' is disabled.
090722 15:34:39 [Warning] Forcing shutdown of 2 plugins
MySQL Version 5.1.38
Turning on valgrind for all executables
Running valgrind with options " --show-reachable=yes "
Checking supported features...
 - using ndbcluster when necessary, mysqld supports it
 - SSL connections supported
 - binaries are debug compiled
Using "../libtool" when running valgrind or debugger
Collecting tests...
 - adding combinations for rpl
vardir: /home/daogangq/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test/var
Checking leftover processes...
 - found old pid 18348 in 'mysqld.1.pid', killing it...
   process did not exist!
 - found old pid 18114 in 'mysqld.2.pid', killing it...
   process did not exist!
Removing old var directory...
Creating var directory '/home/daogangq/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test/var'...
Installing system database...
Using server port 51524
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009

==============================================================================

TEST                                      RESULT   TIME (ms)
------------------------------------------------------------

rpl.rpl_trigger 'row'                    [ skipped ]  Doesn't support --binlog-format='row'
rpl.rpl_trigger 'stmt'                   [ pass ]  82113
rpl.rpl_trigger 'mix'                    [ pass ]  89969
------------------------------------------------------------
The servers were restarted 1 times
Spent 172.082 of 308 seconds executing testcases

All 2 tests were successful.

daogangq@daogangq-laptop:~/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test$ 

The original output: 

daogangq@daogangq-laptop:~/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test$ ./mtr rpl_trigger --valgrind
Logging: ./mtr  rpl_trigger --valgrind
090722 14:30:51 [Note] Plugin 'FEDERATED' is disabled.
090722 14:30:51 [Note] Plugin 'ndbcluster' is disabled.
090722 14:30:51 [Warning] Forcing shutdown of 2 plugins
MySQL Version 5.1.38
Turning on valgrind for all executables
Running valgrind with options " --show-reachable=yes "
Checking supported features...
 - using ndbcluster when necessary, mysqld supports it
 - SSL connections supported
 - binaries are debug compiled
Using "../libtool" when running valgrind or debugger
Collecting tests...
 - adding combinations for rpl
vardir: /home/daogangq/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/daogangq/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test/var'...
Installing system database...
Using server port 33821
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009

==============================================================================

TEST                                      RESULT   TIME (ms)
------------------------------------------------------------

rpl.rpl_trigger 'row'                    [ skipped ]  Doesn't support --binlog-format='row'
rpl.rpl_trigger 'stmt'                   [ fail ]  Found warnings/errors in server log file!
        Test ended at 2009-07-22 14:33:23
mysqltest: Logging to ''.
mysqltest: Results saved in ''.
mysqltest: Connecting to server localhost:13000 (socket /home/daogangq/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test/var/tmp/mysqld.1.sock) as 'root', connection 'default', attempt 0 ...
mysqltest: ... Connected.
mysqltest: Start processing test commands from './include/check-warnings.test' ...
file_name	line
/home/daogangq/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test/var/log/mysqld.1.err	==15273==    at 0x4026FDE: malloc (vg_replace_malloc.c:207)
/home/daogangq/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test/var/log/mysqld.1.err	==15273==    at 0x4026FDE: malloc (vg_replace_malloc.c:207)
/home/daogangq/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test/var/log/mysqld.1.err	==15273==    at 0x4026FDE: malloc (vg_replace_malloc.c:207)
/home/daogangq/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test/var/log/mysqld.1.err	==15273==    at 0x4025092: calloc (vg_replace_malloc.c:397)
/home/daogangq/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test/var/log/mysqld.1.err	==15273==    at 0x4025092: calloc (vg_replace_malloc.c:397)
^ Found warnings!!
mysqltest: ... Done processing test commands.
mysqltest: Test has succeeded!
ok

 - saving '/home/daogangq/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test/var/log/rpl.rpl_trigger-stmt/' to '/home/daogangq/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test/var/log/rpl.rpl_trigger-stmt/'

Only  2  of 3 completed.
mysql-test-run: *** ERROR: Not all tests completed
daogangq@daogangq-laptop:~/mysql/bzrwork/bug45630/mysql-5.1-bugteam/mysql-test$
[23 Jul 2009 11:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/79144

3034 Dao-Gang.Qu@sun.com	2009-07-22
      Bug #45630  	rpl_trigger.test causes valgrind failures within nptl_pthread_exit_hack_handler
      
      The server shutdown and start code triggered the valgrind failures 
      within nptl_pthread_exit_hack_handler on Ubuntu 9.04, x86 (but not amd64) 
      in rpl_trigger.test file.
      
      For fixing the bug, suppress valgrind failures within nptl_pthread_exit_hack_handler 
      on Ubuntu 9.04, x86 (but not amd64). Because the server shutdown and start
      code has been heavily used in mysql test set.
     @ mysql-test/valgrind.supp
        Add code for suppressing valgrind failures within nptl_pthread_exit_hack_handler on Ubuntu 9.04, x86 (but not amd64).
[6 Aug 2009 3:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80236

3056 Dao-Gang.Qu@sun.com	2009-08-06
      Bug #45630  	rpl_trigger.test causes valgrind failures within nptl_pthread_exit_hack_handler
      
      The server shutdown and start code triggered the valgrind failures 
      within nptl_pthread_exit_hack_handler on Ubuntu 9.04, x86 (but not amd64) 
      in rpl_trigger.test file.
      
      For fixing the bug, suppress valgrind failures within nptl_pthread_exit_hack_handler 
      on Ubuntu 9.04, x86 (but not amd64). Because the server shutdown and start
      code has been heavily used in mysql test set.
     @ mysql-test/valgrind.supp
        Add code for suppressing valgrind failures within nptl_pthread_exit_hack_handler on Ubuntu 9.04, x86 (but not amd64).
[6 Aug 2009 5:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80239

3519 Dao-Gang.Qu@sun.com	2009-08-06 [merge]
      Bug #45630  	rpl_trigger.test causes valgrind failures within nptl_pthread_exit_hack_handler
      
      The server shutdown and start code triggered the valgrind failures 
      within nptl_pthread_exit_hack_handler on Ubuntu 9.04, x86 (but not amd64) 
      in rpl_trigger.test file.
      
      For fixing the bug, suppress valgrind failures within nptl_pthread_exit_hack_handler 
      on Ubuntu 9.04, x86 (but not amd64). Because the server shutdown and start
      code has been heavily used in mysql test set.
     @ mysql-test/valgrind.supp
        Add code for suppressing valgrind failures within nptl_pthread_exit_hack_handler on Ubuntu 9.04, x86 (but not amd64).
[11 Aug 2009 14:32] Bugs System
Pushed into 5.1.38 (revid:davi.arnaut@sun.com-20090811142907-uu7nckpe93pii81v) (version source revid:davi.arnaut@sun.com-20090811142907-uu7nckpe93pii81v) (merge vers: 5.1.38) (pib:11)
[12 Aug 2009 5:13] Jon Stephens
Test failure; no end-user changes to document; closed.
[14 Sep 2009 16:06] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090914155317-m1g9wodmndzdj4l1) (version source revid:alik@sun.com-20090914155317-m1g9wodmndzdj4l1) (merge vers: 5.4.4-alpha) (pib:11)
[1 Oct 2009 5:59] Bugs System
Pushed into 5.1.39-ndb-6.3.28 (revid:jonas@mysql.com-20091001055605-ap2kiaarr7p40mmv) (version source revid:jonas@mysql.com-20091001055605-ap2kiaarr7p40mmv) (merge vers: 5.1.39-ndb-6.3.28) (pib:11)
[1 Oct 2009 7:25] Bugs System
Pushed into 5.1.39-ndb-7.0.9 (revid:jonas@mysql.com-20091001072547-kv17uu06hfjhgjay) (version source revid:jonas@mysql.com-20091001071652-irejtnumzbpsbgk2) (merge vers: 5.1.39-ndb-7.0.9) (pib:11)
[1 Oct 2009 13:25] Bugs System
Pushed into 5.1.39-ndb-7.1.0 (revid:jonas@mysql.com-20091001123013-g9ob2tsyctpw6zs0) (version source revid:jonas@mysql.com-20091001123013-g9ob2tsyctpw6zs0) (merge vers: 5.1.39-ndb-7.1.0) (pib:11)
[5 Oct 2009 10:50] Bugs System
Pushed into 5.1.39-ndb-6.2.19 (revid:jonas@mysql.com-20091005103850-dwij2dojwpvf5hi6) (version source revid:jonas@mysql.com-20090930185117-bhud4ek1y0hsj1nv) (merge vers: 5.1.39-ndb-6.2.19) (pib:11)