Bug #42850 race condition in my_thr_init.c
Submitted: 13 Feb 2009 21:59
Modified: 18 Dec 2009 20:43
Reporter: Martin Kögler
Status: Closed
Category: MySQL Server: C API (client library)
Severity: S3 (Non-critical)
Version: 5.0.X, 6.0.X
OS: Linux
Assigned to: Magnus Blåudd
CPU Architecture: Any

[13 Feb 2009 21:59] Martin Kögler
Description:
There is a race condition in the workaround for BUG#24507:

BUG#24507 changed my_thread_global_init so that it spawns a new thread, which only calls pthread_exit.

The problem is that, under some conditions, it is possible to unload libmysqlclient_r before this thread has finished. This leads to a segfault.
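The pattern described above can be sketched roughly as follows. This is a simplified illustration and not the actual my_thr_init.c source: the names exit_only_thread and spawn_exit_only_thread_detached are hypothetical, and the sketch assumes the helper thread is created detached, so the caller has no way to wait for it.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the dummy thread body: it only exits. */
static void *exit_only_thread(void *arg)
{
  (void) arg;
  pthread_exit(NULL);
  return NULL; /* not reached */
}

/*
  Start the thread detached and return immediately (assumption: this is
  how the workaround behaves). Nothing waits for the thread, so the
  caller may go on to dlclose() the library while the thread still runs.
  Returns 0 on success, otherwise a pthread error code.
*/
int spawn_exit_only_thread_detached(void)
{
  pthread_t tid;
  pthread_attr_t attr;
  int rc;

  rc = pthread_attr_init(&attr);
  if (rc != 0)
    return rc;
  rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
  if (rc == 0)
    rc = pthread_create(&tid, &attr, exit_only_thread, NULL);
  pthread_attr_destroy(&attr);
  return rc;
}
```

If a function like this lives in a shared library, the library can be unloaded as soon as the function returns, regardless of whether the thread has finished.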

How to repeat:
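$ cat t.c
#include <mysql/mysql.h>
#include <dlfcn.h>
#include <unistd.h>   /* sleep() */

/* Function pointers resolved at run time via dlsym(). */
int STDCALL (*my_server_init)(int argc, char **argv, char **groups);
void STDCALL (*my_server_end)(void);

int main()
{
  void *h = dlopen("libmysqlclient_r.so.15", RTLD_NOW);
  if (!h)
    return 1;   /* library not found */
  my_server_init = dlsym(h, "mysql_server_init");
  my_server_end = dlsym(h, "mysql_server_end");
  my_server_init(0, 0, 0);
  my_server_end();
  dlclose(h);   /* the library may be unloaded while the dummy thread still runs */
  sleep(20);    /* keep the process alive long enough to observe the crash */
  return 0;
}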
$ gcc -o t t.c -ldl

To reproduce this race condition every time, you can slow down the dummy thread:
diff -urNad mysql-dfsg-5.0-5.0.51a~/mysys/my_thr_init.c mysql-dfsg-5.0-5.0.51a/mysys/my_thr_init.c
--- mysql-dfsg-5.0-5.0.51a~/mysys/my_thr_init.c 2009-02-13 20:50:42.000000000 +0000
+++ mysql-dfsg-5.0-5.0.51a/mysys/my_thr_init.c  2009-02-13 20:53:51.000000000 +0000
@@ -58,6 +58,7 @@
 nptl_pthread_exit_hack_handler(void *arg __attribute__((unused)))
 {
   /* Do nothing! */
+  sleep(10);
   pthread_exit(0);
   return 0;
 }
[19 Feb 2009 9:27] Sveta Smirnova
Thank you for the report.

Verified as described.

Backtrace:

(gdb) bt full
#0  0x00129153 in _Unwind_DeleteException () from /lib/libgcc_s.so.1
No symbol table info available.
#1  0x00129eb2 in _Unwind_Backtrace () from /lib/libgcc_s.so.1
No symbol table info available.
#2  0x00129f45 in _Unwind_ForcedUnwind () from /lib/libgcc_s.so.1
No symbol table info available.
#3  0x00465cfe in _Unwind_ForcedUnwind () from /lib/libpthread.so.0
No symbol table info available.
#4  0x00463da7 in __pthread_unwind () from /lib/libpthread.so.0
No symbol table info available.
#5  0x0045fee4 in pthread_exit () from /lib/libpthread.so.0
No symbol table info available.
#6  0x003c369c in pthread_exit () from /lib/libc.so.6
No symbol table info available.
#7  0x004c9aba in ?? ()
No symbol table info available.
#8  0x00000000 in ?? ()
No symbol table info available.
[8 Sep 2009 14:50] Kir Kolyshkin
Could the severity of this bug be raised? Apparently it hurts many people using PHP with MySQL support compiled in, where the bug is pretty easy to trigger. Here are a few bug reports from other trackers:

OpenVZ: http://bugzilla.openvz.org/show_bug.cgi?id=1012
Debian: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=515143
Ubuntu: https://bugs.launchpad.net/ubuntu/+source/php5/+bug/392521 (and a handful of dups: 333504, 352648, 409876, 412501).
[8 Sep 2009 15:34] Kir Kolyshkin
> Debian: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=515143

And a few more Debian bugs:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=513204
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=524366
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=493045
[18 Sep 2009 19:38] Jim Winstead
Here are two possible solutions:

The first option is to remove the workaround added as the fix for bug #24507, which is apparently only necessary for glibc < 2.4. This would re-introduce bug #24507 on some platforms, such as RHEL4. It's not known how widespread a problem that bug really was, since there are no support issues noted and it was apparently only uncovered in our own testing.

The second option is to expand the runtime-check for this workaround to only enable it when it is detected that a version of glibc <= 2.4 is being used with NPTL. (Right now we only check if NPTL is being used, not the glibc version.) This would leave the workaround in place for older versions of glibc but should avoid problems with more recent (since 2007 or so) distributions of Linux.

I'm not sure if this bug occurs on systems with glibc <= 2.4. If it does, removing the workaround is probably the best option, since that would lead me to believe that the bug caused by the workaround is more pervasive than the original problem.
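A runtime check along the lines of the second option might look like the sketch below. It is only an illustration under stated assumptions: the function names are hypothetical, it relies on the glibc-specific gnu_get_libc_version() and confstr(_CS_GNU_LIBPTHREAD_VERSION), and the version threshold is passed as a parameter because the comment above mentions both "< 2.4" and "<= 2.4".

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* ask for glibc-specific declarations */
#endif
#include <gnu/libc-version.h>  /* gnu_get_libc_version() */
#include <stdio.h>             /* sscanf() */
#include <string.h>            /* strncmp() */
#include <unistd.h>            /* confstr() */

/* Returns 1 if the thread library identifies itself as NPTL, else 0. */
int using_nptl(void)
{
  char buf[64];
  size_t n = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, sizeof(buf));
  if (n == 0 || n > sizeof(buf))
    return 0; /* not available, or value did not fit */
  return strncmp(buf, "NPTL", 4) == 0;
}

/*
  Returns 1 if a "major.minor[.patch]" version string is strictly older
  than major.minor, else 0. Unparseable strings yield 0, so the
  workaround stays disabled when in doubt.
*/
int glibc_older_than(const char *version, int major, int minor)
{
  int v_major = 0, v_minor = 0;
  if (sscanf(version, "%d.%d", &v_major, &v_minor) != 2)
    return 0;
  return v_major < major || (v_major == major && v_minor < minor);
}

/* Hypothetical combined check: NPTL in use and glibc older than 2.4. */
int need_pthread_exit_workaround(void)
{
  return using_nptl() && glibc_older_than(gnu_get_libc_version(), 2, 4);
}
```

With a check of this kind, the dummy thread would only be spawned when need_pthread_exit_workaround() returns nonzero, so recent distributions would skip the workaround entirely.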
[23 Sep 2009 12:42] Magnus Blåudd
Problem occurs frequently when running "mysql" for a short time on fast machines.
[23 Sep 2009 15:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/84400
[24 Sep 2009 6:30] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/84446
[24 Sep 2009 13:02] Jonas Oreland
Possibly the code should be wrapped in
#if defined PTHREAD_CREATE_JOINABLE && HAVE_PTHREAD_JOIN
(and a configure check for pthread_join if not already present)

otherwise ok!!
[24 Sep 2009 13:20] Magnus Blåudd
Jonas approves ok after realizing it's inside TARGET_OS_LINUX and thus we should always have joinable threads.
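The approach discussed here, creating the dummy thread joinable and waiting for it, can be sketched as follows. This is a minimal illustration, not the committed patch: the names dummy_thread_body and run_dummy_thread_and_wait are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the dummy thread body: it only exits. */
static void *dummy_thread_body(void *arg)
{
  (void) arg;
  pthread_exit(NULL);
  return NULL; /* not reached */
}

/*
  Create the dummy thread as joinable and block until it has finished.
  Once this function returns, no code from the thread is still running,
  so unloading the library afterwards cannot race with it.
  Returns 0 on success, otherwise a pthread error code.
*/
int run_dummy_thread_and_wait(void)
{
  pthread_t tid;
  pthread_attr_t attr;
  int rc;

  rc = pthread_attr_init(&attr);
  if (rc != 0)
    return rc;
  rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
  if (rc == 0)
    rc = pthread_create(&tid, &attr, dummy_thread_body, NULL);
  pthread_attr_destroy(&attr);
  if (rc != 0)
    return rc;

  /* Wait for the thread to terminate before returning to the caller. */
  return pthread_join(tid, NULL);
}
```

The cost is that initialization blocks for the lifetime of the dummy thread, which should be short because the thread does nothing but exit.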
[28 Sep 2009 12:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/84839
[28 Sep 2009 12:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/84840
[28 Sep 2009 12:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/84841
[28 Sep 2009 12:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/84842
[28 Sep 2009 12:49] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/84846
[28 Sep 2009 13:12] Magnus Blåudd
Pushed to 5.0-bugteam, 5.1-bugteam and pe
[28 Sep 2009 13:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/84864
[28 Sep 2009 13:45] Magnus Blåudd
Pushed to 6.2, 6.3, 7.0 and 7.1
[28 Sep 2009 14:47] Jon Stephens
Documented in the NDB-6.2.19, 6.3.27, and 7.0.8 changelogs as follows:

        The fix for Bug #24507 led to client application failures in some cases,
        and was reverted.

Set to status=NDI, category=server; waiting for pushes in 5.0, 5.1-main trees.
[30 Sep 2009 8:13] Bugs System
Pushed into 5.1.37-ndb-6.2.19 (revid:frazer@mysql.com-20090929142503-sst6g3fs0vx9fgil) (version source revid:magnus.blaudd@sun.com-20090928124814-0dz4upft34b8sxfo) (merge vers: 5.1.37-ndb-6.2.19) (pib:11)
[30 Sep 2009 8:13] Bugs System
Pushed into 5.1.37-ndb-6.3.28 (revid:jonas@mysql.com-20090930070741-13u316s7s2l7e1ej) (version source revid:magnus.blaudd@sun.com-20090928132630-l2wqa13bt8feivpa) (merge vers: 5.1.37-ndb-6.3.27) (pib:11)
[30 Sep 2009 8:14] Bugs System
Pushed into 5.1.37-ndb-7.0.9 (revid:jonas@mysql.com-20090930075942-1q6asjcp0gaeynmj) (version source revid:jonas@mysql.com-20090929131318-kvh036r1jred8h94) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[30 Sep 2009 8:15] Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:jonas@mysql.com-20090930080049-1c8a8cio9qgvhq35) (version source revid:magnus.blaudd@sun.com-20090928133401-p3a632m95xlxjr6p) (merge vers: 5.1.35-ndb-7.1.0) (pib:11)
[30 Sep 2009 9:17] Jon Stephens
Set back to NDI, still waiting for pushes to main.
[30 Sep 2009 9:21] Jon Stephens
Verified with Magnus on IRC that the fix went into 6.3.27/7.0.8 release clones.
[30 Sep 2009 10:38] Jon Stephens
Per IRC discussion with Magnus, changed changelog entry to read:

        The fix for Bug #24507 could lead in some cases to client
        application failures due to a race condition. Now the server waits for
        the "dummy" thread to return before exiting, thus making
        sure that only one thread can initialize the pthread
        library.

Left status as set previously.
[6 Oct 2009 8:57] Bugs System
Pushed into 5.0.87 (revid:joro@sun.com-20091006073202-rj21ggvo2gw032ks) (version source revid:kristofer.pettersson@sun.com-20090929151855-gvpblm4dnnubypdv) (merge vers: 5.0.87) (pib:11)
[6 Oct 2009 8:59] Bugs System
Pushed into 5.1.40 (revid:joro@sun.com-20091006073316-lea2cpijh9r6on7c) (version source revid:joro@sun.com-20090928134840-3zydutswe3lrr47n) (merge vers: 5.1.40) (pib:11)
[6 Oct 2009 12:08] Jon Stephens
Also documented bugfix in the 5.0.87 and 5.1.40 changelogs.

Returned to NDI status, waiting for push to 5.4.
[22 Oct 2009 6:34] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091022063126-l0qzirh9xyhp0bpc) (version source revid:alik@sun.com-20091019135554-s1pvptt6i750lfhv) (merge vers: 6.0.14-alpha) (pib:13)
[22 Oct 2009 7:06] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091022060553-znkmxm0g0gm6ckvw) (version source revid:alik@sun.com-20091013094238-g67x6tgdm9a7uik0) (merge vers: 5.5.0-beta) (pib:13)
[22 Oct 2009 19:23] Paul DuBois
Noted in 5.5.0, 6.0.14 changelogs.
[18 Dec 2009 10:30] Bugs System
Pushed into 5.1.41-ndb-7.1.0 (revid:jonas@mysql.com-20091218102229-64tk47xonu3dv6r6) (version source revid:jonas@mysql.com-20091218095730-26gwjidfsdw45dto) (merge vers: 5.1.41-ndb-7.1.0) (pib:15)
[18 Dec 2009 10:46] Bugs System
Pushed into 5.1.41-ndb-6.2.19 (revid:jonas@mysql.com-20091218100224-vtzr0fahhsuhjsmt) (version source revid:jonas@mysql.com-20091217101452-qwzyaig50w74xmye) (merge vers: 5.1.41-ndb-6.2.19) (pib:15)
[18 Dec 2009 11:01] Bugs System
Pushed into 5.1.41-ndb-6.3.31 (revid:jonas@mysql.com-20091218100616-75d9tek96o6ob6k0) (version source revid:jonas@mysql.com-20091217154335-290no45qdins5bwo) (merge vers: 5.1.41-ndb-6.3.31) (pib:15)
[18 Dec 2009 11:15] Bugs System
Pushed into 5.1.41-ndb-7.0.11 (revid:jonas@mysql.com-20091218101303-ga32mrnr15jsa606) (version source revid:jonas@mysql.com-20091218064304-ezreonykd9f4kelk) (merge vers: 5.1.41-ndb-7.0.11) (pib:15)