Bug #65946 Sid_map::Sid_map calls DBUG which may have unitialized THR_KEY_mysys and thus c
Submitted: 19 Jul 2012 4:58 Modified: 31 Jul 2012 13:46
Reporter: Stewart Smith Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.6.5 OS:Linux
Assigned to: CPU Architecture:Any

[19 Jul 2012 4:58] Stewart Smith
Description:
F you are really lucky (or unlucky), the order in which things are
initialized on server startup goes a bit diferent than you'd expect. I
managed to get this to reproduce by always LD_PRELOADING
libeatmydata. This may also occur if LD_PRELOADING other libraries, but
I haven't tested.

Namely, we get the Sid_map::Sid_map constructor firing before
my_thread_global_init() is run. What happens here is that
Sid_map::Sid_map calls a DBUG function, which will try to get a thread
specific variable for THR_KEY_mysys, but the thread key hasn't yet been
initialized, so instead we get a wonderful pile of garbage.

The end result, a crash:

#0  0x0000000000b8e825 in _db_enter_ (
    _func_=<error reading variable: Cannot access memory at address 0xc>, 
    _file_=0xcbe490 "dno, const rpl_sid *)", _line_=27, 
    _stack_frame_=0x7fff2d95d1f0)
    at dbug/dbug.c:1112
#1  0x000000000084741b in Sid_map::Sid_map (this=0x140e2e0, 
    _sid_lock=<optimized out>)
    at sql/rpl_gtid_sid_map.cc:27
#2  0x0000000000847252 in __static_initialization_and_destruction_0 (
    __priority=65535, __initialize_p=1)
    at sql/rpl_gtid_misc.cc:34
#3  _GLOBAL__sub_I_rpl_gtid_misc.cc(void) ()
    at sql/rpl_gtid_misc.cc:174
#4  0x0000000000bf481d in __do_global_ctors_aux ()

Running under Valgrind gives you instead:

==3390== Invalid read of size 1
==3390==    at 0x636D1D2: internal_fnmatch (fnmatch_loop.c:69)
==3390==    by 0x636E9EC: fnmatch@@GLIBC_2.2.5 (fnmatch.c:460)
==3390==    by 0xB8D804: InList (dbug.c:1501)

followed by:

==3390==  General Protection Fault
==3390==    at 0x636D1D2: internal_fnmatch (fnmatch_loop.c:69)
==3390==    by 0x636E9EC: fnmatch@@GLIBC_2.2.5 (fnmatch.c:460)
==3390==    by 0xB8D804: InList (dbug.c:1501)
==3390==    by 0xB8D871: DoTrace (dbug.c:1670)
==3390==    by 0xB8E84C: _db_enter_ (dbug.c:1115)
==3390==    by 0x84741A: Sid_map::Sid_map(Checkable_rwlock*) (rpl_gtid_sid_map.cc:27)
==3390==    by 0x847251: _GLOBAL__sub_I_rpl_gtid_misc.cc (rpl_gtid_misc.cc:34)
==3390==    by 0xBF47CC: __libc_csu_init (in /home/stewart/percona-server/5.6/Percona-Server/build/sql/mysqld-debug)
==3390==    by 0x62C96FF: (below main) (libc-start.c:185)

Clearly, this isn't ideal. basically, it prevents even mtr from starting
the debug server to find what version it is.

The following patch solves the problem for me, basicall, if the thread
specific value hasn't been inited, let's do the right thing and return
NULL. Everything seems to work okay and I believe this to be correct.

Another option could be to not have the global_sid_map in
rpl_gtid_misc.cc be inited as part of setup, or have it not use
DBUG... but this is possibly undesirable and could hide the problem in
the future.

How to repeat:
eatmydata ./mysql-test-run.pl --debug-server

(will crash on getting version number. This is basically it calling ./mysqld-debug --help --verbose)

Suggested fix:
I make no (C) claim over the following

--- a/mysys/my_thr_init.c	2011-08-10 06:26:45 +0000
+++ b/mysys/my_thr_init.c	2012-07-19 04:52:08 +0000
@@ -458,8 +458,12 @@ const char *my_thread_name(void)
 
 extern void **my_thread_var_dbug()
 {
-  struct st_my_thread_var *tmp=
-    my_pthread_getspecific(struct st_my_thread_var*,THR_KEY_mysys);
+  struct st_my_thread_var *tmp;
+
+  if (!my_thread_global_init_done)
+    return NULL;
+
+  tmp= my_pthread_getspecific(struct st_my_thread_var*,THR_KEY_mysys);
   return tmp && tmp->init ? &tmp->dbug : 0;
 }
 #endif /* DBUG_OFF */
[24 Jul 2012 17:12] Sveta Smirnova
Thank you for the report.

I can not repeat described behavior. Which exact package and Linux distribution do you use?
[25 Jul 2012 0:10] Stewart Smith
I've only directly observed it on Ubuntu 12.04 while LD_PRELOADing libeatmydata. So, this isn't typical. Through code analysis I determined that this is possible, and if it's not fixed it will likely just bite somebody further down the track.

I could also see (rather odd) valgrind warnings in this setup.
[25 Jul 2012 18:51] Sveta Smirnova
Thank you for the feedback.

Verified as described.

Repeatable only on Ubuntu with binaries built as:

cmake . -DCMAKE_INSTALL_PREFIX=$HOME/build/mysql-5.6.5 -DWITH_DEBUG=1
[31 Jul 2012 13:45] Erlend Dahl
This is a duplicate of an internally reported bug which has been fixed on the latest mysql-trunk.
[5 Dec 2013 8:32] Laurynas Biveinis
See bug 69653.