Description:
F you are really lucky (or unlucky), the order in which things are
initialized on server startup goes a bit diferent than you'd expect. I
managed to get this to reproduce by always LD_PRELOADING
libeatmydata. This may also occur if LD_PRELOADING other libraries, but
I haven't tested.
Namely, we get the Sid_map::Sid_map constructor firing before
my_thread_global_init() is run. What happens here is that
Sid_map::Sid_map calls a DBUG function, which will try to get a thread
specific variable for THR_KEY_mysys, but the thread key hasn't yet been
initialized, so instead we get a wonderful pile of garbage.
The end result, a crash:
#0 0x0000000000b8e825 in _db_enter_ (
_func_=<error reading variable: Cannot access memory at address 0xc>,
_file_=0xcbe490 "dno, const rpl_sid *)", _line_=27,
_stack_frame_=0x7fff2d95d1f0)
at dbug/dbug.c:1112
#1 0x000000000084741b in Sid_map::Sid_map (this=0x140e2e0,
_sid_lock=<optimized out>)
at sql/rpl_gtid_sid_map.cc:27
#2 0x0000000000847252 in __static_initialization_and_destruction_0 (
__priority=65535, __initialize_p=1)
at sql/rpl_gtid_misc.cc:34
#3 _GLOBAL__sub_I_rpl_gtid_misc.cc(void) ()
at sql/rpl_gtid_misc.cc:174
#4 0x0000000000bf481d in __do_global_ctors_aux ()
Running under Valgrind gives you instead:
==3390== Invalid read of size 1
==3390== at 0x636D1D2: internal_fnmatch (fnmatch_loop.c:69)
==3390== by 0x636E9EC: fnmatch@@GLIBC_2.2.5 (fnmatch.c:460)
==3390== by 0xB8D804: InList (dbug.c:1501)
followed by:
==3390== General Protection Fault
==3390== at 0x636D1D2: internal_fnmatch (fnmatch_loop.c:69)
==3390== by 0x636E9EC: fnmatch@@GLIBC_2.2.5 (fnmatch.c:460)
==3390== by 0xB8D804: InList (dbug.c:1501)
==3390== by 0xB8D871: DoTrace (dbug.c:1670)
==3390== by 0xB8E84C: _db_enter_ (dbug.c:1115)
==3390== by 0x84741A: Sid_map::Sid_map(Checkable_rwlock*) (rpl_gtid_sid_map.cc:27)
==3390== by 0x847251: _GLOBAL__sub_I_rpl_gtid_misc.cc (rpl_gtid_misc.cc:34)
==3390== by 0xBF47CC: __libc_csu_init (in /home/stewart/percona-server/5.6/Percona-Server/build/sql/mysqld-debug)
==3390== by 0x62C96FF: (below main) (libc-start.c:185)
Clearly, this isn't ideal. basically, it prevents even mtr from starting
the debug server to find what version it is.
The following patch solves the problem for me, basicall, if the thread
specific value hasn't been inited, let's do the right thing and return
NULL. Everything seems to work okay and I believe this to be correct.
Another option could be to not have the global_sid_map in
rpl_gtid_misc.cc be inited as part of setup, or have it not use
DBUG... but this is possibly undesirable and could hide the problem in
the future.
How to repeat:
eatmydata ./mysql-test-run.pl --debug-server
(will crash on getting version number. This is basically it calling ./mysqld-debug --help --verbose)
Suggested fix:
I make no (C) claim over the following
--- a/mysys/my_thr_init.c 2011-08-10 06:26:45 +0000
+++ b/mysys/my_thr_init.c 2012-07-19 04:52:08 +0000
@@ -458,8 +458,12 @@ const char *my_thread_name(void)
extern void **my_thread_var_dbug()
{
- struct st_my_thread_var *tmp=
- my_pthread_getspecific(struct st_my_thread_var*,THR_KEY_mysys);
+ struct st_my_thread_var *tmp;
+
+ if (!my_thread_global_init_done)
+ return NULL;
+
+ tmp= my_pthread_getspecific(struct st_my_thread_var*,THR_KEY_mysys);
return tmp && tmp->init ? &tmp->dbug : 0;
}
#endif /* DBUG_OFF */