Bug #48667 MySQL Proxy segfaults on startup
Submitted: 10 Nov 2009 12:59    Modified: 10 Nov 2009 20:33
Reporter: Hrunting Johnson
Status: Not a Bug
Category: MySQL Proxy: Core    Severity: S1 (Critical)
Version: 0.7.2    OS: Linux (Fedora)
Assigned to:    CPU Architecture: Any
Tags: crash segfault

[10 Nov 2009 12:59] Hrunting Johnson
Description:
MySQL Proxy, statically linked to glib 2.16.6, crashes on startup because a NULL mutex is passed to pthread_mutex_trylock().  This mutex is the slab allocator mutex, which is never initialized because gthread functions are called before the gthread initialization function.  As of a minor release in glib 2.6 (not sure which one), the documented requirement that the gthread initialization function be called before any other gthread function is enforced by the code.

If you use an environment variable (sorry, my documentation isn't in front of me) to set the gslice allocator to always malloc, then the problem doesn't appear.  I suspect that in shared library situations, this may be the norm.

How to repeat:
Compile mysql proxy, statically linking against glib 2.16.6.
Run mysql proxy

Suggested fix:
Call the gthread initialization function before calling any other gthread functions, as documented in glib-2.0 documentation.
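
The ordering requirement can be illustrated with a small stand-alone model. This is only a sketch and does not use glib: model_thread_init(), model_alloc(), and alloc_lock are hypothetical names standing in for the gthread initialization function, the slice allocator, and its mutex. Unlike the real code in the crash, the model checks for the NULL mutex and returns NULL instead of dereferencing it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the allocator mutex: NULL until init runs. */
static pthread_mutex_t alloc_lock_storage = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t *alloc_lock = NULL;

/* Plays the role of the thread initialization call: it must run
 * before the first call to model_alloc(). */
void model_thread_init(void)
{
    alloc_lock = &alloc_lock_storage;
}

/* Plays the role of the allocator. The real code would pass the NULL
 * mutex straight to the pthread layer and crash; this model reports
 * the ordering mistake by returning NULL instead. */
void *model_alloc(size_t size)
{
    if (alloc_lock == NULL) {
        return NULL; /* init was skipped */
    }

    int rc = pthread_mutex_lock(alloc_lock);
    assert(rc == 0);
    (void)rc;

    void *mem = calloc(1, size); /* zeroed block, released with free() */

    pthread_mutex_unlock(alloc_lock);
    return mem;
}
```

Calling model_alloc(24) before model_thread_init() returns NULL in this model; calling it afterwards returns a zeroed block that the caller releases with free().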
[10 Nov 2009 13:31] Kay Roepke
Did you really statically link to glib?
Did configure properly detect libgthread.so?

The code should contain the correct g_thread_init() call before doing anything multithreaded, but IIRC that call is only enabled if you build with libgthread (which is required for the upcoming 0.8 release anyway, because that release uses threads internally).
[10 Nov 2009 14:57] Hrunting Johnson
Yes, it is really statically linked to glib.  I'll double-check that mysql-proxy detected gthread capabilities at configure time.

For information purposes, here is the backtrace:

#0  __pthread_mutex_trylock (mutex=0x0) at pthread_mutex_trylock.c:34
#1  0x000000000040f799 in g_mutex_trylock_posix_impl (mutex=0x0)
    at gthread-posix.c:187
#2  0x00000031bf42744e in IA__g_slice_alloc (mem_size=24) at gslice.c:395
#3  0x00000031bf427596 in IA__g_slice_alloc0 (mem_size=0) at gslice.c:833
#4  0x00000031bf41afa3 in network_queue_new () at network-socket.c:102
#5  0x00000031bf41afc3 in network_socket_new () at network-socket.c:222
#6  0x00002aaaaaabbbf3 in network_mysqld_admin_plugin_apply_config (
    chas=0x87af00, config=0x885a90) at admin-plugin.c:554
#7  0x00000031bf810c41 in chassis_mainloop (_chas=<value optimized out>)
    at chassis-mainloop.c:235
#8  0x000000000040d8cb in main_cmdline (argc=1, argv=0x7ffff1695e38)
    at chassis.c:993
#9  0x0000003f8521d8a4 in __libc_start_main (main=0x40daa0 <main>, argc=10,
    ubp_av=0x7ffff1695e38, init=<value optimized out>,
    fini=<value optimized out>, rtld_fini=<value optimized out>,
    stack_end=0x7ffff1695e28) at libc-start.c:231
#10 0x000000000040c6e9 in _start ()
[10 Nov 2009 15:54] Hrunting Johnson
I confirmed that during mysql proxy configure, gthread is detected, and HAVE_GTHREAD is defined.

checking for GLIB... yes
checking for GMODULE... yes
checking for GTHREAD... yes

Is there anything else I should check?
[10 Nov 2009 16:07] Kay Roepke
Strange, in mysql-proxy-0.7.2.tar.gz from Launchpad, I'm seeing

#ifdef HAVE_GTHREAD	
	g_thread_init(NULL);
#endif

in src/chassis.c:460. It is one of the first things main does.

We usually build it with dynamic linking, but offhand I can't see how this would affect this part.
Is it a strict requirement that you build it statically linked to glib?
If not, I would try again with dynamic linking just to see if it makes a difference.
[10 Nov 2009 20:33] Hrunting Johnson
I'm embarrassed to say that it's the way in which the static linking was done.  The application was statically linked to glib, but it was still compiling shared mysql-proxy libraries that had their own static links to glib.

That was the cause.  In effect, g_thread_init() was never called for the copies of glib inside those shared libraries, since each copy has its own state.  My fault.  There really isn't a bug here except in how glib was being statically linked.
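
The effect of the duplicated static copies can be modelled in a few lines. This is a hypothetical sketch, not glib or mysql-proxy code: struct lib_copy stands for the private state that each statically linked copy of a library carries, and the two instances stand for the copy in the executable and the copy inside a shared plugin library. In a real build the linker creates these copies; here they are explicit so the behaviour is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state owned by one statically linked copy. */
struct lib_copy {
    bool threads_ready; /* set only by this copy's own init function */
};

/* One copy ends up in the executable, another in the plugin library. */
static struct lib_copy copy_in_executable = { false };
static struct lib_copy copy_in_plugin = { false };

/* Plays the role of the thread initialization call for a single copy. */
void lib_thread_init(struct lib_copy *self)
{
    assert(self != NULL);
    self->threads_ready = true;
}

/* The executable's startup code only sees the copy it was linked with. */
void executable_startup(void)
{
    lib_thread_init(&copy_in_executable);
}

/* Allocation is only safe in a copy whose init function has run. */
bool executable_can_allocate(void)
{
    return copy_in_executable.threads_ready;
}

bool plugin_can_allocate(void)
{
    return copy_in_plugin.threads_ready;
}
```

After executable_startup(), executable_can_allocate() is true while plugin_can_allocate() is still false, which mirrors a plugin allocating through a copy of the library that was never initialized. With dynamic linking there is a single shared copy, so one initialization call covers every caller.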
[11 Nov 2009 12:41] Kay Roepke
Ok, good to know.

Also, do not statically link to Lua, because it has global symbols and you will get weird crashes along the way.
The best option is to always link dynamically; there's hardly any overhead anyway.  That's the reason for the wrapper script we ship, which sets LD_LIBRARY_PATH, btw: we normally ship all third-party libraries to avoid dependency hell on proxy-based products.

Thank you for following up on this.