Bug #24507 | rpl_log.test crash slave | ||
---|---|---|---|
Submitted: | 22 Nov 2006 14:17 | Modified: | 19 Jan 2007 3:39 |
Reporter: | Lars Thalmann | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
Version: | 5.0 | OS: | |
Assigned to: | Rafal Somla | CPU Architecture: | Any |
[22 Nov 2006 14:17]
Lars Thalmann
[23 Nov 2006 18:46]
Kristian Nielsen
Here is the stacktrace from the core: (gdb) bt #0 0x005227a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 #1 0x007898bb in pthread_kill () from /lib/tls/libpthread.so.0 #2 0x084a1775 in write_core (sig=11) at stacktrace.c:245 #3 0x0826c995 in handle_segfault (sig=11) at mysqld.cc:2115 #4 <signal handler called> #5 0x00000000 in ?? () #6 0x0078d2aa in _Unwind_ForcedUnwind () from /lib/tls/libpthread.so.0 #7 0x0078af81 in __pthread_unwind () from /lib/tls/libpthread.so.0 #8 0x00786f00 in pthread_exit () from /lib/tls/libpthread.so.0 #9 0x084885d7 in handle_slave_io (arg=0x9f6c858) at slave.cc:3769 #10 0x00786341 in start_thread () from /lib/tls/libpthread.so.0 #11 0x006066fe in clone () from /lib/tls/libc.so.6 I have a plausible theory about what is going on. The crash is in this piece of code from glibc (as found by Google code search), where pthread_cancel_init() is inlined in _Unwind_Reason_Code(): _Unwind_Reason_Code _Unwind_ForcedUnwind (struct _Unwind_Exception *exc, _Unwind_Stop_Fn stop, void *stop_argument) { if (__builtin_expect (libgcc_s_forcedunwind == NULL, 0)) pthread_cancel_init (); return libgcc_s_forcedunwind (exc, stop, stop_argument); } void pthread_cancel_init (void) { void *resume, *personality, *forcedunwind, *getcfa; void *handle; if (__builtin_expect (libgcc_s_getcfa != NULL, 1)) return; handle = __libc_dlopen ("libgcc_s.so.1"); if (handle == NULL || (resume = __libc_dlsym (handle, "_Unwind_Resume")) == NULL || (personality = __libc_dlsym (handle, "__gcc_personality_v0")) == NULL || (forcedunwind = __libc_dlsym (handle, "_Unwind_ForcedUnwind")) == NULL || (getcfa = __libc_dlsym (handle, "_Unwind_GetCFA")) == NULL #ifdef ARCH_CANCEL_INIT || ARCH_CANCEL_INIT (handle) #endif ) __libc_fatal ("libgcc_s.so.1 must be installed for pthread_cancel to work\n"); libgcc_s_resume = resume; libgcc_s_personality = personality; libgcc_s_forcedunwind = forcedunwind; libgcc_s_getcfa = getcfa; } Note that there is actually a race in this code: - Thread A finds libgcc_s_forcedunwind==NULL and enters pthread_cancel_init(). A context switch then occurs before thread A has the time to check the libgcc_s_getcfa variable. - Thread B finds libgcc_s_forcedunwind==NULL and enters pthread_cancel_init(). It finds libgcc_s_getcfa==NULL, and goes to set libgcc_s_getcfa = getcfa. - Thread A is later re-scheduled, and now finds libgcc_s_getcfa!=NULL so returns immediately from pthread_cancel_init(). It then proceeds to execute the call (*libgcc_s_forcedunwind)() using the _old_ previously loaded value in %edx, which is still NULL. Hence a segfault. So the problem is that the libgcc_s_getcfa variable is checked and modified without any kind of synchronization. I actually found some evidence in the core file that this race is exactly what happened. Here are the registers at the point of crash: (gdb) info reg eax 0x78ade0 7908832 ecx 0xb6fa6dd0 -1225101872 edx 0x0 0 ebx 0x78fff4 7929844 esp 0xb6fa6328 0xb6fa6328 ebp 0xb6fa6348 0xb6fa6348 esi 0xb6fa6480 -1225104256 edi 0xb6fa6dd0 -1225101872 eip 0x78d2aa 0x78d2aa And here is the disassembly, with some comments. 0x0078d270 <_Unwind_ForcedUnwind+0>: push %ebp 0x0078d271 <_Unwind_ForcedUnwind+1>: mov %esp,%ebp 0x0078d273 <_Unwind_ForcedUnwind+3>: sub $0x20,%esp 0x0078d276 <_Unwind_ForcedUnwind+6>: mov %ebx,0xfffffff4(%ebp) 0x0078d279 <_Unwind_ForcedUnwind+9>: call 0x7852da <__i686.get_pc_thunk.bx> 0x0078d27e <_Unwind_ForcedUnwind+14>: add $0x2d76,%ebx 0x0078d284 <_Unwind_ForcedUnwind+20>: mov %esi,0xfffffff8(%ebp) 0x0078d287 <_Unwind_ForcedUnwind+23>: mov 0x21ac(%ebx),%edx libgcc_s_forcedunwind is now loaded in %edx. 0x0078d28d <_Unwind_ForcedUnwind+29>: mov %edi,0xfffffffc(%ebp) 0x0078d290 <_Unwind_ForcedUnwind+32>: test %edx,%edx 0x0078d292 <_Unwind_ForcedUnwind+34>: je 0x78d2b7 <_Unwind_ForcedUnwind+71> From the register dump, %edx is 0, so this jump is taken. 0x0078d294 <_Unwind_ForcedUnwind+36>: mov 0x10(%ebp),%esi 0x0078d297 <_Unwind_ForcedUnwind+39>: mov 0x8(%ebp),%edi 0x0078d29a <_Unwind_ForcedUnwind+42>: mov 0xc(%ebp),%eax 0x0078d29d <_Unwind_ForcedUnwind+45>: mov %esi,0x8(%esp) 0x0078d2a1 <_Unwind_ForcedUnwind+49>: mov %edi,(%esp) 0x0078d2a4 <_Unwind_ForcedUnwind+52>: mov %eax,0x4(%esp) 0x0078d2a8 <_Unwind_ForcedUnwind+56>: call *%edx This is where it crashes since %edx (libgcc_s_forcedunwind) is NULL. 0x0078d2aa <_Unwind_ForcedUnwind+58>: mov 0xfffffff4(%ebp),%ebx 0x0078d2ad <_Unwind_ForcedUnwind+61>: mov 0xfffffff8(%ebp),%esi 0x0078d2b0 <_Unwind_ForcedUnwind+64>: mov 0xfffffffc(%ebp),%edi 0x0078d2b3 <_Unwind_ForcedUnwind+67>: mov %ebp,%esp 0x0078d2b5 <_Unwind_ForcedUnwind+69>: pop %ebp 0x0078d2b6 <_Unwind_ForcedUnwind+70>: ret This is where the code jump to from above when it finds libgcc_s_forcedunwind to be NULL (it is the inlined pthread_cancel_init() code). 0x0078d2b7 <_Unwind_ForcedUnwind+71>: mov 0x21b0(%ebx),%eax 0x0078d2bd <_Unwind_ForcedUnwind+77>: test %eax,%eax 0x0078d2bf <_Unwind_ForcedUnwind+79>: jne 0x78d294 <_Unwind_ForcedUnwind+36> And here it returns immediately, since it loads libgcc_s_getcfa into %eax, and finds it non-NULL (the register dump shows %eax is 0x78ade0) -> crash. Some more dumps to show this: (gdb) x $ebx+0x21ac 0x7921a0 <libgcc_s_forcedunwind>: 0x009340e4 (gdb) x $ebx+0x21b0 0x7921a4 <libgcc_s_getcfa>: 0x00932b98 (gdb) x 0x009340e4 0x9340e4 <_Unwind_ForcedUnwind>: 0x57e58955 (gdb) x 0x00932b98 0x932b98 <_Unwind_GetCFA>: 0x8be58955 So the variable libgcc_s_forcedunwind is actually non-NULL at the time of crash (set by the other thread in the race). But the compiled code naturally uses the previously loaded value in %edx, having no reason to believe that it might have changed since it was last loaded. Hence the crash. <rest of disassembly omitted> As to how to fix it, I'm not sure ... it looks very much like a bug in glibc. We cannot really be expected to protect pthread_exit() with a mutex :-). I think we should file a glibc bug report actually... If we were to try to fix it, it is BTW going to be real fun trying to reproduce that race, especially given that it can only occur on the _first_ occurence of pthread_exit() in the process ...
[27 Nov 2006 18:11]
Rafal Somla
Apparently, the problem is known to glibc developers and already fixed. See: <http://sourceware.org/bugzilla/show_bug.cgi?id=2644>. We should upgrade to more recent version of glibc :)
[28 Nov 2006 8:37]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/15923 ChangeSet@1.2310, 2006-11-28 09:26:26+01:00, rafal@quant.(none) +1 -0 BUG#24507 (rpl_log.test crash slave): This patch introduces pthread_join() call when terminating slave I/O and SQL threads. This will make the main thread to wait until these threads are really finished avoiding possible race conditions in thread termination code. Using pthread_join() means that the threads should not be deteached. This change could also help with BUG#24387.
[28 Nov 2006 23:14]
Michael Widenius
I don't like the proposed solutions becasue of the following reason: - It changes the current code to use a pthread_join(); A function we don't use anywhere else and may have unknown problems, - We create and destroy threads in a lot of other places than in replication and this patch doesn't solve any of these problems. (The reason this shows up in replication tests is mainly becasue we don't have many test, except replications tests, that creates a lot of new connections/threads) - This is a glibc bug and we shouldn't have to change logic in the program becasue of a system library bug (that is already fixed) - Slave threads are destroyed in many other places than in terminate_slave_threads() (for example when they loose connection to master) and if we don't use detached threads we will use up resources for pthread_join() that will never be done. An alternative solution for the problem that is less intrusive: - On linux, lets on startup create a dummy thread that we then terminate with pthread_exit(0) at once. This will initialize the NTPL thread variables that causes the race condition and we will not get a crash anymore. After 1/2 a year, when all glibc versions are updated, we can delete this code. For the linux build machines, I recommend we upgrade to a newer glibc as soon there is a glibc available with the patch. For all staticly compile mysqld versions, this problem will then be solved.
[29 Nov 2006 17:12]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/16124 ChangeSet@1.2323, 2006-11-29 18:11:24+01:00, rafal@quant.(none) +2 -0 BUG#24507 (rpl_log.test crash slave): The problem was located to lie inside current NTPL pthread_exit() implementation. Race conditions in this code can lead to segmentation fault. Hovewer, this can happen only in a rece between first thread calling pthread_exit() and other threads. Workaround implemented in this patch spawns a dummy thread, which exits immediately, during thread lib initialization. This will exclude segment violations when further threads exit.
[8 Dec 2006 11:05]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/16642 ChangeSet@1.2323, 2006-12-08 11:41:12+01:00, rafal@quant.(none) +2 -0 BUG#24507 (rpl_log.test crash slave): The problem was located to lie inside current NPTL pthread_exit() implementation. Race conditions in this code can lead to segmentation fault. Hovewer, this can happen only in a race between first thread calling pthread_exit() and other threads. Workaround implemented in this patch spawns a dummy thread, which exits immediately, during thread lib initialization. This will exclude segment violations when further threads exit.
[18 Jan 2007 15:31]
Lars Thalmann
Pushed into 5.0.36, 5.1.15
[19 Jan 2007 3:39]
Paul DuBois
Noted in 5.0.36, 5.1.15 changelogs. A workaround was implemented to avoid a race condition in the NPTL pthread_exit() implementation.
[8 Sep 2009 15:36]
Kir Kolyshkin
Just to let people know and for better cross-referencing — the patch from this bug causes a regression which is a subject of bug #42850