Bug #25526 | mysqld got signal 11 and crashes (0x746f6f72) | ||
---|---|---|---|
Submitted: | 10 Jan 2007 16:50 | Modified: | 20 Mar 2007 16:03 |
Reporter: | Matthias Albert | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
Version: | 5.0.27 | OS: | Linux (SLES 9 SP3 (x86_64)) |
Assigned to: | Assigned Account | CPU Architecture: | Any |
[10 Jan 2007 16:50]
Matthias Albert
[10 Jan 2007 16:52]
Matthias Albert
my.cnf
Attachment: my.cnf (application/octet-stream, text), 2.12 KiB.
[10 Jan 2007 17:57]
Heikki Tuuri
Matthias, I am afraid that the x86_64 binary does not print a sensible stack trace. Can you compile a debug version of mysqld yourself, and run it inside gdb? When it crashes, do:
(gdb) bt full
That usually tells where the bug is.
Regards, Heikki
[16 Jan 2007 9:45]
Matthias Albert
Hello Heikki, thanks for your answer and sorry for my delay (I was on holiday for 3 days). I've compiled a debug version of mysql from this source rpm: MySQL-5.0.27-0.glibc23.src.rpm, with debugging enabled (--with-debug). After installing these new packages, the problem seems to be gone :-). Probably it is a build problem of mysql. I built the new package on a machine which is similar to my production mysql machines. So perhaps it is a SLES 9 Service Pack 1-2-3 problem, or something else that isn't compatible with our hardware/OS. However, at the moment, none of my machines (>20) has crashed over the last 4 days. And now? Close this bug? Debug the build process? Cheers, Matthias
[16 Jan 2007 13:58]
Heikki Tuuri
Matthias, the bug might be something that is masked in the debug build. If you build a normal production binary, does it crash? Regards, Heikki
[17 Jan 2007 8:54]
Matthias Albert
Hello Heikki, last night I had two more mysql crashes WITH the debug version. But the output of "resolve_stack_dump" isn't really helpful.
Host A:
resolve_stack_dump -s /usr/lib64/mysql/mysqld.sym -n /tmp/my.stack
0x746f6f72 _end + 1941324714
Host B:
resolve_stack_dump -s /usr/lib64/mysql/mysqld.sym -n /tmp/my.stack
0x76f6f72 _end + 112608170
I will attach the mysql-error.log file. Any idea? Cheers, Matthias
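Editorial note: the two resolve_stack_dump lines carry a little more information than they seem to. The sketch below only does arithmetic on the numbers quoted in the post. It assumes a little-endian host (true for x86_64) and that the tool reports an address as "nearest symbol + offset". Reading the address bytes as ASCII is an observation, not a diagnosis made anywhere in this thread.

```python
import struct

# (crash address, offset from _end) as printed by resolve_stack_dump above.
REPORTS = {
    "Host A": (0x746F6F72, 1941324714),
    "Host B": (0x076F6F72, 112608170),
}


def implied_end(address, offset):
    """Address of the _end symbol implied by an 'address = _end + offset' line."""
    return address - offset


def as_little_endian_bytes(address):
    """The four bytes of a 32-bit value in the order they sit in x86 memory."""
    return struct.pack("<I", address)


for host, (address, offset) in REPORTS.items():
    print(host, hex(implied_end(address, offset)), as_little_endian_bytes(address))
```

Both hosts imply the same _end address, so both crash addresses lie far beyond the end of the mysqld image, and the Host A value reads as the ASCII text "root" in memory order. That would fit a pointer being overwritten with string data, but it is only a hint.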
[17 Jan 2007 9:46]
Matthias Albert
Today I will start 2-3 mysql daemons inside gdb so we can get a qualified dump if it crashes the next time.
[17 Jan 2007 13:28]
Heikki Tuuri
Matthias, yes, when it crashes inside gdb, do:
(gdb) bt full
It may help to print also other 'interesting' thread stack traces. With:
(gdb) info threads
you see the threads. Then:
(gdb) thread 15
(gdb) bt full
prints the stack trace of thread 15. You can also print the values of some 'interesting' variables in the crash. With:
(gdb) frame 10
you can go to stack frame 10 of the current thread, and print some values:
(gdb) print *thd
etc. I cannot say beforehand what values are 'interesting' :).
Regards, Heikki
[18 Jan 2007 9:24]
Matthias Albert
Hello Heikki, I have good news :). Now I have some backtraces from my mysql servers. I hope you can use them to identify the problem. If you need some more information, let me know. Cheers, Matthias
---snip---
Host A:
Program received signal SIGUSR1, User defined signal 1.
[Switching to Thread 1154328928 (LWP 15674)]
0x00002adafc97638f in __read_nocancel () from /lib64/tls/libpthread.so.0
(gdb) bt full
#0 0x00002adafc97638f in __read_nocancel () from /lib64/tls/libpthread.so.0
No symbol table info available.
#1 0x00000000007a82e0 in vio_read ()
No symbol table info available.
#2 0x00000000005421e7 in net_realloc ()
No symbol table info available.
#3 0x0000000000542847 in my_net_read ()
No symbol table info available.
#4 0x000000000056d3a5 in do_command ()
No symbol table info available.
#5 0x000000000056de54 in handle_one_connection ()
No symbol table info available.
#6 0x00002adafc971b8f in start_thread () from /lib64/tls/libpthread.so.0
No symbol table info available.
#7 0x00002adafcfe44b3 in clone () from /lib64/tls/libc.so.6
No symbol table info available.
#8 0x0000000000000000 in ?? ()
No symbol table info available.
---snap---
---snip---
Host B:
Program received signal SIGUSR1, User defined signal 1.
[Switching to Thread 1147705696 (LWP 10850)]
0x00002b76f33a238f in __read_nocancel () from /lib64/tls/libpthread.so.0
(gdb) bt full
#0 0x00002b76f33a238f in __read_nocancel () from /lib64/tls/libpthread.so.0
No symbol table info available.
#1 0x00000000007a82e0 in vio_read ()
No symbol table info available.
#2 0x00000000005421e7 in net_realloc ()
No symbol table info available.
#3 0x0000000000542847 in my_net_read ()
No symbol table info available.
#4 0x000000000056d3a5 in do_command ()
No symbol table info available.
#5 0x000000000056de54 in handle_one_connection ()
No symbol table info available.
#6 0x00002b76f339db8f in start_thread () from /lib64/tls/libpthread.so.0
No symbol table info available.
#7 0x00002b76f3a104b3 in clone () from /lib64/tls/libc.so.6
No symbol table info available.
#8 0x0000000000000000 in ?? ()
No symbol table info available.
---snap---
Host C:
---snip---
Program received signal SIGUSR1, User defined signal 1.
[Switching to Thread 1154328928 (LWP 15674)]
0x00002adafc97638f in __read_nocancel () from /lib64/tls/libpthread.so.0
(gdb) bt full
#0 0x00002adafc97638f in __read_nocancel () from /lib64/tls/libpthread.so.0
No symbol table info available.
#1 0x00000000007a82e0 in vio_read ()
No symbol table info available.
#2 0x00000000005421e7 in net_realloc ()
No symbol table info available.
#3 0x0000000000542847 in my_net_read ()
No symbol table info available.
#4 0x000000000056d3a5 in do_command ()
No symbol table info available.
#5 0x000000000056de54 in handle_one_connection ()
No symbol table info available.
#6 0x00002adafc971b8f in start_thread () from /lib64/tls/libpthread.so.0
No symbol table info available.
#7 0x00002adafcfe44b3 in clone () from /lib64/tls/libc.so.6
No symbol table info available.
#8 0x0000000000000000 in ?? ()
No symbol table info available.
(gdb) quit
The program is running. Quit anyway (and detach it)? (y or n) n
Not confirmed.
---snap---
[18 Jan 2007 15:37]
Heikki Tuuri
Matthias, sorry, I should have given you this link: http://dev.mysql.com/doc/refman/5.0/en/using-gdb-on-mysqld.html
Those SIGUSR1 are completely normal events.
"If you are using gdb 4.17.x or above on Linux, you should install a .gdb file, with the following information, in your current directory:
set print sevenbit off
handle SIGUSR1 nostop noprint
handle SIGUSR2 nostop noprint
handle SIGWAITING nostop noprint
handle SIGLWP nostop noprint
handle SIGPIPE nostop
handle SIGALRM nostop
handle SIGHUP nostop
handle SIGTERM nostop noprint"
You need the .gdb file to ignore those unnecessary stops. You should also compile mysqld with:
CFLAGS="-O3 -g" CXXFLAGS="-O3 -g" ./configure
so that you get the debug info into the mysqld binary, and gdb can show a stack trace with rich information.
Regards, Heikki
[30 Jan 2007 12:11]
Matthias Albert
Hello Heikki, today we caught mysql in the error condition with gdb attached. Following your guidance to debug the process, a few things are unclear to me:
1. Calling "info threads" lists hundreds of threads. How can I identify the thread that relates to our problem?
2. Calling "where" and "bt full" gives the same stack as described in Matthias's bug entry of 18 Jan 10:24.
3. The mysqld binary doesn't contain the symbol table. Instead, the symbols are listed in /usr/lib64/mysql/mysqld.sym. How can I load the symbol table into gdb? "symbol-file" works only on binaries.
Many thanks in advance. Stefan
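Editorial note on point 3: when only a separate symbol listing is available, raw frame addresses can still be mapped to names by hand. This is roughly what resolve_stack_dump does. The sketch below assumes mysqld.sym is plain text in `nm --numeric-sort` form (address, type, name). The sample table is made up for illustration: the `_end` address is the one implied by the resolve_stack_dump output earlier in this report, and the two function addresses are hypothetical.

```python
import bisect

# Hypothetical excerpt in "nm --numeric-sort" form; not the real mysqld.sym.
SAMPLE_SYM = """\
000000000056d000 T do_command
000000000056dd00 T handle_one_connection
0000000000b92bc8 A _end
"""


def parse_symbols(sym_text):
    """Return a sorted list of (address, name) from nm-style text."""
    symbols = []
    for line in sym_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # undefined symbols and blank lines carry no address
        address, _kind, name = parts
        symbols.append((int(address, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Find the closest symbol at or below address; return (name, offset)."""
    addresses = [a for a, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None  # address lies below the first known symbol
    base, name = symbols[index]
    return name, address - base


symbols = parse_symbols(SAMPLE_SYM)
print(resolve(symbols, 0x56D3A5))    # a frame address from the backtraces
print(resolve(symbols, 0x746F6F72))  # the crash address from the bug title
```

An address that resolves to `_end` plus a huge offset is not inside any function of the binary, which is why the earlier resolve_stack_dump output looked unhelpful.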
[13 Feb 2007 17:14]
Heikki Tuuri
Matthias, I am sorry for the delay in replying. When you do the info threads command, you will find various threads in various functions. You may find that the 'interesting' ones are stuck in a different function. Usually, the 'interesting' threads have a deep call stack: 15 frames or more. The brute force approach is to print all thread stacks. Regards, Heikki
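Editorial note: with hundreds of threads, the "deep call stack" rule of thumb can be applied mechanically. A minimal sketch, assuming the stacks were dumped with gdb's `thread apply all bt full` and saved as text, and that each thread section begins with a line of the form "Thread N (...)" followed by frame lines starting with "#". The helper names and the sample dump are hypothetical.

```python
import re

THREAD_RE = re.compile(r"^Thread (\d+) ")  # start of one thread's section
FRAME_RE = re.compile(r"^#\d+\s")          # one stack frame line


def frame_counts(bt_text):
    """Map gdb thread number -> number of stack frames in its backtrace."""
    counts = {}
    current = None
    for line in bt_text.splitlines():
        match = THREAD_RE.match(line)
        if match:
            current = int(match.group(1))
            counts[current] = 0
        elif current is not None and FRAME_RE.match(line):
            counts[current] += 1
    return counts


def deep_threads(bt_text, min_frames=15):
    """Thread numbers whose stack is at least min_frames deep."""
    return sorted(t for t, n in frame_counts(bt_text).items() if n >= min_frames)


# Synthetic dump: thread 2 has 16 frames, thread 1 has 9 (an idle connection).
sample = (
    "Thread 2 (Thread 1 (LWP 2)):\n"
    + "".join(f"#{i}  0x0 in f{i} ()\n" for i in range(16))
    + "\nThread 1 (Thread 3 (LWP 4)):\n"
    + "".join(f"#{i}  0x0 in g{i} ()\n" for i in range(9))
)
print(deep_threads(sample))
```

The 15-frame threshold is only the rule of thumb from the post above; idle connection threads such as the 9-frame ones posted earlier fall below it.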
[14 Mar 2007 0:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[15 Mar 2007 19:23]
Matthias Albert
GDB backtrace full
Attachment: gdb_mysql_bt_full.txt (text/plain), 55.89 KiB.
[15 Mar 2007 19:38]
Matthias Albert
Hello Heikki, after a long time we finally had a crashed mysql again. I have attached the backtrace to the bug. The mysql process is still in the error condition with gdb attached. If required, you can get more traces. Just send me some instructions on how to use gdb to get the information you need. Many thanks in advance, Stefan
[20 Mar 2007 15:26]
MySQL Verification Team
Heikki, I wonder if this is related to the new bug #27294. Note the crash point is the same, and both cases have srv_locks_unsafe_for_binlog=1.
[20 Mar 2007 15:41]
Heikki Tuuri
Shane, indeed, this looks like the same crash. The bug is that there is no guarantee that prebuilt->trx is a sensible pointer in this function!
prebuilt->trx->isolation_level != TRX_ISO_SERIALIZABLE &&
Regards, Heikki
[20 Mar 2007 16:03]
Heikki Tuuri
Duplicate of http://bugs.mysql.com/bug.php?id=27294