| Bug #50713 | mysqld got signal 11 , segfault | ||
|---|---|---|---|
| Submitted: | 29 Jan 2010 6:42 | Modified: | 25 Aug 2010 16:56 |
| Reporter: | Alec Matusis | Email Updates: | |
| Status: | No Feedback | Impact on me: | |
| Category: | MySQL Server: General | Severity: | S1 (Critical) |
| Version: | 5.0.87, 5.0.90 | OS: | Linux (2.6.31 Ubuntu 9.10-server) |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | segfault, segmentation fault | ||
[30 Jan 2010 8:13]
Alec Matusis
I was mistaken about rolling back to 2.6.24 fixing this. This occurs even on 2.6.24 (Ubuntu 8.04). I tried 5.0.90, the bug still occurs. I copied this data from another slave, still the same bug.
[3 Feb 2010 8:06]
Sveta Smirnova
Thank you for the report. Do you have option --innodb-file-per-table enabled? Please provide your configuration file?
[3 Feb 2010 19:22]
Alec Matusis
--innodb-file-per-table is enabled /etc/my.cnf : [client] port = 3306 socket = /tmp/mysql.sock [mysqld] user = mysql port = 3306 socket = /tmp/mysql.sock skip-locking innodb_file_per_table key_buffer = 6400M table_cache = 2048 sort_buffer_size = 8M myisam_sort_buffer_size = 8M max_binlog_cache_size = 4G max_join_size = 4294967295 max_seeks_for_key = 4294967295 max_write_lock_count = 4294967295 myisam_max_sort_file_size = 2G datadir = /data6 log-error = /log6/mysqld.log query_cache_type = 1 query_cache_size = 64M log-slow-queries long_query_time = 1 thread_cache_size = 32 max_connections = 1024 innodb_flush_log_at_trx_commit = 2 innodb_flush_method = O_DIRECT innodb_max_dirty_pages_pct = 25 innodb_support_xa = 0 innodb_thread_concurrency = 32 innodb_commit_concurrency = 64 innodb_concurrency_tickets = 8 log-bin = /log6/db6-log-bin server-id = 10 relay-log = /log6/db6-relay-bin skip-slave-start log-slave-updates replicate-do-db = im tmpdir = /data6/tmp innodb_data_home_dir = /data6 innodb_log_group_home_dir = /log6 innodb_log_arch_dir = /arch6 innodb_buffer_pool_size = 19200M innodb_log_file_size = 50M [mysqldump] quick max_allowed_packet = 16M [mysql] no-auto-rehash [isamchk] key_buffer = 20M sort_buffer_size = 20M read_buffer = 2M write_buffer = 2M [myisamchk] key_buffer = 20M sort_buffer_size = 20M read_buffer = 2M write_buffer = 2M [mysqlhotcopy] interactive-timeout
[3 Feb 2010 19:59]
Sveta Smirnova
Thank you for the feedback. > --innodb-file-per-table is enabled Have you copied whole datadir or only ib* files? Have you placed all *ibd files into subdirectories named same like in the original locations?
[3 Feb 2010 20:20]
Alec Matusis
>Have you copied whole datadir or only ib* files? >Have you placed all *ibd files into subdirectories named same like in the original locations? Yes, whole data directory including all .ibd, .frm and .opt files were copied and put in the exact same locations with rsync. ib* files were also copied. We cloned MANY slaves in the past using this exact copying process, and we saw the crash for the first time on this specific dataset.
[23 Jul 2010 15:21]
Sveta Smirnova
Thank you for the feedback. > We cloned MANY slaves in the past using this exact copying process, and we saw the crash for the first time on this specific dataset. Have you verified this specific dataset for hardware errors? Specifically hard disk and memory.
[23 Jul 2010 18:24]
Charlton Kao
Alec, what kind of server do you have? Are you on the latest bios revision?
[25 Aug 2010 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".

Description: When setting up a new slave (copied InnoDB .frm .ibd ib_logfile0 ib_logfile1, etc from another slave), and starting new slave, mysql crashes like this: InnoDB: The InnoDB memory heap is disabled InnoDB: use atomic builtins. InnoDB: Log scan progressed past the checkpoint lsn 345 3140792695 100128 21:31:33 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... InnoDB: Doing recovery: scanned up to log sequence number 345 3146035200 InnoDB: Doing recovery: scanned up to log sequence number 345 3151278080 InnoDB: Doing recovery: scanned up to log sequence number 345 3156520960 InnoDB: Doing recovery: scanned up to log sequence number 345 3161763840 InnoDB: Doing recovery: scanned up to log sequence number 345 3167006720 InnoDB: Doing recovery: scanned up to log sequence number 345 3172249600 InnoDB: Doing recovery: scanned up to log sequence number 345 3177492480 InnoDB: Doing recovery: scanned up to log sequence number 345 3182735360 InnoDB: Doing recovery: scanned up to log sequence number 345 3187978240 InnoDB: Doing recovery: scanned up to log sequence number 345 3193221120 InnoDB: Doing recovery: scanned up to log sequence number 345 3198464000 InnoDB: Doing recovery: scanned up to log sequence number 345 3203706880 InnoDB: Doing recovery: scanned up to log sequence number 345 3203925571 100128 21:31:37 InnoDB: Starting an apply batch of log records to the database... InnoDB: Progress in percents: 100128 21:31:38 - mysqld got signal 11 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=0 read_buffer_size=131072 max_used_connections=0 max_connections=1024 threads_connected=0 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 8519680 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. thd=(nil) Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... Cannot determine thread, fp=0x40000, backtrace may not be correct. Bogus stack limit or frame pointer, fp=0x40000, stack_bottom=0x2afd969e0000, thread_stack=262144, aborting backtrace. This slave is identical to 4 other slaves that have been successfully configured this exact way, EXPECT that the other slaves were on Ubuntu 8.04-server (2.6.24), and this one is on 2.6.31 Some additional info: root@db6:/log6# free total used free shared buffers cached Mem: 33006912 12679028 20327884 0 264644 11361056 -/+ buffers/cache: 1053328 31953584 Swap: 4883752 0 4883752 root@db6:/log6# cat /proc/cpuinfo .... processor : 15 vendor_id : GenuineIntel cpu family : 6 model : 26 model name : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz identical my.cnf to other slaves that work strace: ..... 24023 futex(0x3770220, FUTEX_WAKE_PRIVATE, 1) = 0 24023 futex(0x377025c, FUTEX_WAIT_PRIVATE, 1603, NULL <unfinished ...> 24033 <... pread resumed> "\221t\1\30\0\0Gf\0\0\r\200\0\0\22\264\0\0\1T`} `E\277\0\0\0\0\0\0"..., 16384, 299466752) = 16384 24033 futex(0x377025c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3770220, 1604) = 1 24033 sched_yield( <unfinished ...> 24023 <... futex resumed> ) = 0 24023 futex(0x3770220, FUTEX_WAKE_PRIVATE, 1) = 0 24023 futex(0x377025c, FUTEX_WAIT_PRIVATE, 1605, NULL <unfinished ...> 24032 <... pread resumed> "\226\337)u\0\0@\1\0\0\0\0\0\0\0\0\0\0\1Y\272*\244\353\0\0\0\0\0\0\0\0"..., 16384, 268451840) = 16384 24032 pread(20, <unfinished ...> 24033 <... sched_yield resumed> ) = 0 24033 futex(0x1f2bd78c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...> 24034 <... pread resumed> ")\6Ai\0\0Gi\0\0003?\0\0<\362\0\0\1Yg8.\333E\277\0\0\0\0\0\0"..., 16384, 299515904) = 16384 24034 futex(0x377025c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3770220, 1606) = 1 24034 sched_yield( <unfinished ...> 24023 <... futex resumed> ) = 0 24023 futex(0x3770220, FUTEX_WAKE_PRIVATE, 1) = 0 24023 futex(0x377025c, FUTEX_WAIT_PRIVATE, 1607, NULL <unfinished ...> 24034 <... sched_yield resumed> ) = 0 24034 futex(0x1f2bd78c, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> 24029 <... pread resumed> ")-3\r\0\0Gt\0\0\6\306\0\0\31\245\0\0\1Yk\212\246\324E\277\0\0\0\0\0\0"..., 16384, 299696128) = 16384 24029 futex(0x377025c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3770220, 1608) = 1 24029 sched_yield( <unfinished ...> 24023 <... futex resumed> ) = 0 24023 futex(0x3770220, FUTEX_WAKE_PRIVATE, 1) = 0 24023 futex(0x377025c, FUTEX_WAIT_PRIVATE, 1609, NULL <unfinished ...> 24028 <... pread resumed> "\263^\240M\0\0@\244\0\0#\356\0\0\1$\0\0\1Y\2715\354\357E\277\0\0\0\0\0\0"..., 16384, 271122432) = 16384 24028 futex(0x377025c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3770220, 1610) = 1 24028 --- SIGSEGV (Segmentation fault) @ 0 (0) --- 24028 write(2, "InnoDB: Log scan progressed past"..., 2634) = 2634 24028 exit_group(1) = ? another attempt: ..... 24084 futex(0x318225c, FUTEX_WAIT_PRIVATE, 1677, NULL <unfinished ...> 24095 <... pread resumed> "\221t\1\30\0\0Gf\0\0\r\200\0\0\22\264\0\0\1T`} `E\277\0\0\0\0\0\0"..., 16384, 299466752) = 16384 24095 futex(0x318225c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3182220, 1678) = 1 24095 sched_yield( <unfinished ...> 24091 <... pread resumed> "\226\337)u\0\0@\1\0\0\0\0\0\0\0\0\0\0\1Y\272*\244\353\0\0\0\0\0\0\0\0"..., 16384, 268451840) = 16384 24091 pread(20, <unfinished ...> 24084 <... futex resumed> ) = 0 24084 futex(0x3182220, FUTEX_WAKE_PRIVATE, 1) = 0 24084 futex(0x318225c, FUTEX_WAIT_PRIVATE, 1679, NULL <unfinished ...> 24095 <... sched_yield resumed> ) = 0 24095 futex(0x1eccfc0c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...> 24090 <... pread resumed> ")\6Ai\0\0Gi\0\0003?\0\0<\362\0\0\1Yg8.\333E\277\0\0\0\0\0\0"..., 16384, 299515904) = 16384 24090 futex(0x318225c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3182220, 1680) = 1 24090 sched_yield( <unfinished ...> 24084 <... futex resumed> ) = 0 24084 futex(0x3182220, FUTEX_WAKE_PRIVATE, 1) = 0 24084 futex(0x318225c, FUTEX_WAIT_PRIVATE, 1681, NULL <unfinished ...> 24090 <... sched_yield resumed> ) = 0 24090 futex(0x1eccfc0c, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> 24093 <... pread resumed> "\263^\240M\0\0@\244\0\0#\356\0\0\1$\0\0\1Y\2715\354\357E\277\0\0\0\0\0\0"..., 16384, 271122432) = 16384 24093 futex(0x318225c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x3182220, 1682) = 1 24093 --- SIGSEGV (Segmentation fault) @ 0 (0) --- 24093 write(2, "InnoDB: Log scan progressed past"..., 2634) = 2634 24093 exit_group(1) Without copying InnodDB tables from another slave, mysql starts normally. InnoDB tables can then be created from SQL statements. How to repeat: copy InnoDB tables from a working slave, start mysql. Suggested fix: downgrading to ubuntu 8.04 2.6.24 fixes it.