| Bug #37093 | mysqld got signal 11 when slave sql_thread reads certain relay log entries | ||
|---|---|---|---|
| Submitted: | 30 May 2008 8:11 | Modified: | 11 Jun 2008 12:08 |
| Reporter: | Rickard Gunnarsson | Email Updates: | |
| Status: | Duplicate | Impact on me: | |
| Category: | MySQL Server: Replication | Severity: | S1 (Critical) |
| Version: | 5.1.24 | OS: | Linux (Ubuntu Server 6.06) |
| Assigned to: | CPU Architecture: | Any | |
[31 May 2008 17:32]
Sveta Smirnova
Thank you for the report. Please provide us your relay and/or binary log file[s].
[2 Jun 2008 7:33]
Rickard Gunnarsson
I found out that what's causing this behavior is when there's a trigger on the table being updated. Removing the trigger completely will make this problem disappear. Trying drop trigger and re-create the trigger doesn't solve the problem.
[2 Jun 2008 13:02]
Sveta Smirnova
Thank you for the report. Could you please provide definition of trigger?
[2 Jun 2008 14:10]
Rickard Gunnarsson
CREATE TRIGGER timediff AFTER INSERT ON tbl_week_ticks FOR EACH ROW BEGIN IF ABS((unix_timestamp(now())-unix_timestamp(NEW.dt_timestamp))) > 20 THEN INSERT INTO tbl_delayed_ticks SET i_id=NEW.i_id, i_timediff=(unix_timestamp(now())-unix_timestamp(NEW.dt_timestamp)); END IF; END;
[2 Jun 2008 16:48]
Sveta Smirnova
Thank you for the feedback. Please provide output of SHOW CREATE TABLE tbl_delayed_ticks also.
[2 Jun 2008 18:35]
Rickard Gunnarsson
mysql> show create table tbl_delayed_ticks\G
*************************** 1. row ***************************
Table: tbl_delayed_ticks
Create Table: CREATE TABLE `tbl_delayed_ticks` (
`i_id` int(10) unsigned NOT NULL DEFAULT '0',
`i_timediff` int(11) DEFAULT '0',
PRIMARY KEY (`i_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
[3 Jun 2008 19:09]
Sveta Smirnova
Thank you for the feedback. I can not repeat the problem in test environment. Please provide configuration file also and describe your hardware (CPU, RAM).
[5 Jun 2008 12:46]
Rickard Gunnarsson
This is the my.cnf file, and the output of uname -a, cat /proc/cpuinfo and cat /proc/meminfo. Please let me know if there's anything else you need. [client] port = 3306 socket = /var/run/mysqld/mysqld.sock [mysqld_safe] socket = /var/run/mysqld/mysqld.sock nice = 0 user = mysql pid-file = /var/run/mysqld/mysqld.pid socket = /var/run/mysqld/mysqld.sock port = 3306 basedir = /usr/local/mysql datadir = /var/lib/mysql tmpdir = /tmp language = /usr/share/mysql/english skip-external-locking old_passwords = 1 key_buffer = 16M max_allowed_packet = 16M thread_stack = 128K max_connections = 200 query_cache_limit = 1048576 query_cache_size = 16777216 query_cache_type = 1 server-id = 1 log-bin = mysql-bin expire-logs-days = 7 binlog-do-db = db_admin binlog-do-db = db_ticks auto_increment_increment = 2 auto_increment_offset = 1 report-host=x.y.z.w innodb_buffer_pool_size=500M [mysqldump] quick quote-names max_allowed_packet = 16M [mysql] [isamchk] key_buffer = 16M root@xxxxxxxx:~# cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Core(TM)2 CPU 6320 @ 1.86GHz stepping : 6 cpu MHz : 1862.063 cache size : 4096 KB physical id : 0 siblings : 1 core id : 255 cpu cores : 1 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm bogomips : 3727.73 processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Core(TM)2 CPU 6320 @ 1.86GHz stepping : 6 cpu MHz : 1862.063 cache size : 4096 KB physical id : 1 siblings : 1 core id : 255 cpu cores : 1 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm bogomips : 3724.07 root@xxxxxxx:~# uname -a Linux xxxxxx 2.6.15-26-server #1 SMP Fri Sep 8 21:00:37 UTC 2006 i686 GNU/Linux root@xxxxxxxx:~# cat /proc/meminfo MemTotal: 2074944 kB MemFree: 101020 kB Buffers: 74848 kB Cached: 1209832 kB SwapCached: 1200 kB Active: 877900 kB Inactive: 1039192 kB HighTotal: 1178496 kB HighFree: 18952 kB LowTotal: 896448 kB LowFree: 82068 kB SwapTotal: 5960072 kB SwapFree: 5950148 kB Dirty: 1836 kB Writeback: 4 kB Mapped: 641364 kB Slab: 42864 kB CommitLimit: 6997544 kB Committed_AS: 791364 kB PageTables: 2184 kB VmallocTotal: 118776 kB VmallocUsed: 5392 kB VmallocChunk: 112796 kB
[10 Jun 2008 14:34]
MySQL Verification Team
i tried to repeat a crash, and also failed to do so. now, suspect this bug may be caused by the fix for bug #33029 just guess it's this code crashing: function THD::reset_sub_statement_state if (rpl_master_erroneous_autoinc(this)) { backup->auto_inc_intervals_forced= auto_inc_intervals_forced; auto_inc_intervals_forced.empty(); <---- this }
[10 Jun 2008 14:50]
MySQL Verification Team
btw, i just opened bug #37311 while i was trying to reproduce this crash.
[11 Jun 2008 5:18]
MySQL Verification Team
looks like bug #36443
[11 Jun 2008 12:08]
MySQL Verification Team
marking as a duplicate of bug #36443

Description: When upgrading from 5.0.45 to 5.1.24 (also tried a build of 5.1.26 with same result) replication fails when the slave sql_thread is started. When starting the thread with "slave start sql_thread;" the relay log is being processed but then stops when mysqld segfault. This is the output of the error log: 080530 8:59:44 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000725' at position 450859142, relay log './mysqld-relay-bin.000002' position: 152163 /usr/local/mysql/bin/mysqld(print_stacktrace+0x1c)[0x82d0040] /usr/local/mysql/bin/mysqld(handle_segfault+0x1cb)[0x81c84f3] [0xffffe420] /usr/local/mysql/bin/mysqld(_ZN3THD27restore_sub_statement_stateEP19Sub_statement_state+0xb4)[0x81be818] /usr/local/mysql/bin/mysqld(_ZN19Table_triggers_list16process_triggersEP3THD14trg_event_type20trg_action_time_typeb+0xe5)[0x82ee461] /usr/local/mysql/bin/mysqld(_Z12write_recordP3THDP8st_tableP12st_copy_info+0x431)[0x8232c8d] /usr/local/mysql/bin/mysqld(_Z12mysql_insertP3THDP10TABLE_LISTR4ListI4ItemERS3_IS5_ES6_S6_15enum_duplicatesb+0xade)[0x823231a] /usr/local/mysql/bin/mysqld(_Z21mysql_execute_commandP3THD+0x1c36)[0x81d57ce] /usr/local/mysql/bin/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0xed)[0x81d9e89] /usr/local/mysql/bin/mysqld(_ZN15Query_log_event14do_apply_eventEPK14Relay_log_infoPKcj+0x3ac)[0x8260904] /usr/local/mysql/bin/mysqld(_ZN15Query_log_event14do_apply_eventEPK14Relay_log_info+0x18)[0x8260554] /usr/local/mysql/bin/mysqld(_Z26apply_event_and_update_posP9Log_eventP3THDP14Relay_log_infob+0x11e)[0x82c29ea] /usr/local/mysql/bin/mysqld[0x82c4e34] /usr/local/mysql/bin/mysqld(handle_slave_sql+0x47e)[0x82c37fe] /lib/tls/i686/cmov/libpthread.so.0[0xb7f17341] /lib/tls/i686/cmov/libc.so.6(__clone+0x5e)[0xb7e444ee] 080530 8:59:44 - mysqld got signal 11 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=16777216 read_buffer_size=131072 max_used_connections=3 max_threads=200 threads_connected=1 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 452839 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. thd: 0x95ad090 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... Trying to get some variables. Some pointers may be invalid and cause the dump to abort... thd->query at 0x9828154 = INSERT IGNORE INTO tbl_week_ticks SET i_unit_id = 796,i_customer_id = 1074,d_lat = 123,d_lon = 123,d_velocity = 0,d_heading = 0,b_validpos = 0,i_sensor_input = 0,i_socket_id = 11,dt_timestamp = '2008-05-28 22:22:29',i_distance = 717930 thd->thread_id=1842 thd->killed=NOT_KILLED Restarting the slave makes mysqld to die immediately, without continue processing the relay log. I therefore suspected that the statement itself would cause the crash... Looking at the relay log at the position where the crash occurred, I'm able to execute the statements in the log manually. I tried changing the relay_log_pos with "change master to relay_log_pos=xxx;" to set the relay log pos to the position just *after* the failed position. When starting the slave again, the processing of the relay log continues, only to die again a bit further down the log. Again at a similar "insert ignore into..." statement. Important to notice is that between these two positions in the relay log there are several insert ignore into statements that regards other tables. This is the table layout for the affected table: mysql> show create table tbl_week_ticks\G *************************** 1. row *************************** Table: tbl_week_ticks Create Table: CREATE TABLE `tbl_week_ticks` ( `i_id` int(10) unsigned zerofill NOT NULL AUTO_INCREMENT, `i_unit_id` int(11) DEFAULT '0', `i_customer_id` smallint(5) unsigned DEFAULT '0', `d_lat` double DEFAULT '0', `d_lon` double DEFAULT '0', `d_velocity` double DEFAULT '0', `d_heading` double DEFAULT '0', `b_validpos` tinyint(1) DEFAULT '0', `dt_timestamp` datetime DEFAULT '0000-00-00 00:00:00', `i_sensor_input` smallint(6) DEFAULT NULL, `i_socket_id` int(11) DEFAULT '0', `i_distance` int(10) unsigned NOT NULL DEFAULT '0', `i_sats_used` tinyint(1) unsigned DEFAULT '0', `d_altitude` double DEFAULT '0', PRIMARY KEY (`i_id`), KEY `Unit` (`i_unit_id`,`dt_timestamp`), KEY `Customer` (`i_customer_id`,`dt_timestamp`), KEY `Index_timestamp` (`dt_timestamp`) ) ENGINE=InnoDB AUTO_INCREMENT=108851865 DEFAULT CHARSET=latin1 COMMENT='Temporary; InnoDB free: 7102464 kB; InnoDB free: 94208 kB' 1 row in set (0.00 sec) Downgrading again to 5.0.45 and everything works fine. The master server is still 5.0.45. How to repeat: Upgrade 5.0.45 to 5.1.24 or 5.1.26 Start replication from a 5.0.45 server.