Bug #61133 | Master -> Slave : duplicate key error after running fine for days | ||
---|---|---|---|
Submitted: | 11 May 2011 13:29 | Modified: | 20 Dec 2012 20:55 |
Reporter: | Thomas Förster | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 5.5.10 | OS: | Linux (SuSE Linux Enterprise 11.1) |
Assigned to: | CPU Architecture: | Any | |
Tags: | 1062, break, duplicate key, mass insert, replication |
[11 May 2011 13:29]
Thomas Förster
[19 May 2011 17:00]
Stephane Bakhos
We've been having a similar issue and found this way of it happening. We use statement based replication only. This is with mysql 5.1.49, I checked the changelog til 5.1.58 and didn't see anything that tells me it is fixed. According to the binlog it goes like this: #110514 7:54:49 server id 42 end_log_pos 6683606 Intvar SET INSERT_ID=993022101/*!*/; # at 6683606 #110514 7:54:49 server id 42 end_log_pos 6684073 Query thread_id=130928719 exec_time=0 error_code=0 SET TIMESTAMP=1305352489/*!*/; INSERT INTO table () VALUES () ... skip unrelated stuff ... #110514 7:54:48 server id 42 end_log_pos 6771858 Intvar SET INSERT_ID=993021441/*!*/; # at 6771858 SET TIMESTAMP=1305352488/*!*/; INSERT INTO table() VALUES (), (), () (about 950 rows added) ... skip unrelated stuff ... #110514 7:54:49 server id 42 end_log_pos 6914506 Query thread_id=130928719 exec_time=0 error_code=0 SET INSERT_ID=993031861/*!*/; SET TIMESTAMP=1305352489/*!*/; INSERT INTO table () VALUES () Autoincrement increases by 10 each time. The first query seems to use an autoincrement value that would have been allocated by the mass insert query. Our slaves also report the duplicate id.
[14 Jul 2011 19:24]
Sveta Smirnova
Thank you for the report. Please try with current version 5.5.14 and if problem still exists send us output of SHOW CREATE TABLE ABC and XYZ
[19 Jul 2011 10:03]
Thomas Förster
I've upgrade my master and slave to 5.5.14 and copied over all data from my master to the slave (via lvm snapshot). The mysql server on the slave ran fine after starting it until i configured it to act as a slave. After 5 minutes of doing what a slave needs to do mysql crashed. It keeps crashing immediately every time i restart mysql while recovering: 110719 11:56:07 InnoDB: The InnoDB memory heap is disabled 110719 11:56:07 InnoDB: Mutexes and rw_locks use GCC atomic builtins 110719 11:56:07 InnoDB: Compressed tables use zlib 1.2.3 110719 11:56:07 InnoDB: Using Linux native AIO 110719 11:56:07 InnoDB: Initializing buffer pool, size = 256.0M 110719 11:56:07 InnoDB: Completed initialization of buffer pool 110719 11:56:07 InnoDB: highest supported file format is Barracuda. InnoDB: Log scan progressed past the checkpoint lsn 720247826921 110719 11:56:07 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... InnoDB: Doing recovery: scanned up to log sequence number 720249568654 110719 11:56:19 InnoDB: Starting an apply batch of log records to the database... InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 110719 11:56:19 - mysqld got signal 11 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=67108864 read_buffer_size=8388608 max_used_connections=0 max_threads=20 thread_count=0 connection_count=0 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 557283 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = (nil) thread_stack 0x30000 /usr/local/bin/mysqld(my_print_stacktrace+0x33)[0x787b03] /usr/local/bin/mysqld(handle_segfault+0x37c)[0x510bac] /lib64/libpthread.so.0(+0xf5d0)[0x7f97105175d0] /usr/local/bin/mysqld[0x8c32f8] /usr/local/bin/mysqld[0x8c4b32] /usr/local/bin/mysqld[0x8b0ef1] 37 /usr/local/bin/mysqld[0x8b279d] /usr/local/bin/mysqld[0x857c89] /usr/local/bin/mysqld[0x882f59] /usr/local/bin/mysqld[0x819c88] /lib64/libpthread.so.0(+0x75f0)[0x7f971050f5f0] /lib64/libc.so.6(clone+0x6d)[0x7f970f30f87d] The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash. 110719 11:56:19 mysqld_safe mysqld from pid file /srv/mysqldata/server.pid ended So something is terribly wrong here... maybe the above can help to track down these strange problems.
[25 Aug 2011 0:51]
MySQL Verification Team
Are you tried with our binaries instead of self compiled?. Thanks.
[25 Aug 2011 7:16]
Thomas Förster
No I did not and I'm sorry but I'm currently unable to do so because my slave server was removed.
[17 Jan 2012 19:13]
Sveta Smirnova
Thank you for the feedback. But we still need more information to repeat it on our side: at lease output of SHOW CREATE TABLE ABC and XYZ and compile options you used to build MySQL server. Also if you can test our binaries in your environment this would be better. Please update the report when you are able to provide this information.
[18 Jan 2012 8:34]
Thomas Förster
Compile options: CFLAGS="-march=native -O2 -fomit-frame-pointer -pipe" CXXFLAGS="-march=native -O2 -fomit-frame-pointer -pipe" LDFLAGS="-Wl,--as-needed -Wl,-O1" \ cmake \ -DCMAKE_INSTALL_PREFIX=/usr \ -DDEFAULT_CHARSET=utf8 \ -DDEFAULT_COLLATION=utf8_general_ci \ -DMYSQL_DATADIR=/srv/mysqldata \ -DWITH_COMMENT='MySQL 5.5.10' \ -DWITH_SSL=yes \ -DWITH_PIC=yes \ -DMYSQL_UNIX_ADDR=/tmp/mysql.sock \ -DHAVE_AIO_H=/usr \ -DWITH_ARCHIVE_STORAGE_ENGINE=1 \ -DWITH_BLACKHOLE_STORAGE_ENGINE=1 \ -DWITH_DEBUG=OFF \ -DSYSCONFDIR=/etc/mysql \ -DENABLE_DOWNLOADS=1 \ . Tables: CREATE TABLE "ABC" ( "id" int(10) unsigned NOT NULL AUTO_INCREMENT, "iid" int(10) unsigned NOT NULL DEFAULT '0', "uid" int(10) unsigned NOT NULL DEFAULT '0', "t_last" timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, "t_insert" timestamp NOT NULL DEFAULT '0000-00-00 00:00:00', "ppty" smallint(6) DEFAULT NULL , "rv" datetime DEFAULT NULL , "av" mediumint(9) DEFAULT NULL , "acv" float(11,4) DEFAULT NULL , PRIMARY KEY ("id"), KEY "iid" ("iid"), KEY "uid" ("uid"), KEY "idx_incentive_account_t_insert" ("t_insert"), KEY "idx_sdate" ("ppty"), KEY "idx_iid_uid" ("iid","uid"), KEY "idx_reverse" ("rv"), KEY "idx_alternative_value" ("av"), CONSTRAINT "fk_iid" FOREIGN KEY ("iid") REFERENCES "ii" ("iid") ON UPDATE CASCADE, CONSTRAINT "fk_inc_account_uid" FOREIGN KEY ("uid") REFERENCES "pw" ("uid") ON DELETE CASCADE ON UPDATE CASCADE ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC; CREATE TABLE "ii" ( "iid" int(10) unsigned NOT NULL AUTO_INCREMENT, "pid" int(10) unsigned DEFAULT NULL, "value" mediumint(9) NOT NULL DEFAULT '0', "label_intern" varchar(128) NOT NULL, "label_extern" varchar(128) NOT NULL, "field_code" tinyint(3) unsigned NOT NULL DEFAULT '0', "coid" int(10) unsigned NOT NULL DEFAULT '0', "sid" int(10) unsigned NOT NULL DEFAULT '0', "incentive_type" tinyint(3) unsigned NOT NULL DEFAULT '1', "t_last" timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, "t_insert" timestamp NOT NULL DEFAULT '0000-00-00 00:00:00', "pcid" int(10) unsigned NOT NULL DEFAULT '1', "usage" tinyint(3) unsigned NOT NULL DEFAULT '0' , "cashval" float(11,4) NOT NULL DEFAULT '0.0000', "md5_iid" varchar(16) DEFAULT NULL, PRIMARY KEY ("iid"), UNIQUE KEY "uk_coid_md5" ("coid","md5_iid"), KEY "pid" ("pid"), KEY "coid" ("coid"), KEY "sid" ("sid"), KEY "idx_pid_coid" ("pid","coid"), KEY "idx_pid_sid_coid" ("pid","sid","coid"), KEY "idx_field_code" ("field_code"), KEY "idx_pcid" ("pcid"), KEY "cashval" ("cashval") ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC CREATE TABLE "XYZ" ( "nr" int(10) unsigned NOT NULL AUTO_INCREMENT, "uid" int(10) unsigned NOT NULL, "email" varchar(255) NOT NULL, "lid" varchar(32) NOT NULL, "complete" tinyint(4) NOT NULL DEFAULT '0', "mailstati" tinyint(4) NOT NULL DEFAULT '0', "lsid" tinyint(4) NOT NULL DEFAULT '0', "sid" int(11) NOT NULL DEFAULT '0', "mid" varchar(255) DEFAULT NULL, "extid" int(11) NOT NULL DEFAULT '0', "naccess" int(11) NOT NULL DEFAULT '0', "feedback_box" varchar(255) DEFAULT NULL, "pro_box" varchar(255) DEFAULT NULL, "contra_box" varchar(255) DEFAULT NULL, "valid" tinyint(4) DEFAULT '1', "schedule" tinyint(4) NOT NULL DEFAULT '0', "ip_addr" varchar(32) DEFAULT NULL, "pimp" varchar(32) DEFAULT NULL, "aps_01" tinyint(4) DEFAULT NULL, "aps_02" tinyint(4) DEFAULT NULL, "aps_03" tinyint(4) DEFAULT NULL, "aps_04" tinyint(4) DEFAULT NULL, "aps_05" tinyint(4) DEFAULT NULL, "t_last" timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, "t_insert" timestamp NOT NULL DEFAULT '0000-00-00 00:00:00', "t_start" datetime DEFAULT NULL, "t_stop" datetime DEFAULT NULL, PRIMARY KEY ("nr"), UNIQUE KEY "lid" ("lid"), KEY "uid" ("uid"), KEY "email" ("email"), KEY "complete" ("complete"), KEY "mailstati" ("mailstati"), KEY "sid" ("sid"), KEY "lsid" ("lsid"), KEY "extid" ("extid") ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Unfortunately I can't test the replication again because my company decided to remove the slave server so I'm just working with the master which is running fine since months. Hope the above information helps a bit.
[20 Dec 2012 20:55]
Sveta Smirnova
Thank you for the feedback. In version 5.5.10 this failing query was written to binary log in mixed format as a statement. This could lead to collisions. Now (I tested with current version 5.5.28) such query is written as a row. This makes chances for collision smaller. Actually I can not say if this query was reason for the collision having only information you provided in this report. Issue could cause queries which you performed earlier. But since you removed replication setup and I see improvement of handling such queries in newer versions I close this as "Can't repeat"