Bug #47322 | handler error HA_ERR_KEY_NOT_FOUND; | ||
---|---|---|---|
Submitted: | 15 Sep 2009 12:20 | Modified: | 15 Jul 2010 10:42 |
Reporter: | Yogesh Malik | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: Row Based Replication ( RBR ) | Severity: | S1 (Critical) |
Version: | 5.1.34 | OS: | Linux |
Assigned to: | CPU Architecture: | Any | |
Tags: | handler error HA_ERR_KEY_NOT_FOUND; |
[15 Sep 2009 12:20]
Yogesh Malik
[15 Sep 2009 13:27]
Valeriy Kravchuk
Please, send your my.cnf from slave. Do you use Mixed or Row-based replication? Please, send the results of: show create table messagesforuser\G from both master and slave. Had you identified statement that leads to this error?
[15 Sep 2009 13:37]
Yogesh Malik
We have Row-based replication. We could not see statement because of row format. Master: mysql> show create table messagesforuser\G *************************** 1. row *************************** Table: messagesforuser Create Table: CREATE TABLE `messagesforuser` ( `MessageId` int(10) unsigned NOT NULL, `UserId` int(10) unsigned NOT NULL, `HasBeenViewed` tinyint(1) unsigned NOT NULL DEFAULT '0', `ViewedTime` datetime DEFAULT '0000-00-00 00:00:00', `Id` int(10) unsigned NOT NULL AUTO_INCREMENT, `ClientTransaction` int(10) unsigned NOT NULL DEFAULT '0', `MessageAttribute` set('SAVE','DELETE','DELETED','TOBEDELETD') NOT NULL DEFAULT 'DELETE', `hasbeenviewedbysip` tinyint(1) unsigned DEFAULT '0', `messageType` set('OTHER','IM') NOT NULL DEFAULT 'OTHER', PRIMARY KEY (`Id`), KEY `FK_messagesforuser_1` (`UserId`), KEY `FK_messagesforuser_2` (`MessageId`), CONSTRAINT `FK_messagesforuser_1` FOREIGN KEY (`UserId`) REFERENCES `userinfo` (`UserId`) ON DELETE CASCADE ON UPDATE CASCADE, CONSTRAINT `FK_messagesforuser_2` FOREIGN KEY (`MessageId`) REFERENCES `messages` (`messageid`) ON DELETE CASCADE ON UPDATE CASCADE ) ENGINE=InnoDB AUTO_INCREMENT=104340990 DEFAULT CHARSET=latin1 COMMENT='InnoDB free: 942080 kB; (`UserId`) REFER `otsdb_2_1/userinfo' Slave: mysql> show create table messagesforuser\G *************************** 1. row *************************** Table: messagesforuser Create Table: CREATE TABLE `messagesforuser` ( `MessageId` int(10) unsigned NOT NULL, `UserId` int(10) unsigned NOT NULL, `HasBeenViewed` tinyint(1) unsigned NOT NULL DEFAULT '0', `ViewedTime` datetime DEFAULT '0000-00-00 00:00:00', `Id` int(10) unsigned NOT NULL AUTO_INCREMENT, `ClientTransaction` int(10) unsigned NOT NULL DEFAULT '0', `MessageAttribute` set('SAVE','DELETE','DELETED','TOBEDELETD') NOT NULL DEFAULT 'DELETE', `hasbeenviewedbysip` tinyint(1) unsigned DEFAULT '0', `messageType` set('OTHER','IM') NOT NULL DEFAULT 'OTHER', PRIMARY KEY (`Id`), KEY `FK_messagesforuser_1` (`UserId`), KEY `FK_messagesforuser_2` (`MessageId`), CONSTRAINT `FK_messagesforuser_1` FOREIGN KEY (`UserId`) REFERENCES `userinfo` (`UserId`) ON DELETE CASCADE ON UPDATE CASCADE, CONSTRAINT `FK_messagesforuser_2` FOREIGN KEY (`MessageId`) REFERENCES `messages` (`messageid`) ON DELETE CASCADE ON UPDATE CASCADE ) ENGINE=InnoDB AUTO_INCREMENT=104226285 DEFAULT CHARSET=latin1 COMMENT='InnoDB free: 942080 kB; (`UserId`) REFER `otsdb_2_1/userinfo' The problem starts when mysql crashed 4 times yesterday and we start getting replication failure error. The crash logs are: 090914 9:03:48 - mysqld got signal 11 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=33554432 read_buffer_size=4194304 max_used_connections=192 max_threads=400 threads_connected=79 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 8228839 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. thd: 0x2aade60c8bc0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x4189ef20 thread_stack 0x30000 /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x8453de] /usr/sbin/mysqld(handle_segfault+0x322)[0x5bca02] /lib64/libpthread.so.0[0x3ffd80e4c0] /usr/sbin/mysqld(_Z9get_fieldP11st_mem_rootP5FieldP6String+0x53)[0x612c93] /usr/sbin/mysqld(_Z17store_schema_procP3THDP8st_tableS2_PKcbS4_+0x727)[0x6baa07] /usr/sbin/mysqld(_Z16fill_schema_procP3THDP10TABLE_LISTP4Item+0x211)[0x6bacc1] /usr/sbin/mysqld(_Z24get_schema_tables_resultP4JOIN23enum_schema_table_state+0x1e0)[0x6b4830] /usr/sbin/mysqld(_ZN4JOIN4execEv+0x430)[0x632780] /usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x16e)[0x 63469e] /usr/sbin/mysqld(_Z13handle_selectP3THDP6st_lexP13select_resultm+0x169)[0x634fb9] /usr/sbin/mysqld[0x5c6ed4] /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x3801)[0x5ccf71] /usr/sbin/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0x1f1)[0x5cefe1] /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xfd0)[0x5cffd0] /usr/sbin/mysqld(_Z10do_commandP3THD+0xe4)[0x5d0564] /usr/sbin/mysqld(handle_one_connection+0x6ef)[0x5c38df] /lib64/libpthread.so.0[0x3ffd806367] /lib64/libc.so.6(clone+0x6d)[0x3ffd0d2f7d] Trying to get some variables. Some pointers may be invalid and cause the dump to abort... thd->query at 0x2bb9dd90 = SHOW PROCEDURE STATUS thd->thread_id=81726 thd->killed=NOT_KILLED The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash. mysqld: my_new.cc:51: int __cxa_pure_virtual(): Assertion `! "Aborted: pure virtual method called."' failed. Fatal signal 6 while backtracing 090914 09:03:54 mysqld_safe Number of processes running now: 0 090914 09:03:54 mysqld_safe mysqld restarted 090914 9:03:54 [Warning] The syntax '--log_slow_queries' is deprecated and will be removed in MySQL 7.0. Please use '--slow_query_log'/'--slow_query_log_file ' instead. InnoDB: Log scan progressed past the checkpoint lsn 43 3336075246 090914 9:03:59 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... InnoDB: Doing recovery: scanned up to log sequence number 43 3341317632 ..... InnoDB: Doing recovery: scanned up to log sequence number 43 3631144518 InnoDB: 1 transaction(s) which must be rolled back or cleaned up InnoDB: in total 1 row operations to undo InnoDB: Trx id counter is 0 444721408 090914 9:05:10 InnoDB: Starting an apply batch of log records to the database... InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 InnoDB: Apply batch completed InnoDB: Last MySQL binlog file position 0 1397, file name ./mysql-bin.000005 InnoDB: Starting in background the rollback of uncommitted transactions 090914 9:08:03 InnoDB: Rolling back trx with id 0 444720905, 1 rows to undo 090914 9:08:03 InnoDB: Started; log sequence number 43 3631144518 090914 9:08:03 [Note] Recovering after a crash using mysql-bin 090914 9:08:03 [Note] Starting crash recovery... 090914 9:08:03 [Note] Crash recovery finished.
[15 Sep 2009 13:43]
Yogesh Malik
my.cnf on slave [client] port = 3306 socket = /var/lib/mysql/mysql.sock [mysqld] port = 3306 socket = /var/lib/mysql/mysql.sock lower_case_table_names=1 back_log = 50 max_connections = 400 max_connect_errors = 50 table_cache = 2048 max_allowed_packet = 32M binlog_cache_size = 8M max_heap_table_size = 64M sort_buffer_size = 16M join_buffer_size = 16M thread_cache_size = 30 thread_concurrency = 8 query_cache_size = 64M query_cache_limit = 2M ft_min_word_len = 4 default_table_type = INNODB thread_stack = 192K transaction_isolation = REPEATABLE-READ tmp_table_size = 128M log-bin=mysql-bin log_warnings = 2 binlog_format = ROW log-error = error.log log_slow_queries = slowquery.log long_query_time = 0 log_long_format #Replication server-id = 2 replicate-do-db = otsdb_2_1 key_buffer_size = 32M read_buffer_size = 4M read_rnd_buffer_size = 16M bulk_insert_buffer_size = 64M myisam_sort_buffer_size = 128M myisam_repair_threads = 1 myisam_recover skip-federated skip-name-resolve #skip-bdb innodb_additional_mem_pool_size = 16M innodb_buffer_pool_size = 12G innodb_data_file_path = ibdata1:10M:autoextend innodb_data_home_dir = /db innodb_file_io_threads = 4 innodb_thread_concurrency = 32 innodb_flush_log_at_trx_commit = 0 innodb_log_buffer_size = 16M innodb_log_file_size = 1256M innodb_log_files_in_group = 2 innodb_log_group_home_dir = /db/ innodb_max_dirty_pages_pct = 90 innodb_lock_wait_timeout = 120 innodb_file_per_table = 1 #datadir=/db [mysqldump] quick [mysql] no-auto-rehash [isamchk] key_buffer = 512M sort_buffer_size = 512M read_buffer = 8M write_buffer = 8M [myisamchk] key_buffer = 512M sort_buffer_size = 512M read_buffer = 8M write_buffer = 8M [mysqlhotcopy] interactive-timeout [mysqld_safe] open-files-limit = 8192
[15 Sep 2009 14:46]
Sveta Smirnova
Thank you for the feedback. Crash looks similar to bug #41726. Did you do binary upgrade to 5.1.34? If so please run mysql_upgrade. Regarding to replication error please run ALTER TABLE messagesforuser ENGINE=INNODB on both master and slave and inform us if problem still exists after this.
[15 Sep 2009 14:46]
Sveta Smirnova
Replication bug can be related to Bug #47312
[16 Sep 2009 6:34]
Yogesh Malik
I upgrade my server more than one month back, is there any possibility of crash after so many days of upgrade and when you forget mysql_upgrade while upgrading? Following are my steps for upgrade i followed: Old Version: 5.0.67 New Version: 5.1.34 Take a database dump using mysqldump --opt db_name > dbname.sql Remove all RPM from destination server installl new mysql RPMs restart server with old mysql folder Disable foreign key, unique key checks on destination server restore db from dump - mysql -uroot -p db_name < dn_name.sql
[16 Sep 2009 7:52]
Sveta Smirnova
Thank you for the feedback. > restart server with old mysql folder If you didn't run mysql_upgrade you can still have broken system tables. They can run smoothly for a while until you call particular statement which uses code changed from 5.0 to 5.1 So, please, run mysql_upgrade and ALTER TABLE, try and inform us if problem still exists.
[16 Sep 2009 9:03]
Yogesh Malik
Regarding replication: I didn't change any thing on slave specially related to disabling keys on slave as mentioned in bug http://bugs.mysql.com/bug.php?id=47312. As mentioned earlier the replication failure happened just after mysql crash. There must be some other cause of replication failure in our case. Does changing replication format from ROW to STATEMENT resolve the issue forever?
[17 Sep 2009 6:48]
Sveta Smirnova
Thank you for the feedback. > I didn't change any thing on slave specially related to disabling keys on slave as > mentioned in bug http://bugs.mysql.com/bug.php?id=47312. > As mentioned earlier the replication failure happened just after mysql crash. Correct. But this still can be broken because you have modified keys on slave due to crash. This is why I ask you to rebuild tables. In this case will make sense to wait when bug #47312 is fixed, then examine if failure still occurs. Otherwise this is new bug and we should find a way to repeat it.
[17 Sep 2009 8:08]
Yogesh Malik
i rebuilt all my tables by mysql_upgrade however that doesn't resolve the issue and still same replication error is coming. Its our production server so i rebuilt it from backup finally. as asked in my last reply, does changing replication format from row to mixed or statement makes any difference?
[17 Sep 2009 16:59]
Sveta Smirnova
Thank you for the feedback. mysql_upgrade doesn't rebuild InnoDB tables, you need to do ALTER TABLE ... ENGINE=INNODB Regarding if statement-based replication solves replication problem we can not say it for sure until we know cause of the bug. Probably yes, but be aware of statement-based limitations.
[17 Sep 2009 17:36]
Sveta Smirnova
Generic tests does not prove crash described leads to HA_ERR_KEY_NOT_FOUND error, so, please, try ALTER TABLE suggestion and inform us if it solves the problem.
[17 Oct 2009 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[21 Nov 2009 16:52]
Valeriy Kravchuk
All reporters: please, take last Sveta's suggestion about ALTER TABLE into account, and check if you'll still have this problem with a newer version, 5.1.41.
[22 Dec 2009 0:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[22 Dec 2009 10:13]
MySQL Verification Team
Still feedback is needed.
[23 Jan 2010 0:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[24 Mar 2010 10:36]
shishu naik
We are still getting the same error of HA_ERR_KEY_NOT_FOUND and it may not be the same error which is reported in this bug but it has the similarity. 100310 9:46:21 [ERROR] Slave SQL: Could not execute Delete_rows event on table forum_se.phpbb_sessions; Can't find record in 'phpbb_sessions', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000005, end_log_pos 97808705, Error_code: 1032 100310 9:46:21 [Warning] Slave: Can't find record in 'phpbb_sessions' Error_code: 1032 100310 9:46:21 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000005' position 97808445 We have a 3 way circular replication and replication breaks frequently with the above errors. If the seconds behind master is greater than zero in one server , does that means some queries caused deadlock and the query is not replicated to other servers.In the mean time a users deletes the records from the table and the replication query could not find the record ,resulting in above errors. We thought upgrading from 5.1.39 to new version 5.1.44 will solve the issue,but the error repeats in that version also??
[17 Jun 2010 5:25]
Sveta Smirnova
Vojtech, thank you for the feedback. Please also send us trigger definition and slave configuration file.
[18 Jun 2010 22:33]
Roel Van de Paar
See bug #44360, bug #39648
[15 Jul 2010 6:56]
Sveta Smirnova
Vojtech, thank you for the feedback. We have verified bug #40278 fixed in 5.5 series in which same problem occurred if user mix InnoDB and MyISAM tables. What is the storage engine of tables you update in trigger?
[15 Jul 2010 9:25]
Vojtech Kurka
Sveta, I have reviewed the code and I found a stored procedure, which is called in this trigger and uses a temporary table. Could it be the problem, when the transaction is rolled back? The SP code is attached below.
[15 Jul 2010 9:25]
Vojtech Kurka
stored procedure with a temporary table
Attachment: stored_procedure.sql (application/octet-stream, text), 927 bytes.
[15 Jul 2010 9:52]
Sveta Smirnova
Thank you for the update, Vojtech. Yes, this can make things more complicated, especially if you use MyISAM tables either in trigger on in procedure. In the configuration file you have no --default-storage-engine option, so my guess is temporary table is MyISAM. So if SP stops before DROP TABLE call temporary table will still exist and have data which can leads to various error. In this case error would be not symptom of the bug.
[15 Jul 2010 10:22]
Vojtech Kurka
Yes, the temporary table is indeed created as MyISAM. Does it make any difference, if I use the MEMORY engine explicitly? I assume mixing any non-transactional tables with innodb tables will cause the same problem, am I right? Then the workaround should be don't mix non-transactional tables with transactional, even if they are temporary. Thank you, Sveta
[15 Jul 2010 10:42]
Sveta Smirnova
Thank you for the feedback. Yes, switching to MEMORY does not help, because this is not transactional storage engine. So in your case this is not a bug. But as report was originally reported by another person I close it as "Can't repeat". If somebody is able to provide repeatable test case feel free to send test case and reopen the report.
[15 Jul 2010 10:50]
Vojtech Kurka
Sveta, I disagree. If the master silently executes everything without any errors and the binlog is corrupted it this way, this is a bug. RBR should never allow this to happen. I know mixing transactional and non-transactional tables is bad, but the master should never execute such transaction, if it corrupts the binlog.
[15 Jul 2010 11:02]
Sveta Smirnova
Vojtech, this case verified and fixed in 5.5 series. See bug #40278 and its duplicates.
[15 Jul 2010 11:20]
Vojtech Kurka
OK, thank you. However, it should be noted in documentation for 5.1, that row-based binary log can't be used for safe point in time recovery. Incremental backups done using RBR are not reliable.