MySQL Bugs: #47322: handler error HA_ERR_KEY_NOT

Bug #47322	handler error HA_ERR_KEY_NOT_FOUND;
Submitted:	15 Sep 2009 12:20	Modified:	15 Jul 2010 10:42
Reporter:	Yogesh Malik	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server: Row Based Replication ( RBR )	Severity:	S1 (Critical)
Version:	5.1.34	OS:	Linux
Assigned to:		CPU Architecture:	Any
Tags:	handler error HA_ERR_KEY_NOT_FOUND;

Description:
when server load increase, replication stopped with following error. we skip sql slave counter but not resolved

 Last_SQL_Error: Could not execute Update_rows event on table otsdb_2_1.messagesforuser; Can't find record in 'messagesforuser', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000268, end_log_pos 631648349

How to repeat:
its happening on our server

Please, send your my.cnf from slave. Do you use Mixed or Row-based replication?

Please, send the results of:

show create table messagesforuser\G

from both master and slave. 

Had you identified statement that leads to this error?

We have Row-based replication.
We could not see statement because of row format.

Master:

mysql> show create table messagesforuser\G
*************************** 1. row ***************************
       Table: messagesforuser
Create Table: CREATE TABLE `messagesforuser` (
  `MessageId` int(10) unsigned NOT NULL,
  `UserId` int(10) unsigned NOT NULL,
  `HasBeenViewed` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `ViewedTime` datetime DEFAULT '0000-00-00 00:00:00',
  `Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `ClientTransaction` int(10) unsigned NOT NULL DEFAULT '0',
  `MessageAttribute` set('SAVE','DELETE','DELETED','TOBEDELETD') NOT NULL DEFAULT 'DELETE',
  `hasbeenviewedbysip` tinyint(1) unsigned DEFAULT '0',
  `messageType` set('OTHER','IM') NOT NULL DEFAULT 'OTHER',
  PRIMARY KEY (`Id`),
  KEY `FK_messagesforuser_1` (`UserId`),
  KEY `FK_messagesforuser_2` (`MessageId`),
  CONSTRAINT `FK_messagesforuser_1` FOREIGN KEY (`UserId`) REFERENCES `userinfo` (`UserId`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `FK_messagesforuser_2` FOREIGN KEY (`MessageId`) REFERENCES `messages` (`messageid`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=104340990 DEFAULT CHARSET=latin1 COMMENT='InnoDB free: 942080 kB; (`UserId`) REFER `otsdb_2_1/userinfo'

Slave:

mysql> show create table messagesforuser\G
*************************** 1. row ***************************
       Table: messagesforuser
Create Table: CREATE TABLE `messagesforuser` (
  `MessageId` int(10) unsigned NOT NULL,
  `UserId` int(10) unsigned NOT NULL,
  `HasBeenViewed` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `ViewedTime` datetime DEFAULT '0000-00-00 00:00:00',
  `Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `ClientTransaction` int(10) unsigned NOT NULL DEFAULT '0',
  `MessageAttribute` set('SAVE','DELETE','DELETED','TOBEDELETD') NOT NULL DEFAULT 'DELETE',
  `hasbeenviewedbysip` tinyint(1) unsigned DEFAULT '0',
  `messageType` set('OTHER','IM') NOT NULL DEFAULT 'OTHER',
  PRIMARY KEY (`Id`),
  KEY `FK_messagesforuser_1` (`UserId`),
  KEY `FK_messagesforuser_2` (`MessageId`),
  CONSTRAINT `FK_messagesforuser_1` FOREIGN KEY (`UserId`) REFERENCES `userinfo` (`UserId`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `FK_messagesforuser_2` FOREIGN KEY (`MessageId`) REFERENCES `messages` (`messageid`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=104226285 DEFAULT CHARSET=latin1 COMMENT='InnoDB free: 942080 kB; (`UserId`) REFER `otsdb_2_1/userinfo'

The problem starts when mysql crashed 4 times yesterday and we start getting replication failure error.

The crash logs are:

090914  9:03:48 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=33554432
read_buffer_size=4194304
max_used_connections=192
max_threads=400
threads_connected=79
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 8228839 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x2aade60c8bc0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x4189ef20 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x8453de]
/usr/sbin/mysqld(handle_segfault+0x322)[0x5bca02]
/lib64/libpthread.so.0[0x3ffd80e4c0]
/usr/sbin/mysqld(_Z9get_fieldP11st_mem_rootP5FieldP6String+0x53)[0x612c93]
/usr/sbin/mysqld(_Z17store_schema_procP3THDP8st_tableS2_PKcbS4_+0x727)[0x6baa07]
/usr/sbin/mysqld(_Z16fill_schema_procP3THDP10TABLE_LISTP4Item+0x211)[0x6bacc1]
/usr/sbin/mysqld(_Z24get_schema_tables_resultP4JOIN23enum_schema_table_state+0x1e0)[0x6b4830]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x430)[0x632780]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x16e)[0x
63469e]
/usr/sbin/mysqld(_Z13handle_selectP3THDP6st_lexP13select_resultm+0x169)[0x634fb9]
/usr/sbin/mysqld[0x5c6ed4]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x3801)[0x5ccf71]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0x1f1)[0x5cefe1]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xfd0)[0x5cffd0]
/usr/sbin/mysqld(_Z10do_commandP3THD+0xe4)[0x5d0564]
/usr/sbin/mysqld(handle_one_connection+0x6ef)[0x5c38df]
/lib64/libpthread.so.0[0x3ffd806367]
/lib64/libc.so.6(clone+0x6d)[0x3ffd0d2f7d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x2bb9dd90 = SHOW PROCEDURE STATUS
thd->thread_id=81726
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
mysqld: my_new.cc:51: int __cxa_pure_virtual(): Assertion `! "Aborted: pure virtual method called."' failed.
Fatal signal 6 while backtracing
090914 09:03:54 mysqld_safe Number of processes running now: 0
090914 09:03:54 mysqld_safe mysqld restarted
090914  9:03:54 [Warning] The syntax '--log_slow_queries' is deprecated and will be removed in MySQL 7.0. Please use '--slow_query_log'/'--slow_query_log_file
' instead.
InnoDB: Log scan progressed past the checkpoint lsn 43 3336075246
090914  9:03:59  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 43 3341317632
.....
InnoDB: Doing recovery: scanned up to log sequence number 43 3631144518
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 1 row operations to undo
InnoDB: Trx id counter is 0 444721408
090914  9:05:10  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98
 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 1397, file name ./mysql-bin.000005
InnoDB: Starting in background the rollback of uncommitted transactions
090914  9:08:03  InnoDB: Rolling back trx with id 0 444720905, 1 rows to undo
090914  9:08:03  InnoDB: Started; log sequence number 43 3631144518
090914  9:08:03 [Note] Recovering after a crash using mysql-bin
090914  9:08:03 [Note] Starting crash recovery...
090914  9:08:03 [Note] Crash recovery finished.

my.cnf on slave

[client]
port            = 3306
socket          = /var/lib/mysql/mysql.sock
[mysqld]
port            = 3306
socket          = /var/lib/mysql/mysql.sock
lower_case_table_names=1
back_log = 50
max_connections = 400
max_connect_errors = 50
table_cache = 2048
max_allowed_packet = 32M
binlog_cache_size = 8M
max_heap_table_size = 64M
sort_buffer_size = 16M
join_buffer_size = 16M
thread_cache_size = 30
thread_concurrency = 8
query_cache_size = 64M
query_cache_limit = 2M
ft_min_word_len = 4
default_table_type = INNODB
thread_stack = 192K
transaction_isolation = REPEATABLE-READ
tmp_table_size = 128M
log-bin=mysql-bin
log_warnings = 2
binlog_format = ROW
log-error = error.log
log_slow_queries = slowquery.log
long_query_time = 0
log_long_format

#Replication
server-id = 2
replicate-do-db = otsdb_2_1
key_buffer_size = 32M
read_buffer_size = 4M
read_rnd_buffer_size = 16M
bulk_insert_buffer_size = 64M
myisam_sort_buffer_size = 128M
myisam_repair_threads = 1
myisam_recover
skip-federated
skip-name-resolve
#skip-bdb
innodb_additional_mem_pool_size = 16M
innodb_buffer_pool_size = 12G
innodb_data_file_path = ibdata1:10M:autoextend
innodb_data_home_dir = /db
innodb_file_io_threads = 4
innodb_thread_concurrency = 32
innodb_flush_log_at_trx_commit = 0
innodb_log_buffer_size = 16M
innodb_log_file_size = 1256M
innodb_log_files_in_group = 2
innodb_log_group_home_dir = /db/
innodb_max_dirty_pages_pct = 90
innodb_lock_wait_timeout = 120
innodb_file_per_table = 1
#datadir=/db
[mysqldump]
quick
[mysql]
no-auto-rehash
[isamchk]
key_buffer = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M
[myisamchk]
key_buffer = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M
[mysqlhotcopy]
interactive-timeout
[mysqld_safe]
open-files-limit = 8192

Thank you for the feedback.

Crash looks similar to bug #41726. Did you do binary upgrade to 5.1.34? If so please run mysql_upgrade.

Regarding to replication error please run ALTER TABLE messagesforuser ENGINE=INNODB on both master and slave and inform us if problem still exists after this.

Replication bug can be related to Bug #47312

I upgrade my server more than one month back, is there any possibility of crash after so many days of upgrade and when you forget mysql_upgrade while upgrading?

Following are my steps for upgrade i followed:

Old Version: 5.0.67
New Version: 5.1.34

Take a database dump using mysqldump --opt db_name > dbname.sql
Remove all RPM from destination server
installl new mysql RPMs
restart server with old mysql folder
Disable foreign key, unique key checks on destination server
restore db from dump - mysql -uroot -p db_name < dn_name.sql

Thank you for the feedback.

> restart server with old mysql folder

If you didn't run mysql_upgrade you can still have broken system tables. They can run smoothly for a while until you call particular statement which uses code changed from 5.0 to 5.1

So, please, run mysql_upgrade and ALTER TABLE, try and inform us if problem still exists.

Regarding replication:

I didn't change any thing on slave specially related to disabling keys on slave as mentioned in bug http://bugs.mysql.com/bug.php?id=47312.
As mentioned earlier the replication failure happened just after mysql crash.

There must be some other cause of replication failure in our case.

Does changing replication format from ROW to STATEMENT resolve the issue forever?

Thank you for the feedback.

> I didn't change any thing on slave specially related to disabling keys on slave as
> mentioned in bug http://bugs.mysql.com/bug.php?id=47312.
> As mentioned earlier the replication failure happened just after mysql crash.

Correct. But this still can be broken because you have modified keys on slave due to crash. This is why I ask you to rebuild tables. In this case will make sense to wait when bug #47312 is fixed, then examine if failure still occurs. Otherwise this is new bug and we should find a way to repeat it.

i rebuilt all my tables by mysql_upgrade however that doesn't resolve the issue and still same replication error is coming.
Its our production server so i rebuilt it from backup finally.

as asked in my last reply, does changing replication format from row to mixed or statement makes any difference?

Thank you for the feedback.

mysql_upgrade doesn't rebuild InnoDB tables, you need to do ALTER TABLE ... ENGINE=INNODB

Regarding if statement-based replication solves replication problem we can not say it for sure until we know cause of the bug. Probably yes, but be aware of statement-based limitations.

Generic tests does not prove crash described leads to HA_ERR_KEY_NOT_FOUND error, so, please, try ALTER TABLE suggestion and inform us if it solves the problem.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

All reporters: please, take last Sveta's suggestion about ALTER TABLE into account, and check if you'll still have this problem with a newer version, 5.1.41.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

Still feedback is needed.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

We are still getting the same error of HA_ERR_KEY_NOT_FOUND
and it may not be the same error which is reported in this bug but it has the similarity.

100310  9:46:21 [ERROR] Slave SQL: Could not execute Delete_rows event on table forum_se.phpbb_sessions; Can't find record in 'phpbb_sessions', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000005, end_log_pos 97808705, Error_code: 1032
100310  9:46:21 [Warning] Slave: Can't find record in 'phpbb_sessions' Error_code: 1032
100310  9:46:21 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000005' position 97808445

We have a 3 way circular replication and replication breaks frequently with the above errors.

If the seconds behind master is greater than zero in one server , does that means some queries caused deadlock and the query is not replicated to other servers.In the mean time a users deletes the records from the table and the replication query could not find the record ,resulting in above errors.
 

We thought upgrading from 5.1.39 to new version 5.1.44 will solve the issue,but the error repeats in that version also??

Vojtech,

thank you for the feedback. Please also send us trigger definition and slave configuration file.

See bug #44360, bug #39648

Vojtech,

thank you for the feedback. We have verified bug #40278 fixed in 5.5 series in which same problem occurred if user mix InnoDB and MyISAM tables. What is the storage engine of tables you update in trigger?

Sveta, 

I have reviewed the code and I found a stored procedure, which is called in this trigger and uses a temporary table. Could it be the problem, when the transaction is rolled back?

The SP code is attached below.

stored procedure with a temporary table

Attachment: stored_procedure.sql (application/octet-stream, text), 927 bytes.

Thank you for the update, Vojtech.

Yes, this can make things more complicated, especially if you use MyISAM tables either in trigger on in procedure. In the configuration file you have no --default-storage-engine option, so my guess is temporary table is MyISAM. So if SP stops before DROP TABLE call temporary table will still exist and have data which can leads to various error. In this case error would be not symptom of the bug.

Yes, the temporary table is indeed created as MyISAM. Does it make any difference, if I use the MEMORY engine explicitly? I assume mixing any non-transactional tables with innodb tables will cause the same problem, am I right?

Then the workaround should be don't mix non-transactional tables with transactional, even if they are temporary.

Thank you, Sveta

Thank you for the feedback.

Yes, switching to MEMORY does not help, because this is not transactional storage engine. So in your case this is not a bug.

But as report was originally reported by another person I close it as "Can't repeat". If somebody is able to provide repeatable test case feel free to send test case and reopen the report.

Sveta, I disagree. 

If the master silently executes everything without any errors and the binlog is corrupted it this way, this is a bug. RBR should never allow this to happen. 

I know mixing transactional and non-transactional tables is bad, but the master should never execute such transaction, if it corrupts the binlog.

Vojtech,

this case verified and fixed in 5.5 series. See bug #40278 and its duplicates.

OK, thank you.

However, it should be noted in documentation for 5.1, that row-based binary log can't be used for safe point in time recovery. Incremental backups done using RBR are not reliable.