Bug #73767 "Unable to purge a record" with ibuf and varchar fields containing spaces
Submitted: 29 Aug 2014 6:18 Modified: 24 Nov 2014 13:35
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:5.6.20 OS:Any
Assigned to:

[29 Aug 2014 6:18] Shane Bester
Description:
Affects 5.6 and 5.7.

Version: '5.7.5-m15'  socket: ''  port: 3306  MySQL Community Server (GPL)
2014-08-29T05:35:08.548198Z 0 [ERROR] InnoDB: Unable to purge a record
InnoDB: tuple DATA TUPLE: 4 fields;
 0: len 9; hex 000000000000000020; asc          ;;
 1: len 9; hex 000020202020202020; asc          ;;
 2: len 9; hex 002020202020202020; asc          ;;
 3: len 9; hex 000000002020202020; asc          ;;

InnoDB: record PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 9; hex 000000000000000020; asc          ;;
 1: len 9; hex 000020202020202020; asc          ;;
 2: len 9; hex 002020202020202020; asc          ;;
 3: len 9; hex 000000002020202020; asc          ;;

space 79 offset 216 (346 records, index id 172)
InnoDB: Submit a detailed bug report to http://bugs.mysql.com

How to repeat:
On a fast machine, start release build of mysqld with options such as this:

--loose-innodb-flush-log-at-trx-commit=0 
--loose-innodb-buffer-pool-size=14M 
--loose-innodb-io-capacity-max=100000 
--loose-innodb-io-capacity=10000 
--loose-innodb-stats-persistent=0
--loose-performance-schema=0

-------------------------
drop table if exists t1;
create table t1(a char(9) not null,b char(9) not null,c char(9) not null,d char(9),
unique key(a,b,c),key(b,a), key(b,c),key(b,d))engine=innodb default charset latin1;
drop procedure if exists p1;
drop function if exists f1;

delimiter $
create procedure p1()
begin
  declare i bigint unsigned default 0;
  declare continue handler for sqlexception begin end;
  repeat
    set @a:=f1(9);set @b:=f1(9);set @c:=f1(9);set @d:=f1(9);
    replace into t1 set a=@a,b=@b,c=@c,d=@d;
    set i:=i+1;
    if i mod 10000   = 0 then select now(),i; end if;
    if i mod 1000000 = 0 then delete from t1; end if;
  until 1=2 end repeat;  
end $

create function f1(len int) returns char(9)
begin
  if rand() < 0.2 then return repeat(floor(rand()*255),floor(rand()*len));
  else return repeat(0x20,floor(rand()*len)); end if;
end ;   $

delimiter ;

call p1();  # call in multiple threads (e.g. >=4).
--------------------------------
[29 Aug 2014 6:32] Shane Bester
I should note that having multiple cores helps a lot to repeat the issue.
[29 Aug 2014 9:22] Shane Bester
Debug build asserts on mysql-trunk:

InnoDB: Assertion failure in thread 5720 in file ibuf0ibuf.cc line 4304
InnoDB: Failing assertion: 0
InnoDB: We intentionally generate a memory trap.

mysqld-debug.exe!my_sigabrt_handler()[my_thr_init.c:392]
mysqld-debug.exe!raise()[winsig.c:594]
mysqld-debug.exe!abort()[abort.c:82]
mysqld-debug.exe!ut_dbg_assertion_failed()[ut0dbg.cc:69]
mysqld-debug.exe!ibuf_delete()[ibuf0ibuf.cc:4304]
mysqld-debug.exe!ibuf_merge_or_delete_for_page()[ibuf0ibuf.cc:4793]
mysqld-debug.exe!buf_page_io_complete()[buf0buf.cc:5553]
mysqld-debug.exe!fil_aio_wait()[fil0fil.cc:5377]
mysqld-debug.exe!io_handler_thread()[srv0start.cc:288]
[8 Sep 2014 16:14] Shane Bester
related: http://bugs.mysql.com/bug.php?id=73839
[9 Sep 2014 13:26] Marko Mäkelä
Posted by developer:
 
If the "Description" corresponds to "How To Repeat", then the problem should occur with key(b,d), whose records contain the field (b,d,a,c). The PRIMARY KEY would be the UNIQUE KEY(a,b,c) because all of a,b,c are NOT NULL.

The interesting thing here is that the fields do fully match between the DATA TUPLE (constructed from the buffered entry) the PHYSICAL RECORD (on the secondary index page).

The problem seems to be the info_bits=0 (it should be info_bits=32, indicating that the record was marked for deletion):

		if (page_get_n_recs(page) <= 1
		    || !(REC_INFO_DELETED_FLAG
			 & rec_get_info_bits(rec, page_is_comp(page)))) {

(Theoretically, this could also be the only record on the user page. We do not allow empty pages, unless the whole table is empty, and the index consists of an empty root page.)
[12 Sep 2014 6:39] zhai weixiang
Hi ,  I encountered this problem on production environment today, with error message bellow in alert.log
2014-09-12 14:00:50 2b0349207700  InnoDB: unable to purge a record
InnoDB: tuple DATA TUPLE: 3 fields;
 0: len 8; hex 8000000926832f37; asc     & /7;;
 1: len 1; hex 81; asc  ;;
 2: len 8; hex 8000001017f75f78; asc       _x;;

InnoDB: record PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 8; hex 8000000926832f37; asc     & /7;;
 1: len 1; hex 81; asc  ;;
 2: len 8; hex 8000001017f75f78; asc       _x;;

space 451 offset 11037 (273 records, index id 951)

I am not sure if this message is safe and our dba can ignore it . Any advice ?
[15 Sep 2014 8:01] zhai weixiang
Besides, is this bug related to this change log entry of mysql 5.7.5 ?

   * InnoDB: In rare cases, the purge process would attempt to delete a secondary index record that was not marked for deletion, resulting in an inconsistent secondary index. (Bug #19138298)
[25 Sep 2014 11:30] Marko Mäkelä
Hi Zhai,

The internal bug that you mention is also know as the public bug
Bug#70214 Record in index was not found on rollback, trying to insert

There is no fix for that yet. If this bug is mentioned in the 5.7.5 changelog, then it should be removed from there.

We do have very hard-to-reproduce bugs that cause an inconsistency between the clustered index and a secondary index of a table. Some of this logical corruption could be caused by change buffering, which is very nondeterministic in nature (the page must be absent from the buffer pool at the right time).

Note that this Bug#73767 is noticing the corruption during the change buffer merge. We do not know what is causing the corruption.

Previously seen signs of corruption include the following:

* CHECK TABLE returning different counts for different indexes
(orphan record in secondary index or in a clustered index)
* Purge is trying to remove a record that is not delete-marked
* Delete-mark mismatch between clustered and secondary index
[25 Sep 2014 14:31] Marko Mäkelä
The 5.7.5 change log contained a wrong bug number. The internal bug that was fixed (and may have contributed to index corruption) is:

Bug#18631496 MULTIPLE PURGE THREADS - CORRUPTION - ERROR IN SEC INDEX ENTRY UPDATE
[26 Sep 2014 18:59] Shane Bester
Just to clarify.  The testcase here still affects 5.6.21:

Version: '5.6.21-enterprise-commercial-advanced'  socket: ''  port: 3306  My
2014-09-26 20:54:52 8736 [Warning] Host name 'i7.home.gateway' could not be
2014-09-26 20:55:34 1068  InnoDB: unable to purge a record
InnoDB: tuple DATA TUPLE: 4 fields;
 0: len 9; hex 202020202020202020; asc          ;;
 1: len 9; hex 202020202020202020; asc          ;;
 2: len 9; hex 202020202020202020; asc          ;;
 3: len 9; hex 202020202020202020; asc          ;;

InnoDB: record PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 9; hex 202020202020202020; asc          ;;
[27 Sep 2014 1:08] zhai weixiang
Bad news.. :) So I really care about this: Do we have to restore the dataset when we got this error message in alert log ?
[14 Nov 2014 6:48] Marko Mäkelä
Posted by developer:
 
This is a possible root cause of
Bug#13413535 61104: InnoDB: Failing assertion: page_get_n_recs(page) > 1
[24 Nov 2014 13:35] Daniel Price
Posted by developer:
 
Fixed as of the upcoming 5.6.23, 5.7.6 releases, and here's the changelog entry:

With change buffering enabled, a buffered sequence of operations that
should not have been buffered resulted in an "Unable to purge a record"
error. 

Thank you for the bug report.
[12 Feb 2015 12:33] Laurynas Biveinis
$ git show -s ec369cb
commit ec369cb4f363161dfbbbd662b20763b54808b7d1
Author: Annamalai Gurusami <annamalai.gurusami@oracle.com>
Date:   Mon Nov 17 13:33:51 2014 +0530

    Bug #19528825 "UNABLE TO PURGE A RECORD" WITH IBUF AND VARCHAR FIELDS
    CONTAINING SPACES
    
    Problem:
    
    This problem happens only if the change buffer is enabled.  It does not
    happen if the change buffer is disabled.  Analysis showed that the cause
    was the buffering of the following sequence of operations for a same
    record - IBUF_OP_DELETE_MARK, IBUF_OP_INSERT, IBUF_OP_DELETE.  Because
    of this sequence, while merging the change buffer entries into the page,
    the delete operation (also called purge) found the "record to be purged"
    without a delete mark.
    
    Solution:
    
    The above sequence of operations must not be bufferred.  This sequence is
    avoided when change buffer is disabled, because of latching of secondary
    index page.  So it must be avoided when change buffer is enabled by using
    the "page watch" properly.
    
    Once the purge thread has kept a watch for a page in the buffer pool,
    a concurrent insert operation on the same page must take this into account,
    and not buffer the operation.
    
    rb#7293 approved by Marko.
[10 Apr 2015 7:31] Umesh Shastry
Bug #75680 marked duplicate of this
[6 May 2015 8:27] Shane Bester
Followup:
Bug 20796566 - INNODB: ERROR: INSERT BUFFER INSERT FAILS CANNOT INSERT INDEX
RECORD