Bug #68116 InnoDB monitor may hit an assertion error in buf_page_get_gen in debug builds
Submitted: 18 Jan 2013 9:32 Modified: 10 Dec 2014 13:11
Reporter: Laurynas Biveinis (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.1, 5.5, 5.6 OS:Any
Assigned to: CPU Architecture:Any

[18 Jan 2013 9:32] Laurynas Biveinis
Description:
lock_print_info_all_transactions has

			lock_mutex_exit_kernel();

			mtr_start(&mtr);

			buf_page_get_with_no_latch(space, zip_size,
						   page_no, &mtr);

			mtr_commit(&mtr);

			load_page_first = FALSE;

			lock_mutex_enter_kernel();

A page read happens with nothing locked, thus it might get freed meanwhile, hitting an assert in buf_page_get_gen

#if defined UNIV_DEBUG_FILE_ACCESSES || defined UNIV_DEBUG
	ut_a(mode == BUF_GET_POSSIBLY_FREED
	     || !block->page.file_page_was_freed);
#endif

It seems that similar issues have been fixed elsewhere by the commit below.

revno: 3400
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: mysql-5.5
timestamp: Tue 2011-05-24 23:08:22 +0300
message:
  Bug#12584374 LOCK_VALIDATE TRIPS ASSERTION !BLOCK->PAGE.FILE_PAGE_WAS_FREE

  Fix a deadlock in the initial patch. lock_validate() must not hold the
  lock system mutex while s-latching a block, because some functions,
  such as lock_rec_convert_impl_to_expl(), may be already holding an x-latch
  on the block that lock_validate() is interested in while attempting to
  acquire the lock system mutex.

  This deadlock was not caught by UNIV_SYNC_DEBUG because of
  buf_block_dbg_add_level(block, SYNC_NO_ORDER_CHECK).

More information at https://bugs.launchpad.net/percona-server/+bug/1100178

How to repeat:
RQG (on XtraDB), please let us know if you need more details.

Suggested fix:
Replace buf_page_get_with_no_latch call with buf_page_get_gen(RW_NO_LATCH, BUF_GET_POSSIBLY_FREED) ?
[18 Jan 2013 14:05] Laurynas Biveinis
A stacktrace from the linked bug. It's from Percona Server, but I don't think PS and MySQL differ here:

InnoDB: Failing assertion: mode == BUF_GET_POSSIBLY_FREED || !block->page.file_page_was_freed

Thread 24 (LWP 10333):
+bt
#0 0x0000003da180c60c in pthread_kill () from /lib64/libpthread.so.0
#1 0x00000000007dc4a8 in my_write_core (sig=6) at /ssd/bug1086680/Percona-Server-5.5.28-rel29.2/mysys/stacktrace.c:433
#2 0x00000000006b1b68 in handle_fatal_signal (sig=6) at /ssd/bug1086680/Percona-Server-5.5.28-rel29.2/sql/signal_handler.cc:249
#3 <signal handler called>
#4 0x0000003da1435935 in raise () from /lib64/libc.so.6
#5 0x0000003da14370e8 in abort () from /lib64/libc.so.6
#6 0x00000000008e3014 in buf_page_get_gen (space=space@entry=621, zip_size=zip_size@entry=8192, offset=offset@entry=146, rw_latch=rw_latch@entry=3, guess=guess@entry=0x0, mode=mode@entry=14, file=file@entry=0xb8e6d0 "/ssd/bug1086680/Percona-Server-5.5.28-rel29.2/storage/innobase/lock/lock0lock.c", line=line@entry=4664, mtr=mtr@entry=0x170d2800) at /ssd/bug1086680/Percona-Server-5.5.28-rel29.2/storage/innobase/buf/buf0buf.c:2909
#7 0x0000000000973b8a in lock_print_info_all_transactions (file=file@entry=0x5d0e6f0) at /ssd/bug1086680/Percona-Server-5.5.28-rel29.2/storage/innobase/lock/lock0lock.c:4663
#8 0x00000000008708c2 in srv_printf_innodb_monitor (file=0x5d0e6f0, nowait=nowait@entry=1, trx_start=trx_start@entry=0x0, trx_end=trx_end@entry=0x0) at /ssd/bug1086680/Percona-Server-5.5.28-rel29.2/storage/innobase/srv/srv0srv.c:2263
#9 0x0000000000871592 in srv_monitor_thread (arg=<optimized out>) at /ssd/bug1086680/Percona-Server-5.5.28-rel29.2/storage/innobase/srv/srv0srv.c:2580
#10 0x0000003da1807d14 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003da14f168d in clone () from /lib64/libc.so.6
[22 Jan 2013 16:32] Sveta Smirnova
Thank you for the report.

> RQG (on XtraDB), please let us know if you need more details.

Yes, please, send us RQG test.
[28 Jan 2013 6:37] Roel Van de Paar
Sveta, it's actually not an RQG test anymore. 

1. Build server as debug
2. bzr branch lp:percona-qa
3. use this script to start mysqld (Percona example, feel free to change):

cd /ssd/Percona-Server-5.5.28-rel29.2-394-debug.Linux.x86_64/mysql-test/
perl lib/v1/mysql-test-run.pl --start-and-exit --valgrind --skip-ndb --mysqld=--core-file --mysqld=--loose-new --mysqld=--sql-mode=no_engine_substitution --mysq
ld=--relay-log=slave-relay-bin --mysqld=--loose-innodb --mysqld=--loose-falcon-debug-mask=2 --mysqld=--secure-file-priv= --mysqld=--max-allowed-packet=16Mb --mysqld=--loose-innodb-status-file=1 --mysqld=--master-retry-count=65535 --mysqld=--loose-debug-assert-if-crashed-table --mysqld=--loose-debug-assert-on-error --mysqld=--skip-name-resolve --vardir=/ssd/Percona-Server-5.5.28-rel29.2-394-debug.Linux.x86_64/data --master_port=12361 --mysqld="--log-output=none" --mysqld="--sql_mode=ONLY_FULL_GROUP_BY" --mysqld="--innodb_file_per_table=1" --mysqld="--innodb_file_format=barracuda" 1st 2>&1

4. Execute the following (using the percona-qa branch pulled from Launchpad):

/percona-qa/multirun_cli.sh 20 10000000 97.a1 /ssd/Percona-Server-5.5.28-rel29.2-393-debug.Linux.x86_64/bin/mysql /ssd/Percona-Server-5.5.28-rel29.2-393-debug.Linux.x86_64/data/tmp/master.sock

Bit crude script, but it does what it is supposed todo. Just have to point it to mysql binary and socket file as shown above. 97.a1 is a sql file, will be attached here.
[28 Jan 2013 6:37] Roel Van de Paar
sql file

Attachment: 97.a1 (application/octet-stream, text), 1002.70 KiB.

[28 Jan 2013 6:48] Roel Van de Paar
Note: may have to restart it a few times or leave it running for quite some time (read: upto one day) on a fast machine (I use i7+SSD) to run into it. Even then, you may run into other bugs.
[28 Jan 2013 6:49] Laurynas Biveinis
Roel, does it hit sooner under Valgrind perhaps? (it needs long lock wait times to hit...)
[28 Jan 2013 6:54] MySQL Verification Team
this is like:
Bug 13249921 - ASSERT !BPAGE->FILE_PAGE_WAS_FREED, USUALLY IN TRANSACTION ROLLBACK
which was fixed in 5.1.67, 5.5.29, 5.6.8, 5.7.0
[28 Jan 2013 6:55] MySQL Verification Team
I haven't seen this assert in my own tests in a long time.  I will test again, today...
[28 Jan 2013 7:02] Laurynas Biveinis
Shane -

I have looked at rev 2661.810.103 in 5.5, which claims to fix Bug #13249921 ASSERT !BPAGE->FILE_PAGE_WAS_FREED, USUALLY IN TRANSACTION ROLLBACK, and it seems to be completely unrelated, except that the assert is the same.
[1 Feb 2013 7:21] MySQL Verification Team
Hrm, 5.5 still suffers from another known bug "InnoDB: Failing assertion: (ulint) index == offsets[3]" which is not yet fixed.  I keep hitting this over and over.
[27 Dec 2013 10:25] MySQL Verification Team
Still seeing this on blobtester:

Version: '5.5.35-debug'  MySQL Community Server - Debug (GPL)
InnoDB: Assertion failure in thread 9480 in file buf0buf.c line 2578
InnoDB: Failing assertion: mode == BUF_GET_POSSIBLY_FREED || !block->page.file_page_was_freed
[27 Dec 2013 10:26] MySQL Verification Team
When "Bug 14668683 - ASSERT REC_GET_DELETED_FLAG(REC, PAGE_IS_COMP(PAGE))" is fixed I'll retest here.
[10 Jan 2014 15:51] Laurynas Biveinis
Trying to come up with a testcase.  So far managed to discover another crash, see bug 71343.
[10 Jan 2014 16:09] Laurynas Biveinis
Here you go. This reproduces both bug 71343 and the current bug: 71343 by default, and this one if comment out the 71343 crashing ut_ad().

It needs the server patch at 71343.

--source include/have_debug.inc
--source include/have_innodb.inc
--source include/count_sessions.inc

CREATE TABLE innodb_lock_monitor (a INT) ENGINE=INNODB;

CREATE TABLE t1 (a INT PRIMARY KEY, b TEXT) ENGINE=InnoDB;

--disable_query_log
let $i=100;
while ($i) {
      eval INSERT INTO t1 VALUES ($i, REPEAT("a", 4000));
      dec $i;
}
--enable_query_log

BEGIN;
--disable_result_log
SELECT * FROM t1 LOCK IN SHARE MODE;
--enable_result_log

--connect(con1,localhost,root,,)

--connection con1

SET DEBUG_SYNC='innodb_monitor_before_lock_page_read SIGNAL monitor_ready WAIT_FOR finish';

send SHOW ENGINE INNODB STATUS;

--connection default

SET DEBUG_SYNC='now WAIT_FOR monitor_ready';

COMMIT;
DROP TABLE t1;

SET DEBUG_SYNC='now SIGNAL finish';

--connection con1

REAP;

--disconnect con1

--connection default

SET DEBUG_SYNC='RESET';
DROP TABLE innodb_lock_monitor;

--source include/wait_until_count_sessions.inc
[3 Feb 2014 8:38] Laurynas Biveinis
Shane -

Any news on trying out the testcase and setting this to Verified?

Thanks
[3 Feb 2014 8:58] MySQL Verification Team
Sveta repeated apparently it.  So I'll set it verified.
[22 Sep 2014 18:35] Daniel Price
Fixed as of the upcoming 5.6.22, 5.7.6 releases, and here's the changelog entry:

In debug builds, the "InnoDB" Lock Monitor asserted after a "DROP TABLE"
operation, and the "InnoDB" Monitor encountered an assertion in
"buf_page_get_gen". 

Thank you for the bug report.

Note: This bug is fixed by the patch for Bug #18062698 / MySQL Bug #71343.
[10 Dec 2014 13:11] Laurynas Biveinis
The commit does not contain the testcase :(

$ bzr log -r 6171                         
------------------------------------------------------------
revno: 6171
committer: Allen lai <zheng.lai@oracle.com>
branch nick: mysql-5.6
timestamp: Mon 2014-09-22 18:43:17 +0800
message:
  Bug#18062698 - INNODB LOCK MONITOR CRASHES IF TABLE DROPPED AT THE RIGHT
  TIME ON DEBUG BUILD
  
  When buf_page_get_with_no_latch is called in lock_print_info_all_transactions,
  the space maybe deleted by drop table.
  So, we need to check if the space is deleted or not before getting a
  page, and prevent the space be deleted as well.
  
  This patch also fixed bug18173184.
  
  rb#6509 Approved by Jimmy.