Bug #36294 Assertion in Cache::writePage
Submitted: 23 Apr 2008 15:32 Modified: 5 Oct 2008 15:19
Reporter: Vladislav Vaintroub Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S3 (Non-critical)
Version:mysql-6.0-falcon-team,6.0.5 OS:Any
Assigned to: Vladislav Vaintroub CPU Architecture:Any

[23 Apr 2008 15:32] Vladislav Vaintroub
Description:
while trying to reproduce Bug#29648, hit assertion

Assert happends during insert/select of 2 Mio rows
The assertion:
	ASSERT(!(bdb->flags & BDB_write_pending));

bdb->flags is 17 (==BDB_dirty|BDB_write_pending)

The callstack:
0000000077244EA0    ntdll.dll!DbgBreakPoint()
00000001407D9850    mysqld.exe!Error::debugBreak()[error.cpp:94]
00000001407D972B    mysqld.exe!Error::error()[error.cpp:71]
00000001407D980C    mysqld.exe!Error::assertionFailed()[error.cpp:77]
0000000140874793    mysqld.exe!Cache::writePage()[cache.cpp:514]
0000000140874184    mysqld.exe!Cache::findBuffer()[cache.cpp:416]
000000014087375D    mysqld.exe!Cache::fetchPage()[cache.cpp:249]
0000000140850692    mysqld.exe!Dbb::fetchPage()[dbb.cpp:220]
00000001408508CA    mysqld.exe!Dbb::handoffPage()[dbb.cpp:268]
000000014085B71D    mysqld.exe!Section::fetchRecord()[section.cpp:816]
00000001408511B8    mysqld.exe!Dbb::fetchRecord()[dbb.cpp:394]
00000001407EC147    mysqld.exe!Table::databaseFetch()[table.cpp:611]
00000001407EBAB7    mysqld.exe!Table::fetchNext()[table.cpp:498]
00000001407CC44A    mysqld.exe!StorageDatabase::nextRow()[storagedatabase.cpp:242]
00000001407BB6F6    mysqld.exe!StorageTable::next()[storagetable.cpp:161]
00000001407AA2E6    mysqld.exe!StorageInterface::rnd_next()[ha_falcon.cpp:568]
0000000140226EF7    mysqld.exe!rr_sequential()[records.cc:366]
00000001403A3505    mysqld.exe!sub_select()[sql_select.cc:13346]
00000001403A2F78    mysqld.exe!do_select()[sql_select.cc:13087]
00000001403810CD    mysqld.exe!JOIN::exec()[sql_select.cc:2340]
000000014038382A    mysqld.exe!mysql_select()[sql_select.cc:2929]
0000000140379A9E    mysqld.exe!handle_select()[sql_select.cc:289]
000000014032D198    mysqld.exe!mysql_execute_command()[sql_parse.cc:2982]
000000014033694D    mysqld.exe!mysql_parse()[sql_parse.cc:5756]
0000000140328351    mysqld.exe!dispatch_command()[sql_parse.cc:1047]
00000001403278B7    mysqld.exe!do_command()[sql_parse.cc:722]
0000000140493D86    mysqld.exe!handle_one_connection()[sql_connect.cc:1134]
00000001405A4F15    mysqld.exe!pthread_start()[my_winthread.c:86]
0000000140911165    mysqld.exe!_callthreadstart()[thread.c:295]
0000000140911137    mysqld.exe!_threadstart()[thread.c:277]

How to repeat:
Here are the statements that lead to error 

set falcon_record_memory_max=500*1024*1024;
create tablespace gimdata add datafile 'gimdata.fts' engine=falcon;
create table ftest (id serial, col1 varchar(500)) engine=falcon tablespace gimdata;
insert into ftest values (null,repeat("a",500));
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;
insert into ftest(col1) select col1 from ftest;

create table t2 (c1 int) engine=falcon tablespace gimdata;
insert into t2 values(1);
[24 Apr 2008 10:59] MySQL Verification Team
Thank you for the bug report.
[29 Apr 2008 9:04] Vladislav Vaintroub
A new stacktrace from todays run

0000000077B34EA0    ntdll.dll!DbgBreakPoint()
00000001407DA980    mysqld.exe!Error::debugBreak()[error.cpp:94]
00000001407DA85B    mysqld.exe!Error::error()[error.cpp:71]
00000001407DA93C    mysqld.exe!Error::assertionFailed()[error.cpp:77]
000000014089CFA2    mysqld.exe!SerialLogControl::nextRecord()[seriallogcontrol.cpp:316]
00000001408D894B    mysqld.exe!SerialLogTransaction::commit()[seriallogtransaction.cpp:87]
00000001408D8DCA    mysqld.exe!SerialLogTransaction::doAction()[seriallogtransaction.cpp:158]
00000001408D6793    mysqld.exe!Gopher::gopherThread()[gopher.cpp:71]
00000001408D656D    mysqld.exe!Gopher::gopherThread()[gopher.cpp:38]
00000001408062DB    mysqld.exe!Thread::thread()[thread.cpp:163]
0000000140806227    mysqld.exe!Thread::thread()[thread.cpp:141]
00000000779E495D    kernel32.dll!BaseThreadInitThunk()
0000000077B18791    ntdll.dll!RtlUserThreadStart()

The assertion at seriallogcontrol.cpp:316 is

ASSERT(*input >= (LOW_BYTE_FLAG | srlEnd) && *input < (LOW_BYTE_FLAG | srlMax));

input points to the "aaaaaaaa.." string 

As all string values in the test case are repeat('a',500() that means, 

input is the in middle of the record.
[29 Apr 2008 9:51] Vladislav Vaintroub
Running under debugger showed yet another crash.

now this is in Cache::fetchPage/IO::readPage/IO::longSeek, due to fileId == -1

in the error log, there is a C runtime warning
lseeki64.c(90) : Assertion failed: ("Invalid file descriptor. File possibly closed by a different thread",0)
The file is the tablespace  gimdata.fts.

the debugger output indicates there were C++ exception previously , so the file file probably was closed in one of the catch blocks.

Here the callstack:

 	mysqld.exe!Error::error(const char * string=0x0000000140dd03b0, ...)  Line 71	C++
>	mysqld.exe!IO::longSeek(__int64 offset=5066752)  Line 364	C++
 	mysqld.exe!IO::pread(__int64 offset=5066752, int length=4096, unsigned char * buffer=0x0000000004dff000)  Line 508	C++
 	mysqld.exe!IO::readPage(Bdb * bdb=0x000000000478d338)  Line 238 + 0x24 bytes	C++
 	mysqld.exe!Cache::fetchPage(Dbb * dbb=0x0000000004978aa0, int pageNumber=1237, PageType pageType=PAGE_any, LockType lockType=Exclusive)  Line 261	C++
 	mysqld.exe!Dbb::fetchPage(int pageNumber=1237, PageType pageType=PAGE_any, LockType lockType=Exclusive)  Line 220	C++
 	mysqld.exe!IndexRootPage::deleteIndex(Dbb * dbb=0x0000000004978aa0, int indexId=0, unsigned int transId=98)  Line 700 + 0x1d bytes	C++
 	mysqld.exe!SRLDeleteIndex::commit()  Line 123	C++
 	mysqld.exe!SerialLogTransaction::commit()  Line 93	C++
 	mysqld.exe!SerialLogTransaction::doAction()  Line 158	C++
 	mysqld.exe!Gopher::gopherThread()  Line 71	C++
 	mysqld.exe!Gopher::gopherThread(void * arg=0x0000000004932358)  Line 38	C++
 	mysqld.exe!Thread::thread()  Line 163	C++
 	mysqld.exe!Thread::thread(void * parameter=0x000000000493e180)  Line 141	C++
 	kernel32.dll!BaseThreadInitThunk()  + 0xd bytes	
 	ntdll.dll!RtlUserThreadStart()  + 0x21 bytes
[7 May 2008 15:50] Philip Stoev
The initial assertion from this bug also happens when running the large_tests.alter_table test under Falcon on Windows, which is similar in nature to the test Vlad is using. I used 8GB of falcon_record_memory_max:

$ perl mysql-test-run.pl --big-test --master_port=123456 --suite=large_tests --testcase-timeout=10000 --suite-timeout=10000 --mysqld=--falcon_record_memory_max=8GB

Backtrace is :

>	ntdll.dll!0000000077ef2aa0() 	
 	[Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]	
 	mysqld.exe!Error::error(const char * string=0x0000000000b35cf0, ...)  Line 71	C++
 	mysqld.exe!Error::assertionFailed(const char * fileName=0x0000000000000000, int line=0)  Line 76 + 0xf bytes	C++
 	mysqld.exe!Cache::writePage(Bdb * bdb=0x0000000001aae2a0, int type=1)  Line 514 + 0x19 bytes	C++
 	mysqld.exe!Cache::findBuffer(Dbb * dbb=0x0000000001c61220, int pageNumber=30020776, LockType lockType=5839)  Line 416	C++
 	mysqld.exe!Cache::fetchPage(Dbb * dbb=0x0000000000000000, int pageNumber=0, PageType pageType=PAGE_any, LockType lockType=Exclusive)  Line 249 + 0x14 bytes	C++
 	mysqld.exe!Dbb::fetchPage(int pageNumber=1050767, PageType pageType=64572192, LockType lockType=32587776)  Line 220	C++
 	mysqld.exe!Dbb::handoffPage(Bdb * bdb=0x0000000001ca14a8, int pageNumber=25423, PageType pageType=33873920, LockType lockType=Exclusive)  Line 270	C++
 	mysqld.exe!IndexRootPage::addIndexEntry(Dbb * dbb=0x0000000000000000, int indexId=2, IndexKey * key=0x00000000ffffffff, int recordNumber=0, unsigned int transId=0)  Line 167 + 0x1d bytes	C++
 	mysqld.exe!IndexRootPage::indexMerge(Dbb * dbb=0x0000000000000000, int indexId=9227757, SRLUpdateIndex * logRecord=0x0000000000000000, unsigned int transId=29762528)  Line 1006	C++
 	mysqld.exe!SRLUpdateIndex::commit()  Line 158	C++
 	mysqld.exe!SerialLogTransaction::commit()  Line 87	C++
 	mysqld.exe!Gopher::gopherThread()  Line 71	C++
 	mysqld.exe!Thread::thread()  Line 163	C++
 	mysqld.exe!Thread::thread(void * parameter=0x0000000077d6b6a0)  Line 141	C++
 	kernel32.dll!0000000077d6b6da() 	
 	kernel32.dll!0000000077d6b6a0()
[9 May 2008 5:05] Kevin Lewis
Jim pushed this change today which should make the flags variable thread safe.
The BDB_write_pending flag that was involved in this assertion is no longer used.
It was not really needed anymore.

ChangeSet@1.2667.1.7, 2008-05-08 16:37:28-04:00, jas@rowvwade. +4 -0
  Replaced Bdb::flags with individual booleans for "dirty" and "registered"
  (by PageWriter) to get around non-thread safe usage.
  This is a fix for bugs 36294 and 36367.
[29 Aug 2008 2:04] Kevin Lewis
This fix is in version 6.0.6
[28 Sep 2008 7:56] Hakan Küçükyılmaz
When running falcon_bug_36294 in ramdisk following assertion is hit sometimes:

./mysql-test-run.pl --force --mem --skip-ndb --suite=falcon_team falcon_bug_36294

=======================================================

TEST                           RESULT         TIME (ms)
-------------------------------------------------------

falcon_team.falcon_bug_36294   [ fail ]

mysqltest: At line 65: query 'DROP TABLE ftest' failed: 2006: MySQL server has gone away

The result from queries just before the failure was:
*** Bug #36294 ***
SET @@storage_engine = 'Falcon';
DROP TABLE IF EXISTS ftest;

/home/hakan/work/mysql/mysql-6.0-falcon-team/sql/mysqld(my_print_stacktrace+0x2e)[0xa5e378]
/home/hakan/work/mysql/mysql-6.0-falcon-team/sql/mysqld(handle_segfault+0x25c)[0x66eacd]
/lib/libpthread.so.0[0x7fd79a9eba90]
/lib/libpthread.so.0(raise+0x2b)[0x7fd79a9eb94b]
/home/hakan/work/mysql/mysql-6.0-falcon-team/sql/mysqld(Error::debugBreak()+0xe)[0x8b2e48]
/home/hakan/work/mysql/mysql-6.0-falcon-team/sql/mysqld(Error::error(char const*, ...)+0xea)[0x8b2f38]
/home/hakan/work/mysql/mysql-6.0-falcon-team/sql/mysqld(Error::assertionFailed(char const*, char const*, int)+0x19)[0x8b2f9b]
/home/hakan/work/mysql/mysql-6.0-falcon-team/sql/mysqld(SerialLogControl::nextRecord()+0x38)[0x8eddb4]
/home/hakan/work/mysql/mysql-6.0-falcon-team/sql/mysqld(SerialLogTransaction::commit()+0xc8)[0x8f4a70]
/home/hakan/work/mysql/mysql-6.0-falcon-team/sql/mysqld(SerialLogTransaction::doAction()+0x11)[0x8f4af3]
/home/hakan/work/mysql/mysql-6.0-falcon-team/sql/mysqld(Gopher::gopherThread()+0x11c)[0x91d296]
/home/hakan/work/mysql/mysql-6.0-falcon-team/sql/mysqld(Gopher::gopherThread(void*)+0x9)[0x91d3bb]
/home/hakan/work/mysql/mysql-6.0-falcon-team/sql/mysqld(Thread::thread()+0x3f)[0x884739]
/home/hakan/work/mysql/mysql-6.0-falcon-team/sql/mysqld(Thread::thread(void*)+0x9)[0x8848bb]
/lib/libpthread.so.0[0x7fd79a9e3fc7]
/lib/libc.so.6(clone+0x6d)[0x7fd79953f7cd]
[Falcon] Error: assertion (*input >= (LOW_BYTE_FLAG | srlEnd) && *input < (LOW_BYTE_FLAG | srlMax)) failed at line 316 in file SerialLogControl.cpp
[28 Sep 2008 10:36] Vladislav Vaintroub
Hakan, this problem about assertion in Cache::writePage(), not in SerialLog::nextRecord(). The assertion you're seeeing is the synopsis for Bug#36294
[28 Sep 2008 10:42] Vladislav Vaintroub
small correction to previous entry : the correct bugnr for crash in SerialLogControl::nextRecord() is Bug#36631.
[29 Sep 2008 8:12] Hakan Küçükyılmaz
Uups, sorry Vlad. You are right!
[5 Oct 2008 15:19] Jon Stephens
Documented bugfix in the 6.0.6 changelog as follows:

        Following a number of INSERT ... SELECT statements on a Falcon table,
        creating a second Falcon table using the same tablespace as the table
        into which the inserts were made, then performing a simple INSERT on the
        new table, caused the server to crash.