Bug #44744 Recovery stage 3: segfault in RecoveryObjects::findRecoveryObject
Submitted: 8 May 2009 9:52 Modified: 26 May 2010 17:47
Reporter: Olav Sandstå Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:6.0.12-alpha OS:Linux
Assigned to: Olav Sandstå CPU Architecture:Any

[8 May 2009 9:52] Olav Sandstå
Description:
During the recovery after running the RQG test falcon_many_indexes the recovery fails in stage 3 due to a segfault in RecoveryObjects::findRecoveryObject:

The call stack (unfortunately without line number information):

# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(handle_segfault+0x2cd) [0x82c1c55]
# 19:38:14 Recovery has apparently crashed.
# 19:38:14 [0xa53420]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(RecoveryObjects::findRecoveryObject(int, int)+0x6e) [0x86694e0]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(SerialLog::isOverflowPageValid(int, int)+0x25) [0x8627f4b]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(DataPage::deleteOverflowPages(Dbb*, int, unsigned int)+0x44) [0x8659674]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(DataPage::updateRecord(Section*, int, Stream*, unsigned int, bool)+0x85) [0x8659f01]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(Section::updateRecord(int, Stream*, unsigned int, bool)+0x3eb) [0x8623d09]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(Dbb::updateRecord(int, int, Stream*, unsigned int, bool)+0x49) [0x85c171d]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(SRLUpdateBlob::redo()+0x139) [0x8619b1b]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(SerialLog::recover()+0xd79) [0x862c121]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(Database::openDatabase(char const*)+0x341) [0x85b7a9b]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(Connection::getDatabase(char const*, char const*, Threads*)+0x149) [0x85a8719]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(Connection::openDatabase(char const*, char const*, char const*, char const*, char const*, Threads*)+0x198) [0x85a9c36]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(StorageDatabase::getOpenConnection()+0x86) [0x856d98c]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(StorageHandler::initialize()+0x9f) [0x8570075]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(StorageInterface::falcon_init(void*)+0x256) [0x85605ea]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(ha_initialize_handlerton(st_plugin_int*)+0xa3) [0x840bdbb]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld [0x84c3a33]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(plugin_init(int*, char**, int)+0x695) [0x84c4fd9]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld [0x82c48dd]
# 19:38:14 /export/home/pb2/test/sb_1-499003-1241716604.18/mysql-6.0.12-alpha-linux-i686-test/libexec/mysqld(main+0x1de) [0x82c5896]
# 19:38:14 /lib/libc.so.6(__libc_start_main+0xdc) [0x736dec]

How to repeat:
Seen once after running the falcon_many_indexes test.

Suggested fix:
No idea.
[20 May 2009 10:11] Olav Sandstå
By re-running the recovery with a binary built with debug information to get symbol information and line numbers gives the following call stack:

#3  0x08243060 in handle_segfault (sig=142323700) at mysqld.cc:2710
#4  <signal handler called>
#5  0x0847aee3 in Sync::lock (this=0xbfc56590, type=Shared) at Sync.cpp:58
#6  0x0853a58d in RecoveryObjects::findRecoveryObject (this=0xb5379c90, 
    objectNumber=-650182649, tableSpaceId=1) at RecoveryObjects.cpp:115
#7  0x084fdac6 in SerialLog::isOverflowPageValid (this=0xb536e988, 
    pageNumber=-650182649, tableSpaceId=1) at SerialLog.cpp:1184
#8  0x0852b4c0 in DataPage::deleteOverflowPages (this=0xb4ead000, 
    dbb=0xb5412150, overflowPageNumber=-650182649, transId=31774)
    at DataPage.cpp:350
#9  0x0852bcd9 in DataPage::updateRecord (this=0xb4ead000, section=0xb50533d8, 
    lineNumber=51, stream=0xbfc5671c, transId=31774, earlyWrite=false)
    at DataPage.cpp:59
#10 0x084f5dd9 in Section::updateRecord (this=0xb50533d8, recordNumber=3203, 
    stream=0xbfc5671c, transId=31774, earlyWrite=<value optimized out>)
    at Section.cpp:569
#11 0x084b47f5 in Dbb::updateRecord (this=0xb5412150, sectionId=5, 
    recordId=3203, stream=0xbfc5671c, transId=31774, 
    earlyWrite=<value optimized out>) at Dbb.cpp:347
#12 0x0851649e in SRLUpdateBlob::redo (this=0xbfc57130) at SRLUpdateBlob.cpp:74
#13 0x085014e4 in SerialLog::recover (this=0xb536e988) at SerialLog.cpp:471
#14 0x084b09a1 in Database::openDatabase (this=0xb5217648, 
    filename=0xbfc572e8 "/home/olav/bug/bug44744/a1/vardirs/master-data/falcon_master.fts") at Database.cpp:760
#15 0x084a4286 in Connection::getDatabase (this=0xb50165a0, 
    dbName=0xb501626c "FALCON_MASTER", 
    dbFileName=0xbfc572e8 "/home/olav/bug/bug44744/a1/vardirs/master-data/falcon_master.fts", threads=0xb50162c8) at Connection.cpp:1657
#16 0x084a5a24 in Connection::openDatabase (this=0xb50165a0, 
    dbName=0xb501626c "FALCON_MASTER", 
    filename=0xb501629c "falcon_master.fts", account=0x8698fc4 "mysql", 
    password=0x8698fc4 "mysql", address=0x0, parent=0xb50162c8)
    at Connection.cpp:940
#17 0x084720b9 in StorageDatabase::getOpenConnection (this=0xb5016158)
    at StorageDatabase.cpp:137
#18 0x084746dc in StorageHandler::initialize (this=0xb5217028)
    at StorageHandler.cpp:989
#19 0x0846baa2 in StorageInterface::falcon_init (p=0x95653b8)
    at ha_falcon.cpp:263
#20 0x0835542f in ha_initialize_handlerton (plugin=0x9553048) at handler.cc:448
#21 0x083efe24 in plugin_initialize (plugin=0x9553048) at sql_plugin.cc:1011
#22 0x083f2948 in plugin_init (argc=0x880405c, argv=0x95135c8, 
    flags=<value optimized out>) at sql_plugin.cc:1229
#23 0x082446b9 in init_server_components () at mysqld.cc:4101
#24 0x08248283 in main (argc=141840624, argv=0xb5052b3c) at mysqld.cc:4615
[20 May 2009 14:21] Olav Sandstå
Based on looking at the core file this is the findings this far:

1. This crash happens in the following code in RecoveryObjects::findRecoveryObject:

RecoveryPage* RecoveryObjects::findRecoveryObject(int objectNumber, int tableSpaceId)
{
	int slot = objectNumber % RPG_HASH_SIZE;
	Sync sync(&syncArray[slot], "RecoveryObjects::findRecoveryObject");
	sync.lock(Shared);

   This method is called with an objectNumber that is negative: -650182649 which result in access to a negative location in the syncArray.

2. The negative objectNumber value is found as a page number for an "overflow page" in the following code in DataPage::updateRecord():

 
	/* If there is an overflow page (or change), get rid of them now */

	if (index->length < 0)
		{
		char *ptr = (char*) this + index->offset;
		int32 overflowPageNumber;
		memcpy (&overflowPageNumber, ptr, sizeof (int32));

		if (overflowPageNumber)
			deleteOverflowPages (dbb, overflowPageNumber, transId);
		}

    The index entry has the following value:

(gdb) print lineIndex[51]
$61 = {offset = 0, length = -17710}

    when this is "converted" (read) as a page number it becomes negative.

The data page that we are reading has the following content:

(gdb) print *this
$57 = {<Page> = {pageType = 7, checksum = 55615, pageNumber = 3284, 
    pageUnused = 0}, maxLine = 51, lineIndex =     {{offset = 4020, 
      length = 76}}}
(gdb) print lineIndex@60
$56 =   {    {{offset = 4020, length = 76}},
      {{offset = 3944, length = 76}},
      {{offset = 3868, length = 76}},
      {{offset = 3792, length = 76}},
      {{offset = 3716, length = 76}},
      {{offset = 3640, length = 76}},
      {{offset = 3564, length = 76}},
      {{offset = 3488, length = 76}},
      {{offset = 3412, length = 76}},
      {{offset = 3336, length = 76}},
      {{offset = 3260, length = 76}},
      {{offset = 3184, length = 76}},
      {{offset = 3108, length = 76}},
      {{offset = 3032, length = 76}},
      {{offset = 2956, length = 76}},
      {{offset = 2880, length = 76}},
      {{offset = 2804, length = 76}},
      {{offset = 2728, length = 76}},
      {{offset = 2652, length = 76}},
      {{offset = 2576, length = 76}},
      {{offset = 2500, length = 76}},
      {{offset = 2424, length = 76}},
      {{offset = 2348, length = 76}},
      {{offset = 2272, length = 76}},
      {{offset = 2196, length = 76}},
      {{offset = 2120, length = 76}},
      {{offset = 2044, length = 76}},
      {{offset = 1968, length = 76}},
      {{offset = 1892, length = 76}},
      {{offset = 1816, length = 76}},
      {{offset = 1740, length = 76}},
      {{offset = 1664, length = 76}},
      {{offset = 1588, length = 76}},
      {{offset = 1512, length = 76}},
      {{offset = 1436, length = 76}},
      {{offset = 1360, length = 76}},
      {{offset = 1284, length = 76}},
      {{offset = 1208, length = 76}},
      {{offset = 1132, length = 76}},
      {{offset = 1056, length = 76}},
      {{offset = 260, length = 36}},
      {{offset = 980, length = 76}},
      {{offset = 904, length = 76}},
      {{offset = 828, length = 76}},
      {{offset = 752, length = 76}},
      {{offset = 676, length = 76}},
      {{offset = 600, length = 76}},
      {{offset = 524, length = 76}},
      {{offset = 448, length = 76}},
      {{offset = 372, length = 76}},
      {{offset = 296, length = 76}},
      {{offset = 0, length = -17710}},    <<< entry 51 == maxLine
      {{offset = 47826, length = -17710}},
      {{offset = 47826, length = -17710}},
      {{offset = 47826, length = -17710}},
      {{offset = 47826, length = -17710}},
      {{offset = 47826, length = -17710}},
      {{offset = 47826, length = -17710}},
      {{offset = 47826, length = -17710}},
      {{offset = 47826, length = -17710}}}

Some observations:

1. DataPage::updateRecord() is called with a lineNumber = 51 - which is the same as the maxLine for this page. 

2. The content of index 51 on this page when converted to an overflowPageNumber becomes negative.

3. I find it a bit suspisious that all index entries after maxLine has the same length value as the index entry we are using (lenght = 17710). Suspect that index entry 51 is not valid.
[20 May 2009 15:36] Kevin Lewis
Both 47826 and -17710 convert to hex 0xBAD2.  This is intentionally uninitialized.  This seems to say that if the offset is zero, don't use the length.  Or it means that when we set the offset to zero for the last entry, we need to set the length to zero also.
[8 Jun 2009 14:05] Olav Sandstå
A similar crash is reported in Bug#45297.

Common for both of these is that DataPage::updateRecord() is called with a lineNumber that is the *same as* the current maxLine for this page. The lineIndex on this position contains data that causes a the code to believe there is an overflowPage (lineIndex.length is negative) but the found overflowPageNumber is very wrong (negative or very large). My current hypothesis is that it is not correct to call DataPage::updateRecord() with a lineNumber that is equal (or larger) than the current maxLine of the page. Just guessing the reason for this might either be that we somewhere find an illegal value for the lineNumber (in the Record Locator page?) or more likely that we somehow have missed a previous update to the data page?

By running a few of the RQG tests during normal Falcon operation the lineNumber used when calling DataPage::updateRecord() is always smaller than the maxLine but during recovery there are situations where the lineNumber is both equal and larger then the maxLine of the data page.
[12 Jun 2009 13:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76184

2727 Olav Sandstaa	2009-06-12
      In Bug#44744 and Bug#45297 a crash occures during recovery. In both
      these cases the crash occurs when a record in a data page is attempted
      updated and the record is not already present in the page. This patch
      does not solve this problem but will make the recovery fail when the
      invalid update to the data page occurs instead of failing later when
      the invalid data is used.
      
      Common for both recovery crashes is that DataPage::updateRecord() is
      called with a lineNumber that is equal to the page's maxLine. This is
      not a correct use of this function as DataPage::updateRecord should 
      only be called for already existing records (ie. lineNumber should be
      less than maxLine).
      
      Through repeatedly running recovery I see several cases where this
      happens but does not lead to a crash but instead likely leads to an
      inconsistency. To avoid this inconsistency and to make the recovery
      fail immediately instead of when the invalid data is used this patch
      adds an ASSERT that verify that the lineNumber is valid.
[12 Jun 2009 13:50] Kevin Lewis
The assert is approved to help isolate this problem and protect from related problems.
[22 Jun 2009 12:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76825

2739 Olav Sandstaa	2009-06-22
      Fix for Bug#44744 "Recovery stage 3: segfault in RecoveryObjects::findRecoveryObject" and Bug#45297 "Crash
      during recovery due to pread reading beyond EOF"
      
      These crashes was both caused by an inconsistency between the Record Locator Page and the Data page. The record 
      locator page claimed that the record was stored in a data page while the record was not present in the data page.
      This situation can happen during recovery if the Record Locator page has been written to disk while the 
      data page has not been written to disk when a server crash occurs. The recovery crash happened when the code
      had locate the record by a lookup in the Record Locator page and then called DataPage::updateRecord() where the
      record was not present. In the crash situation the record expected to be located on the same line number as
      the maxLine of the data page. This lead to reading "garbage" from the data page and resulted in an invalid 
      overflow page number.
      
      This patch fixes this problem by handling that DataPage::updateRecord and DataPage::deleteLine handles to
      be called also when the record is not present in the data page. In this situation both the functions will
      just ignore the call. DataPage::updateRecord() will return that there was not room for the record. This will
      result in the inconsistency between the data page and the Record Locator page being resolved since 
      the "failing" call to DataPage::updateRecord() will trigger removal of the record's entry in the Record Locator 
      page - and the record will be inserted somewhere else.
     @ storage/falcon/DataPage.cpp
        During recovery: extend DataPage::updateRecord() and DataPage::deleteLine() so that they handle to
        be called with a line number that is equal or larger than the current maxLine. If this is the case
        the call will just be ignored.
[22 Jun 2009 13:35] Kevin Lewis
Ann Harrison wrote;
The design of recovery is that changes made in phase 2 must be on their
original pages because they are referenced elsewhere by page number.
Changes made in phase 3 (including storing records) should indifferent
to the page they go on, as long as the page was free - or allocated
for the purpose for which it is reused.

So, I think 2B is OK, and probably follows the original design.
There's no requirement that record end up on their original page,
just that they still have the same record number