Bug #47832 MT DD crash during SR in Dbtup::disk_page_free
Submitted: 5 Oct 2009 10:45
Modified: 2 Nov 2009 21:16
Reporter: Pekka Nousiainen
Status: Closed
Category: MySQL Cluster: Disk Data
Severity: S2 (Serious)
Version: mysql-5.1-telco-7.0
OS: Any
Assigned to: Pekka Nousiainen
CPU Architecture: Any

[5 Oct 2009 10:45] Pekka Nousiainen
Description:
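In MT DD (multi-threaded data nodes with Disk Data) it is still
possible to get a crash during System Restart (SR).  The failing
ndbassert (marked with > below) is currently at line# 1197.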

DbtupDiskAlloc.cpp
disk_page_free
  const Uint32 *src= ((Fix_page*)pagePtr.p)->get_ptr(page_idx, 0);
> ndbassert(* (src + 1) != Tup_fixsize_page::FREE_RECORD);
  lsn= disk_page_undo_free(pagePtr.p, key,

Other defects with similar symptoms were fixed under bug#41915
(which is closed).  This bug# takes over the rest.

How to repeat:
testSystemRestart -v -n SR_DD_2b
[19 Oct 2009 14:06] Pekka Nousiainen
The above crash is caused (at least) by an LCP of a fragment
which has become empty (the scan position was not initialized).
So a better test case is:

testSystemRestart -n SR_DD_2b_LCP -r 1 D1
[20 Oct 2009 15:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/87488

3023 Pekka Nousiainen	2009-10-20
      bug#47832 01_lcp.diff
      Fix un-initialized scan position of LCP scan of empty fragment and
      make the scan state a bit more explicit.
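
The idea behind the fix can be illustrated with a minimal sketch. This is
not the NDB code and not the contents of 01_lcp.diff: the names ScanState,
ScanPos, Fragment, scan_first and scan_all are hypothetical, and the page
layout is reduced to a vector of vectors. It only shows a scan position that
starts in an explicit "undefined" state and is always set to a defined
state, including when the fragment is empty.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model. None of these names come from the NDB source.
enum class ScanState { Undefined, Current, Done };

struct ScanPos {
  ScanState state = ScanState::Undefined;  // explicit "not yet positioned"
  std::uint32_t page_no = 0;
  std::uint32_t page_idx = 0;
};

struct Fragment {
  std::vector<std::vector<int>> pages;  // each page holds zero or more records
};

// Position the scan on the first record. For an empty fragment the state
// becomes Done, so the position is never left uninitialized.
void scan_first(const Fragment& frag, ScanPos& pos) {
  pos.page_no = 0;
  pos.page_idx = 0;
  for (std::uint32_t p = 0; p < frag.pages.size(); ++p) {
    if (!frag.pages[p].empty()) {
      pos.page_no = p;
      pos.state = ScanState::Current;
      return;
    }
  }
  pos.state = ScanState::Done;
}

// Visit every record and return how many were seen.
std::size_t scan_all(const Fragment& frag) {
  ScanPos pos;
  scan_first(frag, pos);
  std::size_t visited = 0;
  while (pos.state == ScanState::Current) {
    assert(pos.page_no < frag.pages.size());
    assert(pos.page_idx < frag.pages[pos.page_no].size());
    ++visited;
    // Advance to the next record, skipping pages with no records.
    ++pos.page_idx;
    while (pos.page_no < frag.pages.size() &&
           pos.page_idx >= frag.pages[pos.page_no].size()) {
      ++pos.page_no;
      pos.page_idx = 0;
    }
    if (pos.page_no >= frag.pages.size()) pos.state = ScanState::Done;
  }
  assert(pos.state == ScanState::Done);  // never Undefined after a scan
  return visited;
}
```

In this sketch a fragment with no pages, or with only empty pages, ends the
scan in the Done state with zero records visited, rather than leaving the
position in whatever state it had before.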
[2 Nov 2009 19:58] Jonas Oreland
pushed to 6.2.19, 6.3.28 and 7.0.9
[2 Nov 2009 21:16] Jon Stephens
Documented bugfix in the NDB-6.2.19, 6.3.28, and 7.0.9 changelogs, as follows:

        A local checkpoint of an empty fragment could cause a data node
        crash.

Closed.
[3 Nov 2009 7:15] Jon Stephens
Per discussion with Pekka, revised the changelog entry to read:

        A local checkpoint of an empty fragment could cause a crash
        during a system restart which was based on that LCP.