Bug #38628 crash in NdbScanOperation::takeOverScanOp
Submitted: 7 Aug 2008 12:10
Modified: 5 Oct 2008 16:32
Reporter: Gunn Olaussen
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: mysql-5.1-telco-6.3
OS: Solaris (SunOS 5.10 Generic_120011-14 sun4u sparc SUNW,Sun-Fire-V210)
Assigned to: Jonas Oreland
CPU Architecture: Any

[7 Aug 2008 12:10] Gunn Olaussen
Description:
When running daily-basic on the Solaris SPARC rig in Trondheim, there were 130 core files, and all the ones I looked at were the same. Running the same test manually:

[ndbdev@techra35]~/autotest2/run/run-manual/run/ndb_mgmd.1: testBasic -n DeleteRead
random seed: 3582926507
testBasic started [2008-08-07 07:51:17]
| -  T1
- DeleteRead started  [2008-08-07 07:51:18]
- DeleteRead PASSED  [2008-08-07 07:51:18]
| -  T2
- DeleteRead started [2008-08-07 07:51:20]
Bus Error (core dumped)

Core was generated by `testBasic -n DeleteRead'.
Program terminated with signal 10, Bus error.
#0  0xff17e4bc in NdbScanOperation::takeOverScanOp (this=0x6dfa38,
    opType=NdbOperation::DeleteRequest, pTrans=0x5f4750) at NdbScanOperation.cpp:2167
2167    NdbScanOperation.cpp: No such file or directory.
        in NdbScanOperation.cpp
(gdb) where
#0  0xff17e4bc in NdbScanOperation::takeOverScanOp (this=0x6dfa38,
 opType=NdbOperation::DeleteRequest, pTrans=0x5f4750)
     at NdbScanOperation.cpp:2167
#1  0x0003be54 in UtilTransactions::clearTable (this=0xffbff5d0, pNdb=0x5f4228, flags=0, records=327680, parallelism=240) 
  at ../../../../storage/ndb/include/ndbapi/NdbScanOperation.hpp:596
#2  0x0001fb3c in runClearTable2 (ctx=0xe82d0, step=0xffbff5d0) at testBasic.cpp:362
#3  0x0002ab58 in NDBT_Step::execute (this=0xaecc0, ctx=0xe82d0) at NDBT_Test.cpp:291
#4  0x0002acd0 in NDBT_TestCaseImpl1::runFinal (this=0xc1fec,
 ctx=0xe82d0) at NDBT_Test.cpp:726
#5  0x0002a924 in NDBT_TestCase::execute (this=0xc1f50, ctx=0xe82d0)
 at NDBT_Test.cpp:654
#6  0x00028a78 in NDBT_TestSuite::execute (this=0x62fb0,
 con=@0xffbffa98, ndb=0xffbff888, pTab=0x63ad0,
     _testname=0xffbffba5 "DeleteRead") at NDBT_Test.cpp:1097
#7  0x00029d64 in NDBT_TestSuite::executeAll (this=0x62fb0,
 con=@0xffbffa28, _testname=0xffbffba5 "DeleteRead",
     _testname=0xffbffc40 "Fill") at NDBT_Test.cpp:849
#8  0x0002a61c in NDBT_TestSuite::execute (this=0x62fb0, argc=0,
 argv=0xe) at NDBT_Test.cpp:1376
#9  0x0001e764 in _start ()

How to repeat:
The problem can be reproduced by starting the cluster manually and running any of the failing test cases (one example is shown under Description). my.cnf contained:

[atrt]
basedir = /home/ndbdev/autotest2/run/run-manual/run
baseport = 14000
clusters = .2node

[ndb_mgmd]

[mysqld]
skip-innodb
skip-bdb

[cluster_config.2node]
ndb_mgmd = techra35
ndbd = techra36,techra37
ndbapi= techra35,techra35,techra35

NoOfReplicas = 2
IndexMemory = 100M 
DataMemory = 300M
BackupMemory = 64M
MaxNoOfConcurrentScans = 100
MaxNoOfSavedMessages= 1000
SendBufferMemory = 2M
NoOfFragmentLogFiles = 4
FragmentLogFileSize = 64M
CompressedLCP=1
CompressedBackup=1
ODirect=1

#
# Generated by atrt
# Thu Aug  7 09:26:13 2008

[mysql_cluster.2node]
ndb-connectstring= techra35:14000

[cluster_config.ndb_mgmd.1.2node]
PortNumber= 14000

[cluster_config.ndbd.1.2node]
FileSystemPath= /home/ndbdev/autotest2/run/run-manual/run/

[cluster_config.ndbd.2.2node]
FileSystemPath= /home/ndbdev/autotest2/run/run-manual/run/
[8 Aug 2008 12:28] Jonas Oreland
proposed fix

Attachment: bug38628.patch (text/x-patch), 1.87 KiB.

[13 Aug 2008 20:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/51566

2649 Jonas Oreland	2008-08-13
      ndb - bug#38628 - Fix invalid memory access in takeOverScanOp
        (causes bus-error on i.e sparc)
[13 Aug 2008 20:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/51574

2649 Jonas Oreland	2008-08-13
      ndb - bug#38628 - Fix invalid memory access in takeOverScanOp
        (causes bus-error on i.e sparc)
[13 Aug 2008 20:29] Jonas Oreland
Pushed to 6.2, 6.3, and 6.4.
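
For readers unfamiliar with why an invalid memory access shows up as a bus error on SPARC but may go unnoticed elsewhere: the patch itself is not shown in this report, so the following is only a sketch of one common cause, a misaligned word access, and is an assumption about the class of bug rather than a description of the actual fix. The helper names load_u32, store_u32 and self_check are hypothetical and do not come from the NDB sources.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the NDB sources): read a 32-bit word from an
// address that may not be 4-byte aligned.
inline std::uint32_t load_u32(const unsigned char* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);  // byte-wise copy, valid at any alignment
    return v;
}

// Hypothetical counterpart for writes.
inline void store_u32(unsigned char* p, std::uint32_t v) {
    std::memcpy(p, &v, sizeof v);
}

// The risky pattern, shown only as a comment:
//   std::uint32_t v = *reinterpret_cast<const std::uint32_t*>(p);
// This is undefined behaviour when p is misaligned. x86 usually tolerates it,
// while SPARC typically raises SIGBUS ("Bus Error (core dumped)").

// Round-trip a value through a deliberately odd (misaligned) offset.
inline void self_check() {
    unsigned char buf[8] = {0};
    store_u32(buf + 1, 0xDEADBEEFu);
    assert(load_u32(buf + 1) == 0xDEADBEEFu);
    assert(buf[0] == 0 && buf[5] == 0);  // neighbouring bytes untouched
}
```

Copying through memcpy lets the compiler emit a plain load where the platform allows it and a byte-wise sequence where it does not, so the same code behaves the same on x86 and SPARC.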
[19 Aug 2008 6:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/51912

2650 Tomas Ulin	2008-08-19
      continued fix for reset master
[12 Sep 2008 7:37] Jon Stephens
Documented in the NDB 6.2.16 and 6.3.17 changelogs as follows:

        An invalid memory access caused the management server to crash on
        Solaris Sparc platforms.
[5 Oct 2008 16:32] Jon Stephens
Already documented; closed.
[12 Dec 2008 23:29] Bugs System
Pushed into 6.0.7-alpha  (revid:jonas@mysql.com-20080813200401-ly735jw1t0j4eawe) (version source revid:tomas.ulin@sun.com-20080902154454-pvi3xa61d2wtxtbg) (pib:5)