Bug #44607 Ndb : Fragmented long signals need node failure handling code
Submitted: 1 May 2009 14:58
Modified: 8 Oct 2009 16:49
Reporter: Frazer Clement
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: mysql-5.1-telco-6.2
OS: Any
Assigned to: Frazer Clement
CPU Architecture: Any

[1 May 2009 14:58] Frazer Clement
Description:
Kernel nodes use generic VM facilities to assemble fragmented long signals.

The assembly mechanism allocates FragmentInfo structures, which hold the linked segments while a signal is being assembled.

If a node fails while sending a fragmented long signal, the receiving node will have assembly resources allocated.

Currently there is no code which frees long signal assembly resources when the sending node fails.

In this scenario, assembly resources (FragInfo structures and Segmented Sections) would be leaked. Eventually this could impair system function and require a node restart to alleviate.
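
The leak described above can be illustrated with a small model. This is only a sketch, not the NDB kernel code: `FragAssembly`, `AssemblyTable`, and `receive` are hypothetical names, and the real FragmentInfo bookkeeping lives in the kernel VM layer. It assumes each partial signal is tracked by sender node and a sender-chosen fragment id.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one in-progress reassembly (not the real FragmentInfo).
struct FragAssembly {
  std::vector<std::vector<uint32_t>> segments;  // section data received so far
};

// Assumed key: (sending node id, fragment id chosen by the sender).
using FragKey = std::pair<uint32_t, uint32_t>;

class AssemblyTable {
public:
  // Buffer one fragment. Returns true when the last fragment arrives,
  // at which point the assembly entry is released.
  bool receive(uint32_t senderNode, uint32_t fragId,
               std::vector<uint32_t> words, bool last) {
    assert(!words.empty());  // in this model every fragment carries data
    const FragKey key(senderNode, fragId);
    m_inProgress[key].segments.push_back(std::move(words));
    if (!last) {
      return false;  // entry stays allocated until the final fragment
    }
    m_inProgress.erase(key);
    return true;
  }

  // Number of partially assembled signals still holding resources.
  std::size_t inProgress() const { return m_inProgress.size(); }

private:
  std::map<FragKey, FragAssembly> m_inProgress;
};
```

In this model, if node 5 sends a first fragment and then fails, `inProgress()` stays at 1 indefinitely, because only the arrival of the last fragment ever releases the entry.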

How to repeat:
Modify API code to generate artificial failure during send of fragmented long signal.

Repeatedly send a ScanTabReq with a filter large enough to require a fragmented long signal, hitting the artificial failure during each send.

Observe that long signal reassembly resources are depleted.

Suggested fix:
Add node failure handling (for failed API and Data nodes) which frees FragInfo structures and assembled section segments in case of node failure.
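
A minimal sketch of the cleanup idea, under stated assumptions: partial assemblies are kept in an ordered map keyed by (sender node id, fragment id), and `releaseAssembliesForNode` is a hypothetical helper invoked when a node failure is reported. None of these names come from the NDB sources, and this is not the committed patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical types for illustration only.
using FragKey = std::pair<uint32_t, uint32_t>;        // (sender node id, fragment id)
using Segments = std::vector<std::vector<uint32_t>>;  // buffered section data
using AssemblyMap = std::map<FragKey, Segments>;

// Free every partial assembly that was being received from failedNode.
// Returns the number of assembly entries released.
std::size_t releaseAssembliesForNode(AssemblyMap& inProgress,
                                     uint32_t failedNode) {
  std::size_t freed = 0;
  // Keys sort by node id first, so all entries for one node are contiguous.
  auto it = inProgress.lower_bound(FragKey(failedNode, 0));
  while (it != inProgress.end() && it->first.first == failedNode) {
    it = inProgress.erase(it);  // also destroys the buffered segments
    ++freed;
  }
  // Postcondition: nothing keyed by failedNode remains.
  assert(inProgress.lower_bound(FragKey(failedNode, 0)) == it);
  return freed;
}
```

Keying by node id first keeps the cleanup proportional to the number of assemblies owned by the failed node rather than to the size of the whole table. Assemblies from surviving nodes are left untouched.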
[6 May 2009 12:14] Jonas Oreland
comment: in 7.0 since almost all traffic now uses long signals,
priority of this bug increases...
[12 Aug 2009 13:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80674

2962 Frazer Clement	2009-08-12
      Bug#44607 Ndb : Fragmented long signals need node failure handling code
      modified:
        storage/ndb/include/kernel/signaldata/ContinueFragmented.hpp
        storage/ndb/include/kernel/signaldata/NodeFailRep.hpp
        storage/ndb/src/kernel/blocks/backup/Backup.cpp
        storage/ndb/src/kernel/blocks/dbdict/Dbdict.cpp
        storage/ndb/src/kernel/blocks/dbdict/Dbdict.hpp
        storage/ndb/src/kernel/blocks/dbtc/Dbtc.hpp
        storage/ndb/src/kernel/blocks/dbtc/DbtcMain.cpp
        storage/ndb/src/kernel/blocks/dbtup/Dbtup.hpp
        storage/ndb/src/kernel/blocks/dbtup/DbtupGen.cpp
        storage/ndb/src/kernel/blocks/dbutil/DbUtil.cpp
        storage/ndb/src/kernel/blocks/dbutil/DbUtil.hpp
        storage/ndb/src/kernel/blocks/lgman.cpp
        storage/ndb/src/kernel/blocks/lgman.hpp
        storage/ndb/src/kernel/blocks/ndbcntr/NdbcntrMain.cpp
        storage/ndb/src/kernel/blocks/suma/Suma.cpp
        storage/ndb/src/kernel/blocks/suma/Suma.hpp
        storage/ndb/src/kernel/blocks/tsman.cpp
        storage/ndb/src/kernel/blocks/tsman.hpp
        storage/ndb/src/kernel/vm/SimulatedBlock.cpp
        storage/ndb/src/kernel/vm/SimulatedBlock.hpp
[3 Sep 2009 16:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/82349

2976 Frazer Clement	2009-09-03
      Bug#44607 : Fragmented long signals need node failure handling
      modified:
        storage/ndb/include/kernel/signaldata/ContinueFragmented.hpp
        storage/ndb/include/kernel/signaldata/DumpStateOrd.hpp
        storage/ndb/include/kernel/signaldata/NodeFailRep.hpp
        storage/ndb/src/kernel/blocks/backup/Backup.cpp
        storage/ndb/src/kernel/blocks/cmvmi/Cmvmi.cpp
        storage/ndb/src/kernel/blocks/cmvmi/Cmvmi.hpp
        storage/ndb/src/kernel/blocks/dbdict/Dbdict.cpp
        storage/ndb/src/kernel/blocks/dbdict/Dbdict.hpp
        storage/ndb/src/kernel/blocks/dbtc/Dbtc.hpp
        storage/ndb/src/kernel/blocks/dbtc/DbtcMain.cpp
        storage/ndb/src/kernel/blocks/dbtup/Dbtup.hpp
        storage/ndb/src/kernel/blocks/dbtup/DbtupGen.cpp
        storage/ndb/src/kernel/blocks/dbutil/DbUtil.cpp
        storage/ndb/src/kernel/blocks/dbutil/DbUtil.hpp
        storage/ndb/src/kernel/blocks/lgman.cpp
        storage/ndb/src/kernel/blocks/lgman.hpp
        storage/ndb/src/kernel/blocks/ndbcntr/NdbcntrMain.cpp
        storage/ndb/src/kernel/blocks/suma/Suma.cpp
        storage/ndb/src/kernel/blocks/suma/Suma.hpp
        storage/ndb/src/kernel/blocks/tsman.cpp
        storage/ndb/src/kernel/blocks/tsman.hpp
        storage/ndb/src/kernel/vm/SimulatedBlock.cpp
        storage/ndb/src/kernel/vm/SimulatedBlock.hpp
[10 Sep 2009 16:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/82972

2993 Frazer Clement	2009-09-10
      Bug#44607 : Fragmented signal node failure handling
      modified:
        storage/ndb/include/kernel/signaldata/ContinueFragmented.hpp
        storage/ndb/include/kernel/signaldata/DumpStateOrd.hpp
        storage/ndb/include/kernel/signaldata/NodeFailRep.hpp
        storage/ndb/src/kernel/blocks/backup/Backup.cpp
        storage/ndb/src/kernel/blocks/cmvmi/Cmvmi.cpp
        storage/ndb/src/kernel/blocks/cmvmi/Cmvmi.hpp
        storage/ndb/src/kernel/blocks/dbdict/Dbdict.cpp
        storage/ndb/src/kernel/blocks/dbdict/Dbdict.hpp
        storage/ndb/src/kernel/blocks/dbtc/Dbtc.hpp
        storage/ndb/src/kernel/blocks/dbtc/DbtcMain.cpp
        storage/ndb/src/kernel/blocks/dbtup/Dbtup.hpp
        storage/ndb/src/kernel/blocks/dbtup/DbtupGen.cpp
        storage/ndb/src/kernel/blocks/dbutil/DbUtil.cpp
        storage/ndb/src/kernel/blocks/dbutil/DbUtil.hpp
        storage/ndb/src/kernel/blocks/lgman.cpp
        storage/ndb/src/kernel/blocks/lgman.hpp
        storage/ndb/src/kernel/blocks/ndbcntr/NdbcntrMain.cpp
        storage/ndb/src/kernel/blocks/suma/Suma.cpp
        storage/ndb/src/kernel/blocks/suma/Suma.hpp
        storage/ndb/src/kernel/blocks/tsman.cpp
        storage/ndb/src/kernel/blocks/tsman.hpp
        storage/ndb/src/kernel/vm/SimulatedBlock.cpp
        storage/ndb/src/kernel/vm/SimulatedBlock.hpp
[8 Oct 2009 10:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/86119

3021 Frazer Clement	2009-10-08
      Bug#44607 : Ndb : Fragmented long signals need node failure handling code
      modified:
        storage/ndb/include/kernel/signaldata/ContinueFragmented.hpp
        storage/ndb/include/kernel/signaldata/DumpStateOrd.hpp
        storage/ndb/include/kernel/signaldata/NodeFailRep.hpp
        storage/ndb/src/kernel/blocks/backup/Backup.cpp
        storage/ndb/src/kernel/blocks/cmvmi/Cmvmi.cpp
        storage/ndb/src/kernel/blocks/cmvmi/Cmvmi.hpp
        storage/ndb/src/kernel/blocks/dbdict/Dbdict.cpp
        storage/ndb/src/kernel/blocks/dbdict/Dbdict.hpp
        storage/ndb/src/kernel/blocks/dbtc/Dbtc.hpp
        storage/ndb/src/kernel/blocks/dbtc/DbtcMain.cpp
        storage/ndb/src/kernel/blocks/dbtup/Dbtup.hpp
        storage/ndb/src/kernel/blocks/dbtup/DbtupGen.cpp
        storage/ndb/src/kernel/blocks/dbutil/DbUtil.cpp
        storage/ndb/src/kernel/blocks/dbutil/DbUtil.hpp
        storage/ndb/src/kernel/blocks/lgman.cpp
        storage/ndb/src/kernel/blocks/lgman.hpp
        storage/ndb/src/kernel/blocks/ndbcntr/NdbcntrMain.cpp
        storage/ndb/src/kernel/blocks/suma/Suma.cpp
        storage/ndb/src/kernel/blocks/suma/Suma.hpp
        storage/ndb/src/kernel/blocks/tsman.cpp
        storage/ndb/src/kernel/blocks/tsman.hpp
        storage/ndb/src/kernel/vm/SimulatedBlock.cpp
        storage/ndb/src/kernel/vm/SimulatedBlock.hpp
[8 Oct 2009 12:39] Bugs System
Pushed into 5.1.39-ndb-7.1.0 (revid:frazer@mysql.com-20091008123055-04p00c7sltllb92o) (version source revid:frazer@mysql.com-20091008123055-04p00c7sltllb92o) (merge vers: 5.1.39-ndb-7.1.0) (pib:11)
[8 Oct 2009 14:51] Frazer Clement
Fix pushed to 6.2.19, 6.3.28, 7.0.9, and 7.1.0.
[8 Oct 2009 16:49] Jon Stephens
Documented fix in the NDB-6.2.19, 6.3.28, and 7.0.9 changelogs, as follows:

        If a node failed while sending a fragmented long signal, the
        receiving node did not free long signal assembly resources that
        it had allocated for the fragments of the long signal that had
        already been received.

Closed.
[9 Oct 2009 8:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/86279
[15 Oct 2009 15:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/87008

3018 Martin Skold	2009-10-15 [merge]
      Merge
      modified:
        config/ac-macros/ha_ndbcluster.m4
        mysql-test/suite/ndb/my.cnf
        mysql-test/suite/ndb/r/ndb_config.result
        mysql-test/suite/ndb_binlog/r/ndb_binlog_variants.result
        mysql-test/suite/ndb_binlog/t/ndb_binlog_variants.test
        mysql-test/suite/ndb_team/r/ndb_dd_backuprestore.result
        sql/ha_ndbcluster.cc
        sql/ha_ndbcluster.h
        storage/ndb/include/kernel/signaldata/ContinueFragmented.hpp
        storage/ndb/include/kernel/signaldata/DumpStateOrd.hpp
        storage/ndb/include/kernel/signaldata/NodeFailRep.hpp
        storage/ndb/include/ndb_global.h.in
        storage/ndb/include/ndbapi/NdbDictionary.hpp
        storage/ndb/include/ndbapi/NdbOperation.hpp
        storage/ndb/ndbapi-examples/ndbapi_scan/ndbapi_scan.cpp
        storage/ndb/src/kernel/blocks/backup/Backup.cpp
        storage/ndb/src/kernel/blocks/backup/Backup.hpp
        storage/ndb/src/kernel/blocks/cmvmi/Cmvmi.cpp
        storage/ndb/src/kernel/blocks/cmvmi/Cmvmi.hpp
        storage/ndb/src/kernel/blocks/dbdict/Dbdict.cpp
        storage/ndb/src/kernel/blocks/dbdict/Dbdict.hpp
        storage/ndb/src/kernel/blocks/dbdict/printSchemaFile.cpp
        storage/ndb/src/kernel/blocks/dbdih/printSysfile.cpp
        storage/ndb/src/kernel/blocks/dbdih/printSysfile/printSysfile.cpp
        storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp
        storage/ndb/src/kernel/blocks/dblqh/redoLogReader/reader.cpp
        storage/ndb/src/kernel/blocks/dbtc/Dbtc.hpp
        storage/ndb/src/kernel/blocks/dbtc/DbtcMain.cpp
        storage/ndb/src/kernel/blocks/dbtup/Dbtup.hpp
        storage/ndb/src/kernel/blocks/dbtup/DbtupGen.cpp
        storage/ndb/src/kernel/blocks/dbtup/tuppage.hpp
        storage/ndb/src/kernel/blocks/dbtux/Dbtux.hpp
        storage/ndb/src/kernel/blocks/dbutil/DbUtil.cpp
        storage/ndb/src/kernel/blocks/dbutil/DbUtil.hpp
        storage/ndb/src/kernel/blocks/lgman.cpp
        storage/ndb/src/kernel/blocks/lgman.hpp
        storage/ndb/src/kernel/blocks/ndbcntr/NdbcntrMain.cpp
        storage/ndb/src/kernel/blocks/qmgr/QmgrMain.cpp
        storage/ndb/src/kernel/blocks/suma/Suma.cpp
        storage/ndb/src/kernel/blocks/suma/Suma.hpp
        storage/ndb/src/kernel/blocks/trix/Trix.hpp
        storage/ndb/src/kernel/blocks/tsman.cpp
        storage/ndb/src/kernel/blocks/tsman.hpp
        storage/ndb/src/kernel/vm/DLFifoList.hpp
        storage/ndb/src/kernel/vm/DLHashTable.hpp
        storage/ndb/src/kernel/vm/DLList.hpp
        storage/ndb/src/kernel/vm/DataBuffer.hpp
        storage/ndb/src/kernel/vm/SimulatedBlock.cpp
        storage/ndb/src/kernel/vm/SimulatedBlock.hpp
        storage/ndb/src/mgmapi/LocalConfig.cpp
        storage/ndb/src/mgmapi/Makefile.am
        storage/ndb/src/mgmsrv/ConfigInfo.cpp
        storage/ndb/src/ndbapi/NdbDictionary.cpp
        storage/ndb/src/ndbapi/NdbOperation.cpp
        storage/ndb/test/ndbapi/testNdbApi.cpp
        storage/ndb/test/run-test/daily-basic-tests.txt