Bug #46490 Full table scan hangs with more than 21 fragments
Submitted: 31 Jul 2009 12:12 Modified: 6 Aug 2009 8:13
Reporter: Oli Sennhauser
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S2 (Serious)
Version: mysql-5.1-telco-6.2 OS: Any
Assigned to: Frazer Clement CPU Architecture: Any
Tags: cluster, fragment, full, hang, scan, table

[31 Jul 2009 12:12] Oli Sennhauser
Description:
MySQL Cluster hangs on full table scans and eventually returns an error:

ndb_select_all cache_update -d vm_vs > cache_update.select
ERROR: 4008 Receive from NDB failed
           Status: Unknown result, Classification: Unknown result error
           File: select_all.cpp (Line: 442)

How to repeat:
There seems to be a rule not to create more than 4 fragments per node, so you need more than 4 nodes to reproduce.
Then load some data and run some scans.
Sooner or later a scan should hang.
In a debug build, the 'unhandled signals after execute()' condition results in a node crash.

Suggested fix:
Frazer is looking...
[4 Aug 2009 10:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/79988

2960 Frazer Clement	2009-08-04
      Bug#46490 : Full table scan hangs with more than 21 fragments
      modified:
        storage/ndb/include/kernel/signaldata/ScanTab.hpp
        storage/ndb/src/kernel/blocks/dbtc/DbtcMain.cpp
[4 Aug 2009 10:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/79991

2960 Frazer Clement	2009-08-04
      Bug#46490 : Full table scan hangs with more than 21 fragments
      modified:
        storage/ndb/include/kernel/signaldata/ScanTab.hpp
        storage/ndb/src/kernel/blocks/dbtc/DbtcMain.cpp
[5 Aug 2009 12:59] Jonas Oreland
Pushed to 6.2.19, 6.3.26 and 7.0.7.
[6 Aug 2009 8:13] Jon Stephens
Documented bugfix in the NDB-6.2.19, 6.3.26, and 7.0.7 changelogs as follows:

    Full table scans failed with more than 21 fragments. 

    The number of fragments is the number of data nodes, times 8
    (that is, MAX_FRAG_PER_NODE), divided by the number of replicas. 
    Thus, when NoOfReplicas = 1 at least 3 data nodes were required 
    to trigger this issue, and when NoOfReplicas = 2 at least 4 data 
    nodes were required to do so.
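
The fragment arithmetic in the changelog entry can be checked with a short script. This is only a sketch that evaluates the formula exactly as quoted (data nodes, times MAX_FRAG_PER_NODE, divided by the number of replicas) against the 21-fragment limit. The function names are hypothetical and are not part of NDB. The check that the data node count is a multiple of NoOfReplicas is an assumption about valid cluster layouts. Taken literally, the formula gives 16 fragments for 4 data nodes with NoOfReplicas = 2 and 24 for 6, so the 4-node figure above presumably rests on details the formula does not capture.

```python
# Sketch: evaluate the changelog's fragment formula. Names are illustrative only.

MAX_FRAG_PER_NODE = 8   # value of the internal constant, as stated in the changelog
FRAGMENT_LIMIT = 21     # scans failed with more than this many fragments


def fragment_count(data_nodes: int, replicas: int) -> int:
    """Fragments = data nodes * MAX_FRAG_PER_NODE / replicas (formula as quoted)."""
    if data_nodes < 1 or replicas < 1:
        raise ValueError("data_nodes and replicas must be positive")
    # Assumption: the data node count is a whole multiple of NoOfReplicas.
    if data_nodes % replicas != 0:
        raise ValueError("data_nodes must be a multiple of replicas")
    return data_nodes * MAX_FRAG_PER_NODE // replicas


def exceeds_limit(data_nodes: int, replicas: int) -> bool:
    """True if the formula yields more than FRAGMENT_LIMIT fragments."""
    return fragment_count(data_nodes, replicas) > FRAGMENT_LIMIT


if __name__ == "__main__":
    for replicas in (1, 2):
        for nodes in range(replicas, 9, replicas):
            count = fragment_count(nodes, replicas)
            flag = "over limit" if exceeds_limit(nodes, replicas) else "ok"
            print(f"NoOfReplicas={replicas} data_nodes={nodes} fragments={count} ({flag})")
```

Running the script prints one line per layout, which makes it easy to see where the formula first crosses the limit for each replica count.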