MySQL Bugs: #46490: Full table scan hangs with more than 21 fragments

Bug #46490	Full table scan hangs with more than 21 fragments
Submitted:	31 Jul 2009 12:12	Modified:	6 Aug 2009 8:13
Reporter:	Oli Sennhauser
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	mysql-5.1-telco-6.2	OS:	Any
Assigned to:	Frazer Clement	CPU Architecture:	Any
Tags:	cluster, fragment, full, hang, scan, table

Description:
MySQL Cluster hangs on full table scans and eventually returns an error:

ndb_select_all cache_update -d vm_vs > cache_update.select
ERROR: 4008 Receive from NDB failed
           Status: Unknown result, Classification: Unknown result error
           File: select_all.cpp (Line: 442)

How to repeat:
There seems to be a rule not to create more than 4 fragments per node, so you need more than 4 nodes to reproduce this.
Then load some data and run some scans.
Sooner or later you should get a hang.
In a debug build, the 'unhandled signals after execute()' condition results in a node crash.
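
Since the hang only shows up "sooner or later", it may help to repeat the scan under a timeout. The following is a minimal sketch, not part of the original report: the helper names scan_completes and first_failed_attempt are hypothetical, and it assumes that a hung or failed scan can be recognised by a timeout or a non-zero exit status.

```python
import subprocess


def scan_completes(cmd, timeout_s):
    """Run cmd once; return True only if it exits with status 0 in time.

    A scan that exceeds timeout_s seconds is killed and counted as a hang.
    Treating a non-zero exit status as a failure is an assumption.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,  # discard the scanned rows
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def first_failed_attempt(cmd, attempts, timeout_s):
    """Repeat the scan; return the 1-based index of the first hang or
    failure, or None if every attempt completed."""
    for attempt in range(1, attempts + 1):
        if not scan_completes(cmd, timeout_s):
            return attempt
    return None
```

With the command from the description, a call could look like first_failed_attempt(["ndb_select_all", "cache_update", "-d", "vm_vs"], attempts=20, timeout_s=120). The attempt count and timeout here are arbitrary illustration values.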

Suggested fix:
Frazer is looking...

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/79988

2960 Frazer Clement	2009-08-04
      Bug#46490 : Full table scan hangs with more than 21 fragments
      modified:
        storage/ndb/include/kernel/signaldata/ScanTab.hpp
        storage/ndb/src/kernel/blocks/dbtc/DbtcMain.cpp

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/79991

2960 Frazer Clement	2009-08-04
      Bug#46490 : Full table scan hangs with more than 21 fragments
      modified:
        storage/ndb/include/kernel/signaldata/ScanTab.hpp
        storage/ndb/src/kernel/blocks/dbtc/DbtcMain.cpp

Pushed to 6.2.19, 6.3.26, and 7.0.7.

Documented bugfix in the NDB-6.2.19, 6.3.26, and 7.0.7 changelogs as follows:

    Full table scans failed with more than 21 fragments. 

    The number of fragments is the number of data nodes, times 8
    (that is, MAX_FRAG_PER_NODE), divided by the number of replicas. 
    Thus, when NoOfReplicas = 1 at least 3 data nodes were required 
    to trigger this issue, and when NoOfReplicas = 2 at least 4 data 
    nodes were required to do so.
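
The arithmetic in the changelog entry can be checked with a short sketch. It is a literal reading of the formula as quoted, not NDB code: the function names are hypothetical, and only the constant 8 (MAX_FRAG_PER_NODE) and the limit of 21 come from the text above.

```python
MAX_FRAG_PER_NODE = 8  # value quoted in the changelog entry
FRAGMENT_LIMIT = 21    # scans failed with more than this many fragments


def fragment_count(data_nodes, replicas):
    """Fragments = data nodes * MAX_FRAG_PER_NODE / replicas, as quoted."""
    return data_nodes * MAX_FRAG_PER_NODE // replicas


def exceeds_limit(data_nodes, replicas):
    """True if the computed fragment count is above the 21-fragment limit."""
    return fragment_count(data_nodes, replicas) > FRAGMENT_LIMIT
```

For NoOfReplicas = 1 this agrees with the changelog: 2 data nodes give 16 fragments and 3 give 24, the first count above 21. For NoOfReplicas = 2 the literal formula gives 16 fragments at 4 data nodes and 24 at 6, so the changelog's figure of 4 nodes presumably rests on a detail that this sketch does not model.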