| Bug #58240 | Ndb : Multi-node failure handling in QMGR can send invalid signal | ||
|---|---|---|---|
| Submitted: | 16 Nov 2010 17:49 | Modified: | 6 Jan 2011 6:09 |
| Reporter: | Frazer Clement | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S3 (Non-critical) |
| Version: | mysql-5.1-telco-6.3 | OS: | Any |
| Assigned to: | Frazer Clement | CPU Architecture: | Any |
[16 Nov 2010 17:53]
Frazer Clement
Patch highlighting interesting areas
Attachment: bug58240.patch (text/x-patch), 1.32 KiB.
[19 Nov 2010 12:46]
Jonas Oreland
is there a fix coming ?
[13 Dec 2010 10:02]
Frazer Clement
Proposed patch against 6.3
Attachment: bug58240.patch (text/x-patch), 2.30 KiB.
[13 Dec 2010 10:03]
Frazer Clement
Bug is more serious for larger node count clusters, where multi node failures are more likely, and more likely to leave enough nodes for a viable surviving cluster. In this scenario, this bug could kill enough of the surviving nodes to make cluster survival impossible.
[13 Dec 2010 14:49]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/126654 3362 Frazer Clement 2010-12-13 Bug#58240 Ndb : Multi-node failure handling in QMGR can send invalid signal Fix multi-node failure handling code to avoid invalid signal construction.
[13 Dec 2010 15:50]
Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.21 (revid:frazer@mysql.com-20101213153450-5czlh089vpqfo7v1) (version source revid:frazer@mysql.com-20101213153450-5czlh089vpqfo7v1) (merge vers: 5.1.51-ndb-7.0.21) (pib:23)
[13 Dec 2010 15:55]
Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.51-ndb-6.3.40 (revid:frazer@mysql.com-20101213152410-ozb61hffqx17fkex) (version source revid:frazer@mysql.com-20101213144826-vg323kt6tlb7656h) (merge vers: 5.1.51-ndb-6.3.40) (pib:23)
[20 Dec 2010 9:55]
Jonas Oreland
pushed to 6.3.40, 7.0.21 and 7.1.10
[6 Jan 2011 6:09]
Jon Stephens
Documented fix in the NDB-6.3.40, 7.0.21, and 7.1.10 changelogs as follows:
When handling failures of multiple data nodes, an error in the
construction of internal signals could cause the cluster's
remaining nodes to crash. This issue was most likely to affect
clusters with large numbers of data nodes.
Closed.

Description: When testing multi-node failure scenarios, noted problem with Failed nodeid received in signal PREP_FAILREQ. Specifically there is a junk nodeid. Adding more assertions reveals that the root of the problem is in CLOSE_COMCONF handling code in QMGR which stores the same nodeid in the c_failedNodes array twice. This is then used to build a bitmap when passed to receivers of PREP_FAILREQ and they read junk as the bitmap has less set bits than the NumberOfFailedNodes indicates. Code is not commented, but guess that : Node receives PREP_FAILREQ from president for node(s) A,B Node sends CLOSE_COMREQ to CMVMI ... Node receives CLOSE_COMCONF from CMVMI Node checks whether CLOSE_COMCONF included all failed nodes If so, send PREP_FAILCONF to president If not, send PREP_FAILREF to president - there's more failures for it to commit. 'all failed nodes' is normally maintained by FAIL_REP handling. It's not clear that this bug is serious, as generally it is seen in a situation where the cluster is likely to fail in any case (too many nodes have failed). How to repeat: Multi-node simultaneous failure situation (suggest 4 or more nodes). Surviving nodes processing asynchronous failure notifications. Sometimes QMGR asserts. Suggested fix: There's a dubious looking loop in Qmgr::execCLOSE_COMCONF() which starts from 1 for no known reason : if (tprepFailConf == ZFALSE) { jam(); for (Tindex = 1; Tindex < MAX_NDB_NODES; Tindex++) { cfailedNodes[Tindex] = cprepFailedNodes[Tindex]; }//for This will copy prepFailedNodes to cfailedNodes, except for the first entry! As far as I can see, this should start at 0. Other dubious code here : cprepFailedNodes array overwritten with contents of CLOSE_COMREQ bitmask, and new size stored in local arrayIndex. Then, later, cprepFailedNodes updated using cnoPrepFailedNodes as an index which appears unrelated to arrayIndex. Suggest : 1) Get agreement of semantics of this code 2) Re-implement with comments.