Bug #73015 | Node restarts while another node is already starting may crash 'master' | ||
---|---|---|---|
Submitted: | 16 Jun 2014 11:32 | Modified: | 20 Jun 2014 15:19 |
Reporter: | Ole John Aske | | |
Status: | Closed | | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
Version: | 7.3.6 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
[16 Jun 2014 11:32]
Ole John Aske
[16 Jun 2014 13:55]
Ole John Aske
Posted by developer:

Adding some text from the commit comments that may be helpful when later documenting this fix:

...............

A regression was introduced in the fix for bug#16007980 DATA NODE STUCK IN PHASE 1 WHEN OTHER NODE LOSES NETWORK.

That fix added the following code to Qmgr::failReportLab():

    +  /**
    +   * If any node is starting now (c_start.startNode != 0)
    +   *   sendPrepFailReq to that too
    +   */
    +  if (c_start.m_startNode != 0)
    +  {
    +    jam();
    +    cfailedNodes[cnoFailedNodes++] = c_start.m_startNode;
    +    c_start.reset();
    +  }

However, we could already have been notified about the failure of the same node through any of the other 'channels' which handle failures or disconnects. Thus, we could end up with duplicates of the same NodeId in cfailedNodes[].

Later, this 'list of nodes' is converted into a BitMask used in the PREP_FAILREQ signal, and converted back into a 'list of nodes' by Qmgr::execPREP_FAILREQ(). During this BitMask conversion, the duplicated NodeId is eliminated. However, **the 'noOfNodes' count is kept unchanged**. Thus we end up with a materialized 'list of nodes' whose size is off by one, and whose last item contains garbage.
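The list-to-BitMask round trip described above can be sketched in isolation. This is only an illustration and not the NDB source: `kMaxNodes`, `NodeMask`, `toMask`, `toList`, `staleEntries`, and `appendUnique` are hypothetical names, and `std::bitset` stands in for the BitMask carried in the signal. The duplicate check in `appendUnique` is one plausible way to avoid the mismatch; the commit comments above do not say how the actual fix was implemented.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical upper bound on node IDs, chosen for this sketch only.
constexpr std::size_t kMaxNodes = 64;
using NodeMask = std::bitset<kMaxNodes>;

// Pack a list of node IDs into a bitmask.
// A node ID that appears twice still sets only one bit.
NodeMask toMask(const std::vector<unsigned>& nodes) {
  NodeMask mask;
  for (unsigned id : nodes) {
    assert(id < kMaxNodes);
    mask.set(id);
  }
  return mask;
}

// Unpack a bitmask into a list of node IDs in ascending order.
std::vector<unsigned> toList(const NodeMask& mask) {
  std::vector<unsigned> nodes;
  for (std::size_t id = 0; id < kMaxNodes; ++id) {
    if (mask.test(id)) {
      nodes.push_back(static_cast<unsigned>(id));
    }
  }
  return nodes;
}

// How many entries a receiver would read past the end of the unpacked
// list if it trusted the sender's original count instead of recounting.
std::size_t staleEntries(const std::vector<unsigned>& sent) {
  return sent.size() - toMask(sent).count();
}

// Assumed mitigation for this sketch: append a node ID only if it is
// not already in the list. Returns true if the ID was appended.
bool appendUnique(std::vector<unsigned>& nodes, unsigned id) {
  for (unsigned existing : nodes) {
    if (existing == id) {
      return false;
    }
  }
  nodes.push_back(id);
  return true;
}
```

With a sender list of `{3, 5, 3}`, the mask has two bits set and `toList` yields two entries, so a receiver that still uses the count 3 reads one entry too many. Building the list with `appendUnique`, or recounting from the mask on the receiving side, keeps the count and the list in agreement.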
[20 Jun 2014 15:19]
Jon Stephens
Documented fix as follows in the NDB 7.1.32, 7.2.17, and 7.3.6 changelogs:

    Processing a NODE_FAILREP signal that contained an invalid node ID could cause a data node to fail. Regression of BUG#16007980.

Closed.

Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at http://dev.mysql.com/doc/en/installing-source.html