Bug #73843 Enhanced detection and dump of corrupt or unsupported messages
Submitted: 8 Sep 2014 23:48 Modified: 14 Oct 2014 13:43
Reporter: Mauritz Sundell
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S4 (Feature request)
Version: 7.1 OS: Any
Assigned to:
CPU Architecture: Any

[8 Sep 2014 23:48] Mauritz Sundell
Description:
There have been occurrences of corrupt messages sent to data nodes. If a message is detected as corrupt, the sender is disconnected and arbitration takes place. But if the corruption is not detected, a bad signal can be delivered to a block, which then aborts the data node; in combination with disconnecting nodes, this can take the cluster down.

To lower the risk of the cluster going down, more checks should be added.

How to repeat:
Inject corruption

Suggested fix:
Add checks that only a supported byte order and no compression are used.
Also check the next message, if possible.
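
The suggested checks could look roughly like the Python sketch below. The header layout, the bit positions, the `MAX_WORDS` limit, and the function names are all assumptions made for illustration; they are not the actual NDB wire format or API.

```python
import struct

# Hypothetical header layout, for illustration only (NOT the real NDB format):
#   bit 0      byte-order flag (0 = the only order this receiver supports)
#   bit 1      compression flag (must be 0)
#   bits 8-23  message length in 32-bit words, header word included
BYTE_ORDER_BIT = 0x1
COMPRESSED_BIT = 0x2
MAX_WORDS = 8192  # assumed upper bound on message length


def message_words(word):
    """Extract the (hypothetical) length field from a header word."""
    return (word >> 8) & 0xFFFF


def check_header(word):
    """Return None if the header word looks sane, else a short error string."""
    if word & BYTE_ORDER_BIT:
        return "unsupported byte order"
    if word & COMPRESSED_BIT:
        return "compression flag set"
    if not 1 <= message_words(word) <= MAX_WORDS:
        return "bad message length"
    return None


def check_message(buf, offset=0):
    """Check the message at offset and, if already buffered, the next header."""
    if len(buf) - offset < 4:
        return None  # header not complete yet, nothing to check
    (word,) = struct.unpack_from("<I", buf, offset)
    error = check_header(word)
    if error:
        return error
    next_offset = offset + 4 * message_words(word)
    if next_offset + 4 <= len(buf):
        # The next header is in the buffer, so it can be checked as well.
        (next_word,) = struct.unpack_from("<I", buf, next_offset)
        next_error = check_header(next_word)
        if next_error:
            return "next message: " + next_error
    return None
```

Looking at the next header as well helps catch a corrupted length field in the current message, since a wrong length usually makes the following header land on arbitrary bytes.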
[14 Oct 2014 13:43] Jon Stephens
Thank you for your bug report. The fix for this issue has been committed to our source repository for that product and will be incorporated into the next release.
Documented as follows in the NDB 7.1.34, 7.2.19, 7.3.8, and 7.4.2 changelogs:

    Corrupted messages to data nodes sometimes went undetected,
    causing a bad signal to be delivered to a block which aborted
    the data node. This failure in combination with disconnecting
    nodes could in turn cause the entire cluster to shut down.

    To keep this from happening, additional checks are now made when
    unpacking signals received over TCP, including checks for byte
    order, compression flag (which must not be used), and the length
    of the next message in the receive buffer (if there is one).

    Whenever two consecutive unpacked messages fail the checks just
    described, the current message is assumed to be corrupted. In
    this case, the transporter is marked as having bad data and no
    more unpacking of messages occurs until the transporter is
    reconnected. In addition, an entry is written to the cluster log
    containing the error as well as a hex dump of the corrupted
    message.
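
The "bad data" behavior described in the changelog entry can be pictured with the following Python sketch. It is only an illustration: the `Transporter` class, the `validate` and `deliver` callbacks, and the 32-byte dump limit are hypothetical and do not reflect the actual NDB source code.

```python
import binascii


class Transporter:
    """Hypothetical receive-side state for one connection (not the NDB class)."""

    def __init__(self, node_id, log):
        self.node_id = node_id
        self.log = log          # any callable taking one string, e.g. print
        self.bad_data = False   # set once a message is assumed corrupted

    def reconnect(self):
        # A new connection starts with a clean stream, so unpacking may resume.
        self.bad_data = False

    def unpack(self, buf, validate, deliver):
        """Unpack complete messages from buf; return the bytes consumed.

        validate(buf, offset) must return (length_in_bytes, error_or_None);
        a length of 0 means the message is not complete yet.
        """
        offset = 0
        while not self.bad_data and offset < len(buf):
            length, error = validate(buf, offset)
            if error is not None:
                # Stop unpacking until reconnect and log a hex dump
                # (limited here to an assumed 32 bytes).
                self.bad_data = True
                dump = binascii.hexlify(buf[offset:offset + 32]).decode("ascii")
                self.log("node %d: %s; message starts with: %s"
                         % (self.node_id, error, dump))
                break
            if length == 0 or offset + length > len(buf):
                break  # incomplete message, wait for more data
            deliver(buf[offset:offset + length])
            offset += length
        return offset
```

Once `bad_data` is set, further calls to `unpack` consume nothing until `reconnect` is called, which mirrors the rule that no more messages are unpacked until the transporter is reconnected.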

Closed. 

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html