MySQL Bugs: #73843: Enhanced detection and dump of corrupt or unsupported messages

Bug #73843	Enhanced detection and dump of corrupt or unsupported messages
Submitted:	8 Sep 2014 23:48	Modified:	14 Oct 2014 13:43
Reporter:	Mauritz Sundell
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S4 (Feature request)
Version:	7.1	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
There have been occurrences of corrupt messages sent to data nodes.  If a message is detected as corrupt, the sender is disconnected and arbitration takes place.  But if the corruption is not detected, a bad signal can be delivered to a block, which then aborts the data node.  In combination with disconnecting nodes, this can take the whole cluster down.

To lower the risk of the cluster going down, more checks should be added.

How to repeat:
Inject corruption into messages sent to a data node.
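
One simple way to inject corruption in a test harness is to invert a single bit in the raw bytes of a message before it is handed to the receiver. The sketch below is only an illustration under that assumption: `flip_random_bit` is a hypothetical helper, not part of NDB, and the 16-byte buffer stands in for a real message.

```python
import random


def flip_random_bit(message: bytes, rng: random.Random) -> bytes:
    """Return a copy of message with exactly one randomly chosen bit inverted."""
    if not message:
        raise ValueError("cannot corrupt an empty message")
    corrupted = bytearray(message)
    bit = rng.randrange(len(corrupted) * 8)   # pick one bit position in the buffer
    corrupted[bit // 8] ^= 1 << (bit % 8)     # invert that bit only
    return bytes(corrupted)


rng = random.Random(73843)         # fixed seed so a failing run can be repeated
original = bytes(range(16))        # placeholder for a real message
damaged = flip_random_bit(original, rng)
```

Using a seeded generator keeps the injected fault reproducible, which helps when a particular bit position is what triggers the failure.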

Suggested fix:
Add checks that only the supported byte order is used and that compression is not used.
Also check the next message, if possible.

Thank you for your bug report. A fix for this issue has been committed to our source repository for that product and will be incorporated into the next release.
Documented as follows in the NDB 7.1.34, 7.2.19, 7.3.8, and 7.4.2 changelogs:

    Corrupted messages to data nodes sometimes went undetected,
    causing a bad signal to be delivered to a block which aborted
    the data node. This failure in combination with disconnecting
    nodes could in turn cause the entire cluster to shut down.

    To keep this from happening, additional checks are now made when
    unpacking signals received over TCP, including checks for byte
    order, compression flag (which must not be used), and the length
    of the next message in the receive buffer (if there is one).

    Whenever two consecutive unpacked messages fail the checks just
    described, the current message is assumed to be corrupted. In
    this case, the transporter is marked as having bad data and no
    more unpacking of messages occurs until the transporter is
    reconnected. In addition, an entry is written to the cluster log
    containing the error as well as a hex dump of the corrupted
    message.
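
The checks described in the changelog entry can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not the NDB wire format or the actual fix: the header is a hypothetical single 32-bit big-endian word (bit 31 = unsupported byte order, bit 30 = compressed, low 16 bits = message length in words, remaining bits ignored), and `Transporter`, `check_header`, `unpack`, `make_message`, `MAX_WORDS`, and the 32-byte dump limit are invented names and values. The sketch also takes one reading of the entry: a message is rejected if its own header fails the checks or if the header that follows it in the buffer does.

```python
import struct

HEADER_BYTES = 4      # hypothetical: one 32-bit big-endian header word
MAX_WORDS = 8192      # hypothetical upper bound on message length, in words


class Transporter:
    """Toy stand-in for a transporter; holds only the state this sketch needs."""

    def __init__(self):
        self.bad_data = False
        self.cluster_log = []

    def reconnect(self):
        self.bad_data = False


def check_header(buf, offset):
    """Return (length_in_bytes, None) for a plausible header, else (0, reason)."""
    (word,) = struct.unpack_from(">I", buf, offset)
    if word & 0x80000000:
        return 0, "unsupported byte order"
    if word & 0x40000000:
        return 0, "compression flag set"
    words = word & 0xFFFF
    if not 1 <= words <= MAX_WORDS:
        return 0, f"invalid length of {words} words"
    return words * 4, None


def unpack(transporter, buf):
    """Unpack complete messages from buf; return (messages, bytes_consumed)."""
    messages, offset = [], 0
    while not transporter.bad_data and len(buf) - offset >= HEADER_BYTES:
        length, error = check_header(buf, offset)
        end = offset + length
        if error is None:
            if end > len(buf):
                break                          # incomplete message: wait for more data
            if len(buf) - end >= HEADER_BYTES:
                # Also check the next message's header, if there is one.
                _, next_error = check_header(buf, end)
                if next_error is not None:
                    error = "next message: " + next_error
        if error is not None:
            # Mark the transporter and log the error with a hex dump.
            transporter.bad_data = True
            dump = buf[offset:offset + 32].hex()
            transporter.cluster_log.append(f"{error}; hex dump: {dump}")
            break
        messages.append(buf[offset:end])
        offset = end
    return messages, offset


def make_message(payload_words, flags=0):
    """Build a test message: header word followed by zero-filled payload words."""
    return struct.pack(">I", flags | (1 + payload_words)) + bytes(4 * payload_words)
```

Once `bad_data` is set, `unpack` returns no further messages until `reconnect()` is called, mirroring the behavior the changelog describes for a transporter marked as having bad data.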

Closed. 

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html