Bug #105616 Saved corrupt version of COPY_FRAGREQ
Submitted: 17 Nov 2021 14:38 Modified: 11 Mar 2022 15:07
Reporter: Mikael Ronström
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S3 (Non-critical)
Version: 8.0.23 OS: Any
Assigned to: CPU Architecture: Any

[17 Nov 2021 14:38] Mikael Ronström
When receiving COPY_FRAGREQ, we sometimes put the signal into a queue.
We do this by copying the signal object into a stored object.
However, the signal object can sometimes be reused to send another signal
before this copy is made. The stored copy is then corrupt, and when it is
later sent it causes a crash during the system restart.
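
The ordering problem can be sketched in a few lines. This is only an illustration under stated assumptions: Signal, StoredCopyFragReq, send_other_signal and queue_too_late are hypothetical stand-ins, not the real NDB types or DBLQH code.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration; these are not the real NDB types.
struct Signal {
    std::uint32_t theData[8];    // shared buffer reused for every send
};

struct StoredCopyFragReq {       // what ends up parked in the queue
    std::uint32_t tableId;
    std::uint32_t fragId;
};

// Overwrites the shared buffer, as sending any other signal would.
inline void send_other_signal(Signal& signal) {
    signal.theData[0] = 0xDEAD;
    signal.theData[1] = 0xBEEF;
}

// Buggy order: the request is read through a pointer into the signal
// buffer only after that buffer has been reused for another signal.
inline StoredCopyFragReq queue_too_late(Signal& signal) {
    const std::uint32_t* req = signal.theData;  // alias, not a copy
    send_other_signal(signal);                  // clobbers words 0 and 1
    return StoredCopyFragReq{req[0], req[1]};   // stores the clobbered words
}
```

Calling queue_too_late on a signal holding tableId 7 and fragId 3 stores 0xDEAD and 0xBEEF instead, which is the kind of corrupt saved request described above.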

How to repeat:
testNodeRestart -n ChangeNumLDMsNR T1 D2

Suggested fix:
Save a copy of the CopyFragReq object from the signal object
and let the pointer point to this static copy rather than to
the signal object, as is done in execCREATE_TAB_REQ in DbtuxMeta.cpp.
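
A minimal sketch of the suggested pattern, assuming hypothetical types: Signal, CopyFragReqLike, send_other_signal and queue_with_snapshot are illustrative names only and do not reproduce the actual patch or the code in DbtuxMeta.cpp.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for illustration; these are not the real NDB types.
struct Signal {
    std::uint32_t theData[8];    // shared buffer reused for every send
};

struct CopyFragReqLike {
    std::uint32_t tableId;
    std::uint32_t fragId;
};

// Overwrites the shared buffer, as sending any other signal would.
inline void send_other_signal(Signal& signal) {
    signal.theData[0] = 0xDEAD;
    signal.theData[1] = 0xBEEF;
}

// Snapshot the request first, then work only through a pointer to the copy.
inline CopyFragReqLike queue_with_snapshot(Signal& signal) {
    static CopyFragReqLike saved;                        // static copy of the request
    std::memcpy(&saved, signal.theData, sizeof(saved));  // taken before any reuse
    const CopyFragReqLike* req = &saved;                 // points at the copy, not the signal
    send_other_signal(signal);                           // reuse is now harmless
    return *req;                                         // still the original request
}
```

With the copy taken before the signal object is reused, the stored request keeps its original field values even though the signal buffer itself has been overwritten.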
[17 Nov 2021 15:52] MySQL Verification Team
Thanks for the report, Mikael.

All best
[11 Mar 2022 15:07] Jon Stephens
Documented fix as follows in the NDB 8.0.30 changelog:

    After receiving a COPY_FRAGREQ signal, DBLQH sometimes places
    the signal in a queue by copying the signal object into a stored
    object. Problems could arise when this signal object was used to
    send another signal before the incoming signal was stored; this
    led to saving a corrupt signal that, when sent, prevented a
    system restart from completing. We fix this by using a static
    copy of the signal for storage and retrieval, rather than the
    original signal object.