Bug #111932 Scans of multi ranges can lose fragment locks for second and onwards ranges
Submitted: 31 Jul 2023 23:29
Modified: 1 Aug 2023 5:51
Reporter: Mikael Ronström
Status: Verified
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 8.0.23
OS: Any
Assigned to:
CPU Architecture: Any

[31 Jul 2023 23:29] Mikael Ronström
Description:
When performing a multi-range scan, accScanCloseConfLab uses the following code:
  if((tcConnectptr.p->primKeyLen > 0) && 
     (scanPtr->scanCompletedStatus != ZTRUE))
  {
    jam();
    /* Start next range scan...*/
    m_scan_direct_count++;
    continueAfterReceivingAllAiLab(signal, tcConnectptr);
    release_frag_access(prim_tab_fragptr.p);
    return;
  }

The problem is that the second and later ranges are scanned after the
return, not inside the function continueAfterReceivingAllAiLab, so the
fragment access has already been released when those scans run.

Thus the call to release_frag_access needs to be removed.

There will be other calls to release_frag_access handling the release
when the scan is done or needs to take a real-time break.
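
To make the ordering problem concrete, here is a minimal standalone sketch.
It is a toy model, not the NDB code: the Fragment struct, the scan_ranges
function and its release_between_ranges flag are hypothetical names invented
for illustration, and acquire_frag_access is assumed here as the counterpart
of release_frag_access. It only records whether fragment access is still
held at the point where each range is scanned.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a fragment record (not the NDB structure).
struct Fragment {
  bool access_held = false;  // true while the scanning thread holds access
};

void acquire_frag_access(Fragment& f) { f.access_held = true; }
void release_frag_access(Fragment& f) { f.access_held = false; }

// Simulates a multi-range scan over num_ranges ranges. Element r of the
// result says whether fragment access was held while range r was scanned.
// release_between_ranges == true models the reported early release.
std::vector<bool> scan_ranges(Fragment& frag, int num_ranges,
                              bool release_between_ranges) {
  assert(num_ranges > 0);
  std::vector<bool> held_during_range;
  acquire_frag_access(frag);
  for (int r = 0; r < num_ranges; ++r) {
    // The scan of range r runs here.
    held_during_range.push_back(frag.access_held);
    const bool more_ranges = (r + 1 < num_ranges);
    if (more_ranges && release_between_ranges) {
      // Models the reported call: access is released although the next
      // range has not been scanned yet.
      release_frag_access(frag);
    }
  }
  // Models the release that happens when the whole scan is done.
  release_frag_access(frag);
  return held_during_range;
}
```

In this model, a scan of three ranges with the early release enabled holds
access only for the first range. With the early release disabled, access is
held for all three ranges and is still released once at the end.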

How to repeat:
Very difficult to reproduce. It only happens in Query threads: the LDM
thread is protected because it is the only thread that can update the
ordered index. The ordered index must also be updated at the exact time
of the multi-range scan, and the multi-range scan must cover at least a
few ranges.

Suggested fix:
Remove the line that calls release_frag_access in the function shown above,
accScanCloseConfLab.
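
Applied to the snippet from the description, the change would look roughly
like the fragment below. This is only a sketch of that one block: the rest
of accScanCloseConfLab is not shown and is assumed to stay unchanged, and
the added comment is illustrative wording, not text from the source tree.

```cpp
  if((tcConnectptr.p->primKeyLen > 0) &&
     (scanPtr->scanCompletedStatus != ZTRUE))
  {
    jam();
    /* Start next range scan...*/
    m_scan_direct_count++;
    continueAfterReceivingAllAiLab(signal, tcConnectptr);
    /* No release_frag_access() here: the next range is scanned after
       this return, so fragment access must still be held. It is
       released when the scan is done or takes a real-time break. */
    return;
  }
```
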
[1 Aug 2023 5:51] MySQL Verification Team
Hello Mikael,

Thank you for the report and feedback.

Sincerely,
Umesh