Bug #116216 restart bug in mysql NDB cluster
Submitted: 25 Sep 1:43 Modified: 25 Sep 8:53
Reporter: CunDi Fang
Status: Verified
Category:MySQL Cluster: Cluster/J Severity:S2 (Serious)
Version:8.0.35-cluster MySQL Cluster Community S OS:Any
Assigned to: CPU Architecture:Any

[25 Sep 1:43] CunDi Fang
Description:
Restarting the entire cluster while a data node restart is still in progress fails: the cluster does not start up properly, and eventually all the data nodes crash and need to be restarted.

How to repeat:
My NDB cluster has 1 management node and 4 data nodes, with 2 replicas and therefore 2 node groups. I restart one of the data nodes and, while that node is still rejoining the cluster, restart the entire cluster. All of the data nodes then get stuck repeatedly trying to restart and connect; none of the attempts succeed, and all of the nodes eventually crash.
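The report does not say which commands were used. A minimal sketch of the sequence described above, assuming the restarts are issued through the `ndb_mgm` management client against the management node at 192.192.10.8 (port 1186 is assumed), that node 4 is restarted first as in the log, and that a 5-second delay lands inside the node restart window. The helper names `mgm_argv`, `repro_plan`, and `run_repro` are made up for this illustration:

```python
import subprocess
import time

# Management node address from the report; port 1186 is an assumption.
MGM_CONNECT = "192.192.10.8:1186"


def mgm_argv(command, connect=MGM_CONNECT):
    """Build the argv for one non-interactive ndb_mgm command."""
    return ["ndb_mgm", "-c", connect, "-e", command]


def repro_plan(node_id=4):
    """Commands in the order described: single-node restart, then full restart."""
    return [mgm_argv(f"{node_id} RESTART"), mgm_argv("ALL RESTART")]


def run_repro(node_id=4, delay_s=5.0):
    """Run the plan against a live cluster (requires ndb_mgm on PATH)."""
    first, second = repro_plan(node_id)
    subprocess.run(first, check=True)
    # Assumed delay: it has to elapse before the node restart has finished.
    time.sleep(delay_s)
    subprocess.run(second, check=True)
```

Calling `repro_plan()` only builds the command lines, so they can be inspected before `run_repro()` is pointed at a real cluster. The delay probably needs tuning per environment.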

The cluster log recorded the following:
```
2024-09-23 03:57:44 [MgmtSrvr] INFO     -- Node 4: Node shutdown initiated
2024-09-23 03:57:52 [MgmtSrvr] INFO     -- Node 4: Suma: initiate handover for shutdown with nodes 0000000000000000000000000000000000000020 GCI: 5043627
2024-09-23 03:57:52 [MgmtSrvr] INFO     -- Node 4: Suma: handover to node 5 gci: 5043627 buckets: 00000001 (2)
2024-09-23 03:57:58 [MgmtSrvr] INFO     -- Node 4: Suma: handover complete
2024-09-23 03:57:59 [MgmtSrvr] INFO     -- Node 4 disconnected in recv with errnum: 104 in state: 0
2024-09-23 03:57:59 [MgmtSrvr] ALERT    -- Node 5: Node 4 Disconnected
2024-09-23 03:57:59 [MgmtSrvr] INFO     -- Node 2: Communication to Node 4 closed
2024-09-23 03:57:59 [MgmtSrvr] INFO     -- Node 5: Communication to Node 4 closed
2024-09-23 03:57:59 [MgmtSrvr] ALERT    -- Node 2: Node 4 Disconnected
2024-09-23 03:57:59 [MgmtSrvr] INFO     -- Node 3: Communication to Node 4 closed
2024-09-23 03:57:59 [MgmtSrvr] ALERT    -- Node 1: Node 4 Disconnected
2024-09-23 03:57:59 [MgmtSrvr] ALERT    -- Node 2: Arbitration check won - node group majority
2024-09-23 03:57:59 [MgmtSrvr] INFO     -- Node 2: President restarts arbitration thread [state=6]
2024-09-23 03:57:59 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Initial state,NEW=Node failed, fail handling ongoing
2024-09-23 03:57:59 [MgmtSrvr] INFO     -- Node 5: NR Status: node=4,OLD=Initial state,NEW=Node failed, fail handling ongoing
2024-09-23 03:57:59 [MgmtSrvr] INFO     -- Node 3: NR Status: node=4,OLD=Initial state,NEW=Node failed, fail handling ongoing
2024-09-23 03:57:59 [MgmtSrvr] INFO     -- Node 4: Node shutdown completed.
2024-09-23 03:57:59 [MgmtSrvr] ALERT    -- Node 3: Node 4 Disconnected
2024-09-23 03:58:00 [MgmtSrvr] INFO     -- Alloc node id 4 rejected with error code 1703, will retry
2024-09-23 03:58:00 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Node failed, fail handling ongoing,NEW=Node failure handling complete
2024-09-23 03:58:00 [MgmtSrvr] INFO     -- Node 5: NR Status: node=4,OLD=Node failed, fail handling ongoing,NEW=Node failure handling complete
2024-09-23 03:58:00 [MgmtSrvr] INFO     -- Node 3: NR Status: node=4,OLD=Node failed, fail handling ongoing,NEW=Node failure handling complete
2024-09-23 03:58:01 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Node failure handling complete,NEW=Allocated node id
2024-09-23 03:58:01 [MgmtSrvr] INFO     -- Alloc node id 4 succeeded
2024-09-23 03:58:01 [MgmtSrvr] INFO     -- Nodeid 4 allocated for NDB at 192.192.10.11
2024-09-23 03:58:01 [MgmtSrvr] INFO     -- Node 1: Node 4 Connected
2024-09-23 03:58:01 [MgmtSrvr] INFO     -- Node 4: Buffering maximum epochs 100
2024-09-23 03:58:01 [MgmtSrvr] INFO     -- Node 4: Start phase 0 completed (system restart)
2024-09-23 03:58:01 [MgmtSrvr] INFO     -- Node 4: Communication to Node 2 opened
2024-09-23 03:58:01 [MgmtSrvr] INFO     -- Node 4: Communication to Node 3 opened
2024-09-23 03:58:01 [MgmtSrvr] INFO     -- Node 4: Communication to Node 5 opened
2024-09-23 03:58:01 [MgmtSrvr] INFO     -- Node 4: Waiting 30 sec for nodes 2, 3 and 5 to connect, nodes [ all: 2, 3, 4 and 5 connected: 4 no-wait:  ]
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 5: Communication to Node 4 opened
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 2: Communication to Node 4 opened
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 4: Node 2 Connected
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 2: Node 4 Connected
2024-09-23 03:58:03 [MgmtSrvr] WARNING  -- Node 3: Received request to incorporate node 4, while error handling has not yet completed
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 4: CM_REGCONF president = 2, own Node = 4, our dynamic id = 0/5
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 5: Node 4 Connected
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 4: Node 5 Connected
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 3: Communication to Node 4 opened
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 3: Node 4 Connected
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 4: Node 3 Connected
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 2: Node 4: API mysql-8.0.35 ndb-8.0.35
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 3: Node 4: API mysql-8.0.35 ndb-8.0.35
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 5: Node 4: API mysql-8.0.35 ndb-8.0.35
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Allocated node id,NEW=Included in heartbeat protocol
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 4: Node 2: API mysql-8.0.35 ndb-8.0.35
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 4: Node 3: API mysql-8.0.35 ndb-8.0.35
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 4: Node 5: API mysql-8.0.35 ndb-8.0.35
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 4: Start phase 1 completed (system restart)
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 2: Start node: 4 using node restart
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Included in heartbeat protocol,NEW=NDBCNTR master permitted us
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 4: Start phase 2 completed (node restart)
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 2: DICT: locked by node 4 for NodeRestart
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 5: NR Status: node=4,OLD=Node failure handling complete,NEW=All nodes permitted us
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 3: NR Status: node=4,OLD=Node failure handling complete,NEW=All nodes permitted us
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=NDBCNTR master permitted us,NEW=All nodes permitted us
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 4: Start phase 3 completed (node restart)
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 4: Start phase 4 completed (node restart)
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=All nodes permitted us,NEW=Wait for LCP complete to copy meta data
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Wait for LCP complete to copy meta data,NEW=Copy meta data to start node
2024-09-23 03:58:03 [MgmtSrvr] INFO     -- Node 4: Receive arbitrator node 1 [ticket=bb28000145d8aadb]
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 2: Node restart completed copy of distribution information to Node 4
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: Starting to restore schema
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: Restore of schema complete
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 9 done (sys/def/8/ndb_index_stat_sample_x1)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 12 done (sys/def/11/PRIMARY)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 15 done (sys/def/14/PRIMARY)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 16 done (sys/def/14/column5)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 17 done (sys/def/14/column5$unique)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 18 done (sys/def/14/column6)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 19 done (sys/def/14/column6$unique)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 20 done (sys/def/14/column5_2)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 21 done (sys/def/14/column5_2$unique)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 23 done (sys/def/22/PRIMARY)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 24 done (sys/def/22/column3)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 25 done (sys/def/22/column3$unique)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 26 done (sys/def/22/column3_2)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 27 done (sys/def/22/column3_2$unique)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 30 done (sys/def/29/PRIMARY)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 33 done (sys/def/32/PRIMARY)
2024-09-23 03:58:04 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 34 done (sys/def/32/column3)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 35 done (sys/def/32/column3$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 36 done (sys/def/32/column8)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 37 done (sys/def/32/column8$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 38 done (sys/def/32/column4)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 39 done (sys/def/32/column4$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 40 done (sys/def/32/column5)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 41 done (sys/def/32/column5$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 42 done (sys/def/32/column3_2)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 43 done (sys/def/32/column3_2$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 46 done (sys/def/45/PRIMARY)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 47 done (sys/def/45/column2)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 48 done (sys/def/45/column2$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 49 done (sys/def/45/column3)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 50 done (sys/def/45/column3$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 51 done (sys/def/45/column4)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 52 done (sys/def/45/column4$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 53 done (sys/def/45/column2_2)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 54 done (sys/def/45/column2_2$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 57 done (sys/def/56/PRIMARY)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 59 done (sys/def/58/PRIMARY)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 60 done (sys/def/58/column9)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 61 done (sys/def/58/column9$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 62 done (sys/def/58/column3)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 63 done (sys/def/58/column3$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 64 done (sys/def/58/column7)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 65 done (sys/def/58/column7$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 66 done (sys/def/58/column3_2)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 67 done (sys/def/58/column3_2$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 69 done (sys/def/68/PRIMARY)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 70 done (sys/def/68/column2)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 71 done (sys/def/68/column2$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 72 done (sys/def/68/column6)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 73 done (sys/def/68/column6$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 74 done (sys/def/68/column2_2)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 75 done (sys/def/68/column2_2$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 84 done (sys/def/83/PRIMARY)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 86 done (sys/def/85/PRIMARY)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 87 done (sys/def/85/column2)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 88 done (sys/def/85/column2$unique)
2024-09-23 03:58:05 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 89 done (sys/def/85/column2_2)
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: DICT: activate index 90 done (sys/def/85/column2_2$unique)
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 2: Node restart completed copy of dictionary information to Node 4
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Copy meta data to start node,NEW=Include node in LCP/GCP protocols
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 3: NR Status: node=4,OLD=All nodes permitted us,NEW=Include node in LCP/GCP protocols
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 5: NR Status: node=4,OLD=All nodes permitted us,NEW=Include node in LCP/GCP protocols
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Include node in LCP/GCP protocols,NEW=Restore fragments ongoing
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: Node restart starting to copy the fragments to Node 4
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: Restore Database Off-line Starting on node 4
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: Node: 4 StartLog: [GCI Keep: 5043574 LastCompleted: 5043623 NewestRestorable: 5043629]
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: LDM instance 0: Restored LCP : 86 fragments, 763 rows, 87 millis, 8770 rows/s
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: Node Start completed restore of LCP id: 2560
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Restore fragments ongoing,NEW=Undo Disk data ongoing
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Undo Disk data ongoing,NEW=Execute REDO logs ongoing
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Execute REDO logs ongoing,NEW=Build indexes ongoing
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: Restore Database Off-line Completed on node 4
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: Bring Database On-line Starting on node 4
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Build indexes ongoing,NEW=Synchronize start node with live nodes
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 3: NR Status: node=4,OLD=Include node in LCP/GCP protocols,NEW=Synchronize start node with live nodes
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 5: NR Status: node=4,OLD=Include node in LCP/GCP protocols,NEW=Synchronize start node with live nodes
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: Node restart completed copying the fragments to Node 4
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: Bring Database On-line Completed on node 4
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: Starting REDO logging on node 4
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: LDM(0): CopyFrag complete. 156 frags, +0/-31098 rows, 1741488 bytes/163 ms 10683975 bytes/s.
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 4: LDM(0): Completed LCP, #frags = 86 #records = 439, #bytes = 31132
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 2: Make On-line Database recoverable by waiting for LCP Starting on node 4, LCP id 2561
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Synchronize start node with live nodes,NEW=Wait LCP to ensure durability
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 3: NR Status: node=4,OLD=Synchronize start node with live nodes,NEW=Wait LCP to ensure durability
2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 5: NR Status: node=4,OLD=Synchronize start node with live nodes,NEW=Wait LCP to ensure durability
2024-09-23 03:58:07 [MgmtSrvr] INFO     -- Node 2: Cluster shutdown initiated
2024-09-23 03:58:07 [MgmtSrvr] INFO     -- Node 3: Cluster shutdown initiated
2024-09-23 03:58:07 [MgmtSrvr] INFO     -- Node 5: Cluster shutdown initiated
2024-09-23 03:58:08 [MgmtSrvr] INFO     -- Node 4 disconnected in recv with errnum: 104 in state: 0
2024-09-23 03:58:08 [MgmtSrvr] INFO     -- Node 4: Node shutdown completed.
2024-09-23 03:58:08 [MgmtSrvr] ALERT    -- Node 3: Node 4 Disconnected
2024-09-23 03:58:08 [MgmtSrvr] INFO     -- Node 2: Communication to Node 4 closed
2024-09-23 03:58:08 [MgmtSrvr] INFO     -- Node 3: Communication to Node 4 closed
2024-09-23 03:58:08 [MgmtSrvr] INFO     -- Node 5: Communication to Node 4 closed
2024-09-23 03:58:08 [MgmtSrvr] ALERT    -- Node 1: Node 4 Disconnected
2024-09-23 03:58:08 [MgmtSrvr] ALERT    -- Node 2: Arbitration check won - node group majority
2024-09-23 03:58:08 [MgmtSrvr] INFO     -- Node 2: President restarts arbitration thread [state=6]
2024-09-23 03:58:08 [MgmtSrvr] INFO     -- Node 2: Removed lock for node 4
2024-09-23 03:58:08 [MgmtSrvr] INFO     -- Node 2: DICT: remove lock by failed node 4 for NodeRestart
2024-09-23 03:58:08 [MgmtSrvr] INFO     -- Node 2: DICT: unlocked by node 4 for NodeRestart
2024-09-23 03:58:08 [MgmtSrvr] ALERT    -- Node 5: Node 4 Disconnected
2024-09-23 03:58:08 [MgmtSrvr] ALERT    -- Node 2: Node 4 Disconnected
2024-09-23 03:58:08 [MgmtSrvr] INFO     -- Node 2: Local checkpoint 2561 started. Keep GCI = 5043590 oldest restorable GCI = 5043590
2024-09-23 03:58:11 [MgmtSrvr] INFO     -- Node 3: Communication to Node 4 opened
2024-09-23 03:58:12 [MgmtSrvr] INFO     -- Node 5: Communication to Node 4 opened
2024-09-23 03:58:12 [MgmtSrvr] INFO     -- Node 2: Communication to Node 4 opened
2024-09-23 03:58:12 [MgmtSrvr] INFO     -- Node 2: LDM(0): Completed LCP, #frags = 86 #records = 134, #bytes = 9212
2024-09-23 03:58:12 [MgmtSrvr] INFO     -- Node 3: LDM(0): Completed LCP, #frags = 86 #records = 134, #bytes = 9212
2024-09-23 03:58:12 [MgmtSrvr] INFO     -- Node 5: LDM(0): Completed LCP, #frags = 86 #records = 192, #bytes = 11508
2024-09-23 03:58:12 [MgmtSrvr] INFO     -- Node 2: Local checkpoint 2561 completed
2024-09-23 03:58:14 [MgmtSrvr] INFO     -- Node 3 disconnected in recv with errnum: 104 in state: 0
2024-09-23 03:58:14 [MgmtSrvr] INFO     -- Node 3: Node shutdown completed.
2024-09-23 03:58:14 [MgmtSrvr] INFO     -- Node 2 disconnected in recv with errnum: 104 in state: 0
2024-09-23 03:58:14 [MgmtSrvr] INFO     -- Node 5 disconnected in recv with errnum: 104 in state: 0
2024-09-23 03:58:14 [MgmtSrvr] INFO     -- Node 2: Node shutdown completed.
2024-09-23 03:58:14 [MgmtSrvr] ALERT    -- Node 1: Node 3 Disconnected
2024-09-23 03:58:14 [MgmtSrvr] INFO     -- Node 5: Node shutdown completed.
2024-09-23 03:58:14 [MgmtSrvr] ALERT    -- Node 1: Node 2 Disconnected
2024-09-23 03:58:14 [MgmtSrvr] ALERT    -- Node 1: Node 5 Disconnected
2024-09-23 03:58:15 [MgmtSrvr] INFO     -- Nodeid 2 allocated for NDB at 192.192.10.9
2024-09-23 03:58:15 [MgmtSrvr] INFO     -- Node 1: Node 2 Connected
2024-09-23 03:58:15 [MgmtSrvr] INFO     -- Nodeid 3 allocated for NDB at 192.192.10.10
2024-09-23 03:58:15 [MgmtSrvr] INFO     -- Node 2: Buffering maximum epochs 100
2024-09-23 03:58:15 [MgmtSrvr] INFO     -- Node 1: Node 3 Connected
2024-09-23 03:58:15 [MgmtSrvr] INFO     -- Node 2: Start phase 0 completed (system restart)
2024-09-23 03:58:15 [MgmtSrvr] INFO     -- Node 2: Communication to Node 3 opened
2024-09-23 03:58:15 [MgmtSrvr] INFO     -- Node 2: Communication to Node 4 opened
2024-09-23 03:58:15 [MgmtSrvr] INFO     -- Node 2: Communication to Node 5 opened
2024-09-23 03:58:15 [MgmtSrvr] INFO     -- Node 2: Waiting 30 sec for nodes 3, 4 and 5 to connect, nodes [ all: 2, 3, 4 and 5 connected: 2 no-wait:  ]
2024-09-23 03:58:15 [MgmtSrvr] INFO     -- Alloc node id 4 rejected, no new president yet
2024-09-23 03:58:15 [MgmtSrvr] INFO     -- Nodeid 4 allocated for NDB at 192.192.10.11
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 3: Buffering maximum epochs 100
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 3: Start phase 0 completed (system restart)
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 3: Communication to Node 2 opened
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 3: Communication to Node 4 opened
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 3: Communication to Node 5 opened
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 3: Waiting 30 sec for nodes 2, 4 and 5 to connect, nodes [ all: 2, 3, 4 and 5 connected: 3 no-wait:  ]
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 1: Node 4 Connected
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Alloc node id 5 rejected, no new president yet
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Nodeid 5 allocated for NDB at 192.192.10.12
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 3: Node 2 Connected
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 2: Node 3 Connected
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 2: Waiting 29 sec for nodes 4 and 5 to connect, nodes [ all: 2, 3, 4 and 5 connected: 2 and 3 no-wait:  ]
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 4: Buffering maximum epochs 100
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 1: Node 5 Connected
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 2 disconnected in recv with errnum: 104 in state: 0
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 3 disconnected in recv with errnum: 104 in state: 0
2024-09-23 03:58:16 [MgmtSrvr] ALERT    -- Node 1: Node 2 Disconnected
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 4 disconnected in recv with errnum: 104 in state: 0
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 3: Node shutdown completed.
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 2: Node shutdown completed.
2024-09-23 03:58:16 [MgmtSrvr] INFO     -- Node 5: Buffering maximum epochs 100
2024-09-23 03:58:17 [MgmtSrvr] INFO     -- Node 4: Node shutdown completed.
2024-09-23 03:58:17 [MgmtSrvr] ALERT    -- Node 1: Node 3 Disconnected
2024-09-23 03:58:17 [MgmtSrvr] ALERT    -- Node 1: Node 4 Disconnected
2024-09-23 03:58:17 [MgmtSrvr] INFO     -- Node 5: Start phase 0 completed (system restart)
```
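To follow the restart of node 4 in a log like the one above, it can help to pull out only the `NR Status` transitions and the point where the cluster shutdown begins. A minimal sketch, assuming the line layout shown above; the regular expressions and the function name are illustrative and not part of any NDB tool:

```python
import re

# Matches e.g. "... -- Node 2: NR Status: node=4,OLD=<state>,NEW=<state>"
NR_STATUS = re.compile(
    r"^(?P<ts>\S+ \S+) \[MgmtSrvr\] \w+\s+-- Node (?P<reporter>\d+): "
    r"NR Status: node=(?P<node>\d+),OLD=(?P<old>.*),NEW=(?P<new>.*)$"
)
SHUTDOWN = re.compile(
    r"^(?P<ts>\S+ \S+) \[MgmtSrvr\] \w+\s+-- Node \d+: Cluster shutdown initiated$"
)


def last_state_before_shutdown(lines, node, reporter):
    """Return (state, shutdown_ts): the last NEW state that `reporter` logged
    for `node` before the first 'Cluster shutdown initiated' line."""
    state = None
    for line in lines:
        line = line.rstrip()
        m = SHUTDOWN.match(line)
        if m:
            return state, m.group("ts")
        m = NR_STATUS.match(line)
        if m and int(m.group("node")) == node and int(m.group("reporter")) == reporter:
            state = m.group("new")
    return state, None


# Three lines copied from the log above.
sample = [
    "2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Build indexes ongoing,NEW=Synchronize start node with live nodes",
    "2024-09-23 03:58:06 [MgmtSrvr] INFO     -- Node 2: NR Status: node=4,OLD=Synchronize start node with live nodes,NEW=Wait LCP to ensure durability",
    "2024-09-23 03:58:07 [MgmtSrvr] INFO     -- Node 2: Cluster shutdown initiated",
]
print(last_state_before_shutdown(sample, node=4, reporter=2))
```

On these lines the function reports that node 4 was still in `Wait LCP to ensure durability` when the cluster shutdown began at 03:58:07, which matches the full log: the node restart had not yet completed.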

My cluster configuration is as follows:
```
[NDBD DEFAULT]
NoOfReplicas =2
DataMemory = 512M
IndexMemory = 64M

[NDB_MGMD]
NodeId=1
hostname =192.192.10.8
datadir =/var/lib/mysql-cluster

[NDBD]
NodeId =2
hostname =192.192.10.9
datadir =/usr/local/mysql-cluster/data
[NDBD]
NodeId =3
hostname =192.192.10.10
datadir =/usr/local/mysql-cluster/data
[NDBD]
NodeId =4
hostname =192.192.10.11
datadir =/usr/local/mysql-cluster/data
[NDBD]
NodeId =5
hostname =192.192.10.12
datadir =/usr/local/mysql-cluster/data

[mysqld]
NodeId =6
hostname =192.192.10.9
[mysqld]
NodeId =7
hostname =192.192.10.10
[mysqld]
NodeId =8
hostname =192.192.10.11
[mysqld]
NodeId =9
hostname =192.192.10.12
```
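With `NoOfReplicas =2` and four `[NDBD]` sections, this configuration yields two node groups. A small sketch of that arithmetic, assuming NDB's default behaviour of forming node groups from data nodes in ascending node ID order (no explicit `NodeGroup` setting appears in the configuration). The parser is a simplified illustration written for this example, not the one NDB uses:

```python
def parse_sections(text):
    """Parse config.ini-style text into a list of (section, {key: value}).
    Repeated section names such as [NDBD] are kept as separate entries."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1].strip().upper(), {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip().lower()] = value.strip()
    return sections


def node_groups(text):
    """Group data node IDs into node groups of NoOfReplicas nodes each."""
    sections = parse_sections(text)
    defaults = next(kv for name, kv in sections if name == "NDBD DEFAULT")
    replicas = int(defaults["noofreplicas"])
    ids = sorted(int(kv["nodeid"]) for name, kv in sections if name == "NDBD")
    if len(ids) % replicas:
        raise ValueError("data node count must be a multiple of NoOfReplicas")
    return [ids[i:i + replicas] for i in range(0, len(ids), replicas)]


# Data node part of the configuration above, trimmed to the relevant keys.
CONFIG = """
[NDBD DEFAULT]
NoOfReplicas =2

[NDBD]
NodeId =2
[NDBD]
NodeId =3
[NDBD]
NodeId =4
[NDBD]
NodeId =5
"""
print(node_groups(CONFIG))  # [[2, 3], [4, 5]]
```

Under that assumption nodes 2 and 3 form one node group and nodes 4 and 5 the other, which is consistent with the log, where node 4 hands over to node 5 before shutting down.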

Suggested fix:
Add a global lock that records whether any node is currently undergoing a restart, and consult it whenever a restart is requested, so that concurrent restart requests like the ones in this scenario can be handled safely.
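One way to picture the suggested lock is a small coordinator that tracks in-progress node restarts and refuses a cluster-wide restart until they finish. The sketch below only illustrates the idea in Python; the `RestartCoordinator` class and its methods are hypothetical and do not correspond to any NDB internals:

```python
import threading


class RestartCoordinator:
    """Hypothetical guard that keeps node restarts and full-cluster
    restarts from overlapping."""

    def __init__(self):
        self._lock = threading.Lock()
        self._restarting = set()       # node IDs with a restart in progress
        self._cluster_restart = False  # True while a full restart is running

    def begin_node_restart(self, node_id):
        """Register a node restart; refused during a cluster restart."""
        with self._lock:
            if self._cluster_restart:
                return False
            self._restarting.add(node_id)
            return True

    def end_node_restart(self, node_id):
        with self._lock:
            self._restarting.discard(node_id)

    def begin_cluster_restart(self):
        """Allow a full restart only when no node restart is in progress."""
        with self._lock:
            if self._restarting or self._cluster_restart:
                return False
            self._cluster_restart = True
            return True

    def end_cluster_restart(self):
        with self._lock:
            self._cluster_restart = False


coord = RestartCoordinator()
coord.begin_node_restart(4)
print(coord.begin_cluster_restart())  # False: node 4 is still restarting
coord.end_node_restart(4)
print(coord.begin_cluster_restart())  # True
```

Whether a refused request should be rejected outright or queued until the node restart finishes is a separate design choice that this sketch leaves open.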
[25 Sep 8:53] MySQL Verification Team
Hi,

Thanks for the report.
[25 Sep 13:35] MySQL Verification Team
Bug #116218 is marked as duplicate of this one