Bug #116425 SUMA switch in NDB cluster
Submitted: 19 Oct 2024 17:59  Modified: 13 Nov 2024 11:31
Reporter: CunDi Fang
Status: Not a Bug
Category: MySQL Cluster: Cluster (NDB) storage engine  Severity: S3 (Non-critical)
Version: 8.0.35-cluster MySQL Cluster Community S  OS: Any
Assigned to:  CPU Architecture: Any

[19 Oct 2024 17:59] CunDi Fang
Description:
When SUMA switches ownership of bucket 1 to another node, the bucket state changes to 100, followed immediately by "shutdown handover" and "Shutdown initiated" messages. The node then terminates and leaves the cluster, triggering failure handling on the other nodes.

How to repeat:
I found this while checking the logs of a four-data-node cluster (2 replicas, 2 node groups) consisting of node2, node3, node4, and node5. While the nodes were restarting, a strange failure occurred; fortunately the cluster completed the failure handling successfully.
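For reference, a setup like the one described corresponds to roughly the following config.ini sketch. The management address, DataDir, and node ids are taken from the logs below; the data node hostnames are assumptions, not values from the report:

```ini
# Sketch of a 4-data-node, 2-replica NDB Cluster configuration.
# Data node HostName values are hypothetical placeholders.
[ndbd default]
NoOfReplicas=2
DataDir=/usr/local/mysql-cluster/data

[ndb_mgmd]
NodeId=1
HostName=192.8.0.8

[ndbd]
NodeId=2

[ndbd]
NodeId=3

[ndbd]
NodeId=4

[ndbd]
NodeId=5

[mysqld]
```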

Nodes 2, 3, and 5 performed the failure handling:
```
2024-10-18 21:37:07 [ndbd] INFO     -- Node 2 has completed its restart
2024-10-18 21:37:07 [ndbd] INFO     -- Node 3 has completed its restart
2024-10-18 21:37:07 [ndbd] INFO     -- Node 5 has completed its restart
2024-10-18 21:37:07 [ndbd] INFO     -- Node 4 has completed its restart
2024-10-18 21:37:08 [ndbd] INFO     -- Allocate event buffering page chunk in SUMA, 16 pages, first page ref = 7060
2024-10-18 21:37:37 [ndbd] INFO     -- prepare to takeover bucket: 1
2024-10-18 21:37:41 [ndbd] INFO     -- 22826/0 (22825/4294967295) switchover complete bucket 1 state: 200
2024-10-18 21:37:41 [ndbd] INFO     -- shutdown handover takeover
2024-10-18 21:37:43 [ndbd] INFO     -- Transporter 3 to node 4 disconnected in state: 0
2024-10-18 21:37:43 [ndbd] INFO     -- findNeighbours from: 5587 old (left: 5 right: 3) new (5 3)
2024-10-18 21:37:43 [ndbd] ALERT    -- Arbitration check won - node group majority
2024-10-18 21:37:43 [ndbd] INFO     -- President restarts arbitration thread [state=6]
2024-10-18 21:37:43 [ndbd] INFO     -- NR Status: node=4,OLD=Initial state,NEW=Node failed, fail handling ongoing
2024-10-18 21:37:43 [ndbd] INFO     -- DBTC 0: Started failure handling for node 4
2024-10-18 21:37:43 [ndbd] INFO     -- DBTC 0: Starting take over of node 4
2024-10-18 21:37:43 [ndbd] INFO     -- DBTC 0: Step NF_CHECK_SCAN completed, failure handling for node 4 waiting for NF_TAKEOVER, NF_CHECK_TRANSACTION, NF_BLOCK_HANDLE.
2024-10-18 21:37:43 [ndbd] INFO     -- DBTC 0: Step NF_BLOCK_HANDLE completed, failure handling for node 4 waiting for NF_TAKEOVER, NF_CHECK_TRANSACTION.
2024-10-18 21:37:43 [ndbd] INFO     -- DBTC 0: GCP completion 22826/10 waiting for node failure handling (1) to complete. Seizing record for GCP.
2024-10-18 21:37:43 [ndbd] INFO     -- DBTC 0: Step NF_CHECK_TRANSACTION completed, failure handling for node 4 waiting for NF_TAKEOVER.
2024-10-18 21:37:43 [ndbd] INFO     -- DBTC 0: Completed take over of failed node 4
2024-10-18 21:37:43 [ndbd] INFO     -- DBTC 0: Step NF_TAKEOVER completed, failure handling for node 4 complete.
2024-10-18 21:37:43 [ndbd] INFO     -- DBTC 0: Completing GCP 22826/10 on node failure takeover completion.
2024-10-18 21:37:43 [ndbd] INFO     -- NR Status: node=4,OLD=Node failed, fail handling ongoing,NEW=Node failure handling complete
2024-10-18 21:37:43 [ndbd] INFO     -- Node 4 has completed node fail handling
2024-10-18 21:37:43 [ndbd] INFO     -- Alloc node id for node 4 succeeded
2024-10-18 21:37:43 [ndbd] INFO     -- NR Status: node=4,OLD=Node failure handling complete,NEW=Allocated node id
2024-10-18 21:37:44 [ndbd] INFO     -- Adjusting disk write speed bounds due to : Node restart ongoing
2024-10-18 21:37:50 [ndbd] INFO     -- Cluster shutdown durable gci : 22830
2024-10-18 21:37:51 [ndbd] INFO     -- The data node run-time environment has been stopped
2024-10-18 21:37:51 [ndbd] INFO     -- Shutdown initiated
2024-10-18 21:37:51 [ndbd] INFO     -- Shutdown completed - exiting
2024-10-18 21:37:51 [ndbd] INFO     -- Angel shutting down
2024-10-18 21:37:51 [ndbd] INFO     -- Node 2: Node shutdown completed.
2024-10-18 21:37:51 [ndbd] INFO     -- Angel pid: 119131 started child: 119132
2024-10-18 21:37:51 [ndbd] INFO     -- Normal start of data node using checkpoint and log info if existing
2024-10-18 21:37:51 [ndbd] INFO     -- Configuration fetched from '192.8.0.8:1186', generation: 1
2024-10-18 21:37:51 [ndbd] INFO     -- Changing directory to '/usr/local/mysql-cluster/data'
```

The failure occurred on node 4:
```
2024-10-18 21:37:04 [ndbd] INFO     -- LDM(0): index id 67 rebuild done
2024-10-18 21:37:04 [ndbd] INFO     -- LDM(0): index id 69 rebuild done
2024-10-18 21:37:04 [ndbd] INFO     -- LDM(0): index id 71 rebuild done
2024-10-18 21:37:04 [ndbd] INFO     -- LDM(0): index id 75 rebuild done
2024-10-18 21:37:04 [ndbd] INFO     -- LDM(0): index id 76 rebuild done
2024-10-18 21:37:04 [ndbd] INFO     -- LDM(0): index id 78 rebuild done
2024-10-18 21:37:04 [ndbd] INFO     -- LDM(0): index id 80 rebuild done
2024-10-18 21:37:04 [ndbd] INFO     -- LDM(0): We have completed restoring our fragments and executed REDO log and rebuilt ordered indexes
2024-10-18 21:37:04 [ndbd] INFO     -- Start phase 4 completed
2024-10-18 21:37:04 [ndbd] INFO     -- Phase 4 restored all fragments from local disk up to a consistent global checkpoint id
2024-10-18 21:37:04 [ndbd] INFO     -- Start NDB start phase 4
2024-10-18 21:37:04 [ndbd] INFO     -- NDB start phase 4 completed
2024-10-18 21:37:04 [ndbd] INFO     -- During cluster start/restart only master runs phase 5 of NDB start phases
2024-10-18 21:37:04 [ndbd] INFO     -- Report to master node our state and wait for master
2024-10-18 21:37:04 [ndbd] INFO     -- Master node 2 have reached completion of NDB start phase 5
2024-10-18 21:37:04 [ndbd] INFO     -- Start NDB start phase 6
2024-10-18 21:37:04 [ndbd] INFO     -- NDB start phase 6 completed
2024-10-18 21:37:04 [ndbd] INFO     -- Start phase 5 completed
2024-10-18 21:37:04 [ndbd] INFO     -- Phase 5 waited for local checkpoint to complete
2024-10-18 21:37:04 [ndbd] INFO     -- Start phase 6 completed
2024-10-18 21:37:04 [ndbd] INFO     -- Phase 6 updated blocks about that we've now reached the started state.
2024-10-18 21:37:04 [ndbd] INFO     -- Activating bucket 1 in SUMA
2024-10-18 21:37:04 [ndbd] INFO     -- Start phase 7 completed
2024-10-18 21:37:04 [ndbd] INFO     -- Phase 7 mainly activated the asynchronous change events process, and some other background processes
2024-10-18 21:37:04 [ndbd] INFO     -- Start NDB start phase 7
2024-10-18 21:37:04 [ndbd] INFO     -- Foreign Key enabling Starting
2024-10-18 21:37:05 [ndbd] INFO     -- Foreign key enabling Completed
2024-10-18 21:37:05 [ndbd] INFO     -- NDB start phase 7 completed
2024-10-18 21:37:05 [ndbd] INFO     -- Start phase 8 completed
2024-10-18 21:37:05 [ndbd] INFO     -- Phase 8 enabled foreign keys and waited forall nodes to complete start up to this point
2024-10-18 21:37:05 [ndbd] INFO     -- Prepare arbitrator node 1 [ticket=d112585f6a11580b]
2024-10-18 21:37:07 [ndbd] INFO     -- Restart complete, updated local sysfile
2024-10-18 21:37:07 [ndbd] INFO     -- Start phase 9 completed
2024-10-18 21:37:07 [ndbd] INFO     -- Phase 9 enabled APIs to start connecting
2024-10-18 21:37:07 [ndbd] INFO     -- Start phase 50 completed
2024-10-18 21:37:07 [ndbd] INFO     -- Start phase 101 completed
2024-10-18 21:37:07 [ndbd] INFO     -- Phase 101 was used by SUMA to take over responsibility for sending some of the asynchronous change events
2024-10-18 21:37:07 [ndbd] INFO     -- Node started
2024-10-18 21:37:07 [ndbd] INFO     -- Node 2 has completed its restart
2024-10-18 21:37:07 [ndbd] INFO     -- Node 4 has completed its restart
2024-10-18 21:37:07 [ndbd] INFO     -- Node 3 has completed its restart
2024-10-18 21:37:07 [ndbd] INFO     -- Node 5 has completed its restart
2024-10-18 21:37:08 [ndbd] INFO     -- Allocate event buffering page chunk in SUMA, 16 pages, first page ref = 7060
2024-10-18 21:37:37 [ndbd] INFO     -- Suma: handover to node 2 gci: 22826 buckets: 00000002 (2)
2024-10-18 21:37:41 [ndbd] INFO     -- 22826/0 (22825/4294967295) switchover complete bucket 1 state: 100
2024-10-18 21:37:41 [ndbd] INFO     -- shutdown handover
2024-10-18 21:37:42 [ndbd] INFO     -- The data node run-time environment has been stopped
2024-10-18 21:37:42 [ndbd] INFO     -- Shutdown initiated
```

I will upload all the log files I have.

Suggested fix:
SUMA appears to have had some sort of issue when switching bucket 1, which caused node 4 to shut down. SUMA is responsible for managing asynchronous change events in the cluster, so there may have been an exception in asynchronous event delivery or in the bucket switchover.
[23 Oct 2024 23:22] MySQL Verification Team
Hi,

I am not 100% sure, but it looks to me like SUMA is handling the shutdown rather than causing it. In any case, I need a way to reproduce this; I am having trouble reproducing the scenario I see in your log.
[24 Oct 2024 8:34] CunDi Fang
I double-checked the logs and realized that I wasn't performing any node joins or anything else that would change the cluster state; I was running SQL transactions at the time. The statements were as follows:

```
time: 2024-10-18T21:36:57.534809Z
message to 1: SHOW TABLES in mytest4;
time: 2024-10-18T21:36:57.534810Z
message to 2: SHOW TABLES in mytest4;
time: 2024-10-18T21:36:57.534818Z
message to 3: SHOW TABLES in mytest4;
time: 2024-10-18T21:36:57.534831Z
message to 4: SHOW TABLES in mytest4;
time: 2024-10-18T21:36:57.535590Z
received buf_4: test0test1test2test3test4test5test6test7test8test9
time: 2024-10-18T21:36:57.535672Z
received buf_1: test0test1test2test3test4test5test6test7test8test9
time: 2024-10-18T21:36:57.535679Z
received buf_2: test0test1test2test3test4test5test6test7test8test9
time: 2024-10-18T21:36:57.535929Z
received buf_3: test0test1test2test3test4test5test6test7test8test9
metadata consisty right
 message to metadata: SELECT * FROM ndbinfo.config_nodes;
time: 2024-10-18T21:36:57.535994Z
message to 1: SELECT * FROM ndbinfo.config_nodes;
time: 2024-10-18T21:36:57.535997Z
message to 2: SELECT * FROM ndbinfo.config_nodes;
time: 2024-10-18T21:36:57.536003Z
message to 3: SELECT * FROM ndbinfo.config_nodes;
time: 2024-10-18T21:36:57.536029Z
message to 4: SELECT * FROM ndbinfo.config_nodes;
time: 2024-10-18T21:36:57.536984Z
received buf_3: res_got_len=0 error? or fail?
time: 2024-10-18T21:36:57.537021Z
received buf_4: res_got_len=0 error? or fail?
time: 2024-10-18T21:36:57.537091Z
received buf_1: res_got_len=0 error? or fail?
time: 2024-10-18T21:36:57.537275Z
received buf_2: res_got_len=0 error? or fail?
```

Starting with this one SQL statement, all subsequent statements failed to execute properly, so the failure may actually have occurred somewhat earlier. To help you test, I will provide the SQL error logs for all nodes.
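To correlate the two sides of the SUMA handover, it can help to merge the per-node ndbd logs into one timeline. The sketch below is self-contained (it recreates a few lines from the logs above as sample data); the `ndb_<id>_out.log` file names are an assumption, so adjust the paths to your data directories:

```shell
set -eu

# Sample data standing in for real log files, so the snippet runs anywhere;
# in practice, point the grep at your actual ndb_*_out.log files.
mkdir -p /tmp/suma_demo
cat > /tmp/suma_demo/ndb_4_out.log <<'EOF'
2024-10-18 21:37:37 [ndbd] INFO     -- Suma: handover to node 2 gci: 22826 buckets: 00000002 (2)
2024-10-18 21:37:41 [ndbd] INFO     -- 22826/0 (22825/4294967295) switchover complete bucket 1 state: 100
2024-10-18 21:37:41 [ndbd] INFO     -- shutdown handover
EOF
cat > /tmp/suma_demo/ndb_2_out.log <<'EOF'
2024-10-18 21:37:37 [ndbd] INFO     -- prepare to takeover bucket: 1
2024-10-18 21:37:41 [ndbd] INFO     -- 22826/0 (22825/4294967295) switchover complete bucket 1 state: 200
EOF

# Tag each SUMA/handover line with its node id, then sort on the
# timestamp fields so both sides of the handover interleave.
grep -H -E 'Suma|bucket|handover' /tmp/suma_demo/ndb_*_out.log \
  | sed 's#.*/ndb_\([0-9]*\)_out.log:#node\1 #' \
  | sort -k2,3
```

With the real logs in place, this shows node 4 giving up bucket 1 (state 100) and node 2 taking it over (state 200) in a single ordered view.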
[13 Nov 2024 11:31] MySQL Verification Team
Hi,

Discussed with NDB team and they say it is not a bug. Logging could be improved but this is normal behavior.