Bug #116423 Node started while node shutdown in NDB cluster
Submitted: 19 Oct 2024 17:17
Modified: 28 Oct 2024 14:47
Reporter: CunDi Fang
Status: Not a Bug
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 8.0.35-cluster MySQL Cluster Community S
OS: Any
Assigned to: MySQL Verification Team
CPU Architecture: Any

[19 Oct 2024 17:17] CunDi Fang
Description:
The error log contains the following entry:

Current byte-offset of file-pointer is: 1067

Time: Friday 18 October 2024 - 20:45:21
Status: Temporary error, restart node
Message: Node started while node shutdown in progress. Please wait until shutdown complete before starting node (Restart error)
Error: 6101
Error data:
Error object: NDBCNTR (Line: 2045) 0x00000002
Program: ndbd
Pid: 103978
Version: mysql-8.0.40 ndb-8.0.40
Trace file name: ndb_4_trace.log.1
Trace file path: /usr/local/mysql-cluster/data/ndb_4_trace.log.1 [t1..t1]
***EOM***

How to repeat:
Add a node, then try to add it again while it is still being removed. A similar error is raised. The problem appears to start in the underlying DICT.

```
2024-10-18 20:43:35 [MgmtSrvr] INFO     -- Node 2: DICT: index 34 stats auto-update done
2024-10-18 20:43:36 [MgmtSrvr] INFO     -- Node 4: Node shutdown initiated
2024-10-18 20:43:36 [MgmtSrvr] INFO     -- Node 2: DICT: index 36 stats auto-update starting
2024-10-18 20:43:36 [MgmtSrvr] INFO     -- Node 2: index 36 stats version 2: scan frag: created 10 samples
2024-10-18 20:43:36 [MgmtSrvr] INFO     -- Node 2: DICT: index 36 stats auto-update done
2024-10-18 20:43:37 [MgmtSrvr] INFO     -- Node 2: DICT: index 38 stats auto-update starting
2024-10-18 20:43:37 [MgmtSrvr] INFO     -- Node 2: index 38 stats version 2: scan frag: created 8 samples
2024-10-18 20:43:37 [MgmtSrvr] INFO     -- Node 2: DICT: index 38 stats auto-update done
2024-10-18 20:43:38 [MgmtSrvr] INFO     -- Node 2: DICT: index 42 stats auto-update starting
2024-10-18 20:43:38 [MgmtSrvr] INFO     -- Node 5: index 42 stats version 3: scan frag: created 7 samples
2024-10-18 20:43:38 [MgmtSrvr] INFO     -- Node 2: DICT: index 42 stats auto-update done
2024-10-18 20:43:39 [MgmtSrvr] INFO     -- Node 2: DICT: index 45 stats auto-update starting
2024-10-18 20:43:39 [MgmtSrvr] INFO     -- Node 5: index 45 stats version 3: scan frag: created 10 samples
2024-10-18 20:43:39 [MgmtSrvr] INFO     -- Node 2: DICT: index 45 stats auto-update done
2024-10-18 20:43:40 [MgmtSrvr] INFO     -- Node 2: DICT: index 46 stats auto-update starting
2024-10-18 20:43:40 [MgmtSrvr] INFO     -- Node 2: index 46 stats version 2: scan frag: created 5 samples
2024-10-18 20:43:41 [MgmtSrvr] INFO     -- Node 2: DICT: index 46 stats auto-update done
2024-10-18 20:43:42 [MgmtSrvr] INFO     -- Node 2: DICT: index 48 stats auto-update starting
2024-10-18 20:43:42 [MgmtSrvr] WARNING  -- Node 4: index 48 stats version 0: clean new: error 280 line 2122
2024-10-18 20:43:42 [MgmtSrvr] WARNING  -- Node 2: DICT: index 48 stats auto-update error: 280
2024-10-18 20:43:43 [MgmtSrvr] INFO     -- Node 2: DICT: index 50 stats auto-update starting
2024-10-18 20:43:43 [MgmtSrvr] WARNING  -- Node 4: index 50 stats version 0: clean new: error 280 line 2122
2024-10-18 20:43:43 [MgmtSrvr] WARNING  -- Node 2: DICT: index 50 stats auto-update error: 280
2024-10-18 20:43:44 [MgmtSrvr] INFO     -- Node 2: DICT: index 54 stats auto-update starting
2024-10-18 20:43:44 [MgmtSrvr] WARNING  -- Node 4: index 54 stats version 0: clean new: error 280 line 2122
2024-10-18 20:43:44 [MgmtSrvr] WARNING  -- Node 2: DICT: index 54 stats auto-update error: 280
2024-10-18 20:43:44 [MgmtSrvr] INFO     -- Node 4: Suma: initiate handover for shutdown with nodes 0000000000000000000000000000000000000004 GCI: 21499
2024-10-18 20:43:44 [MgmtSrvr] INFO     -- Node 4: Suma: handover to node 2 gci: 21499 buckets: 00000002 (2)
2024-10-18 20:43:45 [MgmtSrvr] INFO     -- Node 2: DICT: index 55 stats auto-update starting
2024-10-18 20:43:45 [MgmtSrvr] WARNING  -- Node 4: index 55 stats version 0: clean new: error 280 line 2122
2024-10-18 20:43:45 [MgmtSrvr] WARNING  -- Node 2: DICT: index 55 stats auto-update error: 280
2024-10-18 20:43:46 [MgmtSrvr] INFO     -- Node 2: DICT: index 57 stats auto-update starting
2024-10-18 20:43:46 [MgmtSrvr] WARNING  -- Node 4: index 57 stats version 0: clean new: error 280 line 2122
2024-10-18 20:43:46 [MgmtSrvr] WARNING  -- Node 2: DICT: index 57 stats auto-update error: 280
2024-10-18 20:43:47 [MgmtSrvr] INFO     -- Node 2: DICT: index 59 stats auto-update starting
2024-10-18 20:43:47 [MgmtSrvr] WARNING  -- Node 4: index 59 stats version 0: clean new: error 280 line 2122
2024-10-18 20:43:47 [MgmtSrvr] WARNING  -- Node 2: DICT: index 59 stats auto-update error: 280
2024-10-18 20:43:48 [MgmtSrvr] INFO     -- Node 2: DICT: index 62 stats auto-update starting
2024-10-18 20:43:48 [MgmtSrvr] WARNING  -- Node 4: index 62 stats version 0: clean new: error 280 line 2122
2024-10-18 20:43:48 [MgmtSrvr] WARNING  -- Node 2: DICT: index 62 stats auto-update error: 280
```
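To check whether the error 280 warnings only begin after the shutdown of node 4 was initiated, the cluster log can be scanned mechanically. The following is a minimal sketch, assuming the line layout shown in the excerpt above; the function name `summarize` and the returned tuple shapes are illustrative and not part of any MySQL tool.

```python
import re

# Matches cluster log lines in the layout of the excerpt above, e.g.
# 2024-10-18 20:43:36 [MgmtSrvr] INFO     -- Node 4: Node shutdown initiated
LINE = re.compile(r"^(\S+ \S+) \[MgmtSrvr\] (\w+)\s+-- Node (\d+): (.*)$")
STATS_ERROR = re.compile(r"index (\d+) stats auto-update error: (\d+)")


def summarize(log_text):
    """Return (first_shutdown, errors) for a cluster log excerpt.

    first_shutdown is (timestamp, node_id) of the first
    "Node shutdown initiated" line, or None if there is none.
    errors is a list of (timestamp, index_id, error_code) for every
    "stats auto-update error" line, in log order.
    """
    first_shutdown = None
    errors = []
    for raw in log_text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue  # skip lines that do not follow the assumed layout
        ts, _level, node, msg = m.groups()
        if first_shutdown is None and msg.startswith("Node shutdown initiated"):
            first_shutdown = (ts, int(node))
        err = STATS_ERROR.search(msg)
        if err:
            errors.append((ts, int(err.group(1)), int(err.group(2))))
    return first_shutdown, errors
```

Because the timestamps use a fixed `YYYY-MM-DD HH:MM:SS` layout, they can be compared as plain strings to decide whether an error occurred before or after the shutdown was initiated.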

I will upload all the log files.

Suggested fix:
Add a lock that serializes the process of nodes joining and being removed.

[28 Oct 2024 14:47] MySQL Verification Team
This is not a bug. The error message should be pretty clear: the node tried to start while its shutdown was still in progress, so the start was aborted, as expected.
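
For anyone scripting node restarts, one way to avoid error 6101 is to hold the start until the cluster log shows that the shutdown has finished. The following is a minimal sketch, not an official procedure. It assumes that the management server logs a "Node shutdown completed" line for the node (only "Node shutdown initiated" appears in the excerpt in this report, so the wording of the completion message is an assumption), and the helper name `shutdown_in_progress` is hypothetical.

```python
import re

# "initiated" is taken from the log excerpt in this report.
# "completed" is an ASSUMED message text; adjust it to what your log shows.
SHUTDOWN_EVENT = re.compile(r"-- Node (\d+): Node shutdown (initiated|completed)")


def shutdown_in_progress(log_lines, node_id):
    """Return True if the latest shutdown event logged for node_id is
    'initiated' with no later 'completed' event."""
    in_progress = False
    for line in log_lines:
        m = SHUTDOWN_EVENT.search(line)
        if m and int(m.group(1)) == node_id:
            # The most recent event for this node wins.
            in_progress = m.group(2) == "initiated"
    return in_progress
```

A restart script could call this on the current cluster log and start the data node only once it returns False for that node ID.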