Description:
The description in error log is:
Current byte-offset of file-pointer is: 2065
Time: Thursday 17 October 2024 - 17:16:20
Status: Ndbd file system error, restart node initial
Message: File not found (Ndbd file system inconsistency error, please report a bug)
Error: 2815
Error data: DBDICT: File system open failed during FsConnectRecord state 1. OS errno: 2
Error object: DBDICT (Line: 1318) 0x00000006
Program: ndbd
Pid: 100147
Version: mysql-8.0.40 ndb-8.0.40
Trace file name: ndb_3_trace.log.1
Trace file path: /usr/local/mysql-cluster/data/ndb_3_trace.log.1 [t1..t1]
***EOM***
Time: Thursday 17 October 2024 - 17:20:27
Status: Temporary error, restart node
Message: Another node failed during system restart, please investigate error(s) on other node(s) (Restart error)
Error: 2308
Error data: Node 2 disconnected in phase: 2
Error object: QMGR (Line: 3917) 0x00000002
Program: ndbd
Pid: 101045
Version: mysql-8.0.40 ndb-8.0.40
Trace file name: ndb_3_trace.log.2
Trace file path: /usr/local/mysql-cluster/data/ndb_3_trace.log.2 [t1..t1]
***EOM***
I should have a problem when trying to delete a node
How to repeat:
I am a 2 replica 2 slice cluster of 2 data nodes as node2, node3, node4, node5. then this error report is coming up in node3, node4, node5.
One log file describes it this way:
```
mysqld: Can't open shared library '/usr/local/mysql-cluster/lib/plugin/component_reference_cache.so' (errno: 3 /usr/local/mysql-cluster/lib/plugin/component_reference_cache.so: cannot open shared object file: No such file or directory)
mysqld: Cannot load component from specified URN: 'file://component_reference_cache'.
2024-10-17T17:19:00.588983Z 0 [Warning] [MY-011070] [Server] 'Disabling symbolic links using --skip-symbolic-links (or equivalent) is the default. Consider not using this option as it' is deprecated and will be removed in a future release.
2024-10-17T17:19:00.589124Z 0 [System] [MY-010116] [Server] /usr/local/bin/mysqld (mysqld 8.0.40-cluster) starting as process 101012
2024-10-17T17:19:00.589157Z 0 [ERROR] [MY-010338] [Server] Can't find error-message file '/usr/local/mysql-cluster/share/errmsg.sys'. Check error-message file location and 'lc-messages-dir' configuration directive.
2024-10-17T17:19:00.596558Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2024-10-17T17:19:00.627215Z 1 [ERROR] [MY-012574] [InnoDB] Unable to lock ./ibdata1 error: 11
2024-10-17T17:19:01.627386Z 1 [ERROR] [MY-012574] [InnoDB] Unable to lock ./ibdata1 error: 11
2024-10-17T17:19:02.627660Z 1 [ERROR] [MY-012574] [InnoDB] Unable to lock ./ibdata1 error: 11
2024-10-17T17:19:03.627860Z 1 [ERROR] [MY-012574] [InnoDB] Unable to lock ./ibdata1 error: 11
```
从manager上看是这样的,似乎当时在尝试数据同步:
```
2024-10-17 17:19:57 [MgmtSrvr] INFO -- Node 2: Node restart completed copy of distribution information to Node 5
2024-10-17 17:19:57 [MgmtSrvr] INFO -- Node 5: Starting to restore schema
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: Receive arbitrator node 1 [ticket=9f66559163fd997a]
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: Restore of schema complete
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 9 done (sys/def/8/ndb_index_stat_sample_x1)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 12 done (sys/def/11/PRIMARY)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 14 done (sys/def/13/PRIMARY)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 15 done (sys/def/13/column2)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 16 done (sys/def/13/column2$unique)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 17 done (sys/def/13/column3)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 18 done (sys/def/13/column3$unique)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 19 done (sys/def/13/column2_2)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 20 done (sys/def/13/column2_2$unique)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 22 done (sys/def/21/PRIMARY)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 23 done (sys/def/21/column4)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 24 done (sys/def/21/column4$unique)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 25 done (sys/def/21/column4_2)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 26 done (sys/def/21/column4_2$unique)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 29 done (sys/def/28/PRIMARY)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 30 done (sys/def/28/column3)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 31 done (sys/def/28/column3$unique)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 32 done (sys/def/28/column3_2)
2024-10-17 17:19:58 [MgmtSrvr] INFO -- Node 5: DICT: activate index 33 done (sys/def/28/column3_2$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 35 done (sys/def/34/PRIMARY)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 37 done (sys/def/36/PRIMARY)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 38 done (sys/def/36/column2)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 39 done (sys/def/36/column2$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 40 done (sys/def/36/column4)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 41 done (sys/def/36/column4$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 42 done (sys/def/36/column5)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 43 done (sys/def/36/column5$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 44 done (sys/def/36/column2_2)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 45 done (sys/def/36/column2_2$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 48 done (sys/def/47/PRIMARY)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 49 done (sys/def/47/column3)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 50 done (sys/def/47/column3$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 51 done (sys/def/47/column5)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 52 done (sys/def/47/column5$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 53 done (sys/def/47/column6)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 54 done (sys/def/47/column6$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 55 done (sys/def/47/column8)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 56 done (sys/def/47/column8$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 57 done (sys/def/47/column3_2)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 58 done (sys/def/47/column3_2$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 60 done (sys/def/59/PRIMARY)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 61 done (sys/def/59/column5)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 62 done (sys/def/59/column5$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 63 done (sys/def/59/column5_2)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 64 done (sys/def/59/column5_2$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 66 done (sys/def/65/PRIMARY)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 68 done (sys/def/67/PRIMARY)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 69 done (sys/def/67/column1)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 70 done (sys/def/67/column1$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 71 done (sys/def/67/column6)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 72 done (sys/def/67/column6$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 73 done (sys/def/67/column7)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 74 done (sys/def/67/column7$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 75 done (sys/def/67/column1_2)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 76 done (sys/def/67/column1_2$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 78 done (sys/def/77/PRIMARY)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 79 done (sys/def/77/column1)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 80 done (sys/def/77/column1$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 81 done (sys/def/77/column1_2)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 82 done (sys/def/77/column1_2$unique)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: DICT: activate index 83 done (sys/def/77/column3)
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 2: Node restart completed copy of dictionary information to Node 5
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 2: NR Status: node=5,OLD=Copy meta data to start node,NEW=Include node in LCP/GCP protocols
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 4: NR Status: node=5,OLD=All nodes permitted us,NEW=Include node in LCP/GCP protocols
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 3: NR Status: node=5,OLD=All nodes permitted us,NEW=Include node in LCP/GCP protocols
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 2: NR Status: node=5,OLD=Include node in LCP/GCP protocols,NEW=Restore fragments ongoing
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: Node restart starting to copy the fragments to Node 5
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: Restore Database Off-line Starting on node 5
2024-10-17 17:19:59 [MgmtSrvr] INFO -- Node 5: Node: 5 StartLog: [GCI Keep: 1 LastCompleted: 60 NewestRestorable: 66]
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 5: LDM instance 0: Restored LCP : 86 fragments, 0 rows, 341 millis, 0 rows/s
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 5: Node Start completed restore of LCP id: 1
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 2: NR Status: node=5,OLD=Restore fragments ongoing,NEW=Undo Disk data ongoing
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 2: NR Status: node=5,OLD=Undo Disk data ongoing,NEW=Execute REDO logs ongoing
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 2: NR Status: node=5,OLD=Execute REDO logs ongoing,NEW=Build indexes ongoing
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 5: Restore Database Off-line Completed on node 5
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 5: Bring Database On-line Starting on node 5
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 2: NR Status: node=5,OLD=Build indexes ongoing,NEW=Synchronize start node with live nodes
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 3: NR Status: node=5,OLD=Include node in LCP/GCP protocols,NEW=Synchronize start node with live nodes
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 4: NR Status: node=5,OLD=Include node in LCP/GCP protocols,NEW=Synchronize start node with live nodes
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 5: Node restart completed copying the fragments to Node 5
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 5: Bring Database On-line Completed on node 5
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 5: Starting REDO logging on node 5
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 5: LDM(0): CopyFrag complete. 160 frags, +968/-69587 rows, 4024772 bytes/277 ms 14529862 bytes/s.
2024-10-17 17:20:00 [MgmtSrvr] INFO -- Node 5: LDM(0): Completed LCP, #frags = 86 #records = 979, #bytes = 73776
2024-10-17 17:20:01 [MgmtSrvr] INFO -- Node 2: Make On-line Database recoverable by waiting for LCP Starting on node 5, LCP id 2
2024-10-17 17:20:01 [MgmtSrvr] INFO -- Node 2: NR Status: node=5,OLD=Synchronize start node with live nodes,NEW=Wait LCP to ensure durability
2024-10-17 17:20:01 [MgmtSrvr] INFO -- Node 3: NR Status: node=5,OLD=Synchronize start node with live nodes,NEW=Wait LCP to ensure durability
2024-10-17 17:20:01 [MgmtSrvr] INFO -- Node 4: NR Status: node=5,OLD=Synchronize start node with live nodes,NEW=Wait LCP to ensure durability
2024-10-17 17:20:03 [MgmtSrvr] INFO -- Node 2: Local checkpoint 2 started. Keep GCI = 1 oldest restorable GCI = 2
2024-10-17 17:20:05 [MgmtSrvr] INFO -- Node 2: LDM(0): Completed LCP, #frags = 86 #records = 957, #bytes = 73760
2024-10-17 17:20:05 [MgmtSrvr] INFO -- Node 5: LDM(0): Completed LCP, #frags = 86 #records = 979, #bytes = 73776
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 3: LDM(0): Completed LCP, #frags = 86 #records = 979, #bytes = 73776
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 4: LDM(0): Completed LCP, #frags = 86 #records = 957, #bytes = 73760
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 2: NR Status: node=5,OLD=Wait LCP to ensure durability,NEW=Wait handover of subscriptions
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 2: Local checkpoint 2 completed
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 3: NR Status: node=5,OLD=Wait LCP to ensure durability,NEW=Restart completed
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 4: NR Status: node=5,OLD=Wait LCP to ensure durability,NEW=Restart completed
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 5: Suma: asking node 2 to recreate subscriptions on me
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 5: Suma: node 2 has completed restoring me
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 5: Start phase 5 completed (node restart)
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 5: Start phase 6 completed (node restart)
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 5: Start phase 7 completed (node restart)
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 5: DICT: enable FK 27 done (13/21/test1_fk_1)
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 5: DICT: enable FK 46 done (34/36/test4_fk_1)
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 5: DICT: enable FK 84 done (13/77/test9_fk_1)
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 5: Start phase 8 completed (node restart)
2024-10-17 17:20:07 [MgmtSrvr] INFO -- Node 5: Node 1: API mysql-8.0.40 ndb-8.0.40
2024-10-17 17:20:09 [MgmtSrvr] INFO -- Node 5: Start phase 9 completed (node restart)
2024-10-17 17:20:09 [MgmtSrvr] INFO -- Node 5: Start phase 50 completed (node restart)
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Communication to Node 6 opened
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Communication to Node 7 opened
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Communication to Node 8 opened
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Communication to Node 9 opened
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Node 6 Connected
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Node 7 Connected
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Node 8 Connected
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Node 9 Connected
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Node 9: API mysql-8.0.40 ndb-8.0.40
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Node 6: API mysql-8.0.40 ndb-8.0.40
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Node 7: API mysql-8.0.40 ndb-8.0.40
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Node 8: API mysql-8.0.40 ndb-8.0.40
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Suma: initiate handover for startup with nodes 0000000000000000000000000000000000000008 GCI: 75
2024-10-17 17:20:10 [MgmtSrvr] INFO -- Node 5: Suma: handover from node 3 gci: 75 buckets: 00000002 (2)
2024-10-17 17:20:15 [MgmtSrvr] INFO -- Node 2: NR Status: node=5,OLD=Wait handover of subscriptions,NEW=Restart completed
2024-10-17 17:20:15 [MgmtSrvr] INFO -- Node 5: Start phase 101 completed (node restart)
2024-10-17 17:20:15 [MgmtSrvr] INFO -- Node 5: Started (mysql-8.0.40 ndb-8.0.40)
2024-10-17 17:20:15 [MgmtSrvr] INFO -- Node 2: DICT: unlocked by node 5 for NodeRestart
2024-10-17 17:20:18 [MgmtSrvr] INFO -- Node 2: Cluster shutdown initiated
2024-10-17 17:20:18 [MgmtSrvr] INFO -- Node 3: Cluster shutdown initiated
2024-10-17 17:20:18 [MgmtSrvr] INFO -- Node 4: Cluster shutdown initiated
2024-10-17 17:20:18 [MgmtSrvr] INFO -- Node 5: Cluster shutdown initiated
2024-10-17 17:20:24 [MgmtSrvr] INFO -- Transporter 3 to node 4 disconnected in state: 0
2024-10-17 17:20:25 [MgmtSrvr] INFO -- Transporter 4 to node 5 disconnected in state: 0
2024-10-17 17:20:25 [MgmtSrvr] INFO -- Transporter 2 to node 3 disconnected in state: 0
2024-10-17 17:20:25 [MgmtSrvr] INFO -- Node 5: Node shutdown completed.
2024-10-17 17:20:25 [MgmtSrvr] ALERT -- Node 1: Node 3 Disconnected
2024-10-17 17:20:25 [MgmtSrvr] ALERT -- Node 1: Node 4 Disconnected
2024-10-17 17:20:25 [MgmtSrvr] ALERT -- Node 1: Node 5 Disconnected
2024-10-17 17:20:25 [MgmtSrvr] INFO -- Node 4: Node shutdown completed.
2024-10-17 17:20:25 [MgmtSrvr] INFO -- Node 3: Node shutdown completed.
```
I'll make all the relevant log files available, which will contain more details.
Suggested fix:
Please check the data synchronization process due to node related actions.