Bug #10893 | Cluster hangs on initial startup | ||
---|---|---|---|
Submitted: | 26 May 2005 21:55 | Modified: | 14 Jun 2005 4:41 |
Reporter: | Jonathan Miller | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S2 (Serious) |
Version: | 5.0, 5.1 | OS: | Linux (Linux) |
Assigned to: | Stewart Smith | CPU Architecture: | Any |
[26 May 2005 21:55]
Jonathan Miller
[27 May 2005 5:42]
Jonas Oreland
Hi, they will not actually hang, but wait for quite a long time. Then they will terminate with an error message saying that they failed to contact the management server. I.e., since you have modified the default port that the mgmd is running on, you have to tell the ndbd as well.
[31 May 2005 16:27]
Jonathan Miller
So I can repeat this every time on initial startup, but I also found a way around it.

To recreate: Compile latest clone. Use two systems. Configure as follows:
System1: ndb_mgmd, ndbd, ndbd
System2: ndbd, ndbd

My compiled version gets deployed under a directory called builds. Under builds on both systems I create a run directory. From the run directory on system1 I do the following without much of any think time (delay):

ndb08:~/jmiller/builds/run> ../libexec/ndb_mgmd -f config.ini
ndb08:~/jmiller/builds/run> ../libexec/ndbd -c ndb08:14000 --initial
ndb08:~/jmiller/builds/run> ../libexec/ndbd -c ndb08:14000 --initial

From system2 under the run directory without think time:

ndb09 run]$ ../libexec/ndbd -c ndb08:14000 --initial
ndb09 run]$ ../libexec/ndbd -c ndb08:14000 --initial

The system is now hung with the two data nodes on system1 stuck in phase 1 and the two on system2 stuck in phase 0.

ndb_mgm> all status
Node 2: starting (Phase 1) (Version 5.1.0)
Node 3: starting (Phase 0) (Version 5.1.0)
Node 4: starting (Phase 1) (Version 5.1.0)
Node 5: starting (Phase 0) (Version 5.1.0)

I tried starting them in a different pattern: system2 system1 system2 system1

But without a delay they were still stuck:

ndb_mgm> all status
Node 2: starting (Phase 1) (Version 5.1.0)
Node 3: starting (Phase 1) (Version 5.1.0)
Node 4: starting (Phase 0) (Version 5.1.0)
Node 5: starting (Phase 0) (Version 5.1.0)

After the second time, I tried again: system1 system2 system1 system2

This time I watched the cluster.log to ensure that each node had an open connection to the others before starting the next. The cluster made it through all phases.

Cluster.log:
2005-05-31 17:43:12 [MgmSrvr] INFO -- NDB Cluster Management Server. Version 5.1.0 (rowrepl_drop1)
2005-05-31 17:43:12 [MgmSrvr] INFO -- Id: 1, Command port: 14000
2005-05-31 17:43:39 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 reserved for ip 10.100.1.93, m_reserved_nodes 0000000000000006.
2005-05-31 17:43:40 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-05-31 17:43:40 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 freed, m_reserved_nodes 0000000000000002.
2005-05-31 17:43:40 [MgmSrvr] INFO -- Mgmt server state: nodeid 4 reserved for ip 10.100.1.93, m_reserved_nodes 0000000000000012.
2005-05-31 17:43:41 [MgmSrvr] INFO -- Node 2: Start phase 0 completed
2005-05-31 17:43:41 [MgmSrvr] INFO -- Node 2: Communication to Node 3 opened
2005-05-31 17:43:41 [MgmSrvr] INFO -- Node 2: Communication to Node 4 opened
2005-05-31 17:43:41 [MgmSrvr] INFO -- Node 2: Communication to Node 5 opened
2005-05-31 17:43:41 [MgmSrvr] INFO -- Node 1: Node 4 Connected
2005-05-31 17:43:42 [MgmSrvr] INFO -- Node 4: Start phase 0 completed
2005-05-31 17:43:42 [MgmSrvr] INFO -- Node 4: Communication to Node 2 opened
2005-05-31 17:43:42 [MgmSrvr] INFO -- Node 4: Communication to Node 3 opened
2005-05-31 17:43:42 [MgmSrvr] INFO -- Node 4: Communication to Node 5 opened
2005-05-31 17:43:42 [MgmSrvr] INFO -- Node 2: Node 4 Connected
2005-05-31 17:43:42 [MgmSrvr] INFO -- Node 4: Node 2 Connected
2005-05-31 17:43:42 [MgmSrvr] INFO -- Mgmt server state: nodeid 4 freed, m_r
2005-05-31 17:44:11 [MgmSrvr] INFO -- Node 2: Start phase 1 completed
2005-05-31 17:44:11 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 reserved for ip 10.100.1.94, m_reserved_nodes 000000000000000a.
2005-05-31 17:44:12 [MgmSrvr] INFO -- Node 4: CM_REGCONF president = 2, own Node = 4, our dynamic id = 2
2005-05-31 17:44:12 [MgmSrvr] INFO -- Node 2: Node 4: API version 5.1.0
2005-05-31 17:44:12 [MgmSrvr] INFO -- Node 4: Node 2: API version 5.1.0
2005-05-31 17:44:12 [MgmSrvr] INFO -- Node 4: Start phase 1 completed
2005-05-31 17:44:12 [MgmSrvr] INFO -- Mgmt server state: nodeid 5 reserved for ip 10.100.1.94, m_reserved_nodes 000000000000002a.
2005-05-31 17:44:13 [MgmSrvr] INFO -- Node 1: Node 3 Connected
2005-05-31 17:44:13 [MgmSrvr] INFO -- Node 3: Start phase 0 completed
2005-05-31 17:44:13 [MgmSrvr] INFO -- Node 3: Communication to Node 2 opened
2005-05-31 17:44:13 [MgmSrvr] INFO -- Node 3: Communication to Node 4 opened
2005-05-31 17:44:13 [MgmSrvr] INFO -- Node 3: Communication to Node 5 opened
2005-05-31 17:44:13 [MgmSrvr] INFO -- Node 3: Node 2 Connected
2005-05-31 17:44:13 [MgmSrvr] INFO -- Node 2: Node 3 Connected
2005-05-31 17:44:13 [MgmSrvr] INFO -- Node 3: CM_REGCONF president = 2, own Node = 3, our dynamic id = 3
2005-05-31 17:44:13 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 freed, m_reserved_nodes 0000000000000022.
2005-05-31 17:44:14 [MgmSrvr] INFO -- Node 1: Node 5 Connected
2005-05-31 17:44:14 [MgmSrvr] INFO -- Node 5: Start phase 0 completed
2005-05-31 17:44:14 [MgmSrvr] INFO -- Node 5: Communication to Node 3 opened
2005-05-31 17:44:14 [MgmSrvr] INFO -- Node 5: Communication to Node 4 opened
2005-05-31 17:44:14 [MgmSrvr] INFO -- Node 4: Node 5 Connected
2005-05-31 17:44:14 [MgmSrvr] INFO -- Node 5: Node 2 Connected
2005-05-31 17:44:14 [MgmSrvr] INFO -- Node 5: Node 3 Connected
2005-05-31 17:44:14 [MgmSrvr] INFO -- Node 5: Node 4 Connected
2005-05-31 17:44:14 [MgmSrvr] INFO -- Node 2: Node 5 Connected
2005-05-31 17:44:14 [MgmSrvr] INFO -- Node 3: Node 5 Connected
2005-05-31 17:44:14 [MgmSrvr] INFO -- Mgmt server state: nodeid 5 freed, m_reserved_nodes 0000000000000002.
2005-05-31 17:44:14 [MgmSrvr] INFO -- Node 1: Node 4 Connected
2005-05-31 17:44:14 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-05-31 17:44:15 [MgmSrvr] INFO -- Node 1: Node 5 Connected
2005-05-31 17:44:15 [MgmSrvr] INFO -- Node 1: Node 3 Connected
2005-05-31 17:45:38 [MgmSrvr] INFO -- Shutting down server...
2005-05-31 17:45:38 [MgmSrvr] INFO -- Shutdown complete
2005-05-31 17:45:40 [MgmSrvr] INFO -- Id: 1, Command port: 14000
2005-05-31 17:45:47 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 reserved for ip 10.100.1.93, m_reserved_nodes 0000000000000006.
2005-05-31 17:45:48 [MgmSrvr] INFO -- Mgmt server state: nodeid 4 reserved for ip 10.100.1.93, m_reserved_nodes 0000000000000016.
2005-05-31 17:45:49 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-05-31 17:45:49 [MgmSrvr] INFO -- Node 2: Start phase 0 completed
2005-05-31 17:45:49 [MgmSrvr] INFO -- Node 2: Communication to Node 3 opened
2005-05-31 17:45:49 [MgmSrvr] INFO -- Node 2: Communication to Node 4 opened
2005-05-31 17:45:49 [MgmSrvr] INFO -- Node 2: Communication to Node 5 opened
2005-05-31 17:45:49 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 freed, m_reserved_nodes 0000000000000012.
2005-05-31 17:45:49 [MgmSrvr] INFO -- Node 1: Node 4 Connected
2005-05-31 17:45:50 [MgmSrvr] INFO -- Node 4: Start phase 0 completed
2005-05-31 17:45:50 [MgmSrvr] INFO -- Node 4: Communication to Node 2 opened
2005-05-31 17:45:50 [MgmSrvr] INFO -- Node 4: Communication to Node 3 opened
2005-05-31 17:45:50 [MgmSrvr] INFO -- Node 4: Communication to Node 5 opened
2005-05-31 17:45:50 [MgmSrvr] INFO -- Node 4: Node 2 Connected
2005-05-31 17:45:50 [MgmSrvr] INFO -- Node 2: Node 4 Connected
2005-05-31 17:45:50 [MgmSrvr] INFO -- Mgmt server state: nodeid 4 freed, m_reserved_nodes 0000000000000002.
2005-05-31 17:45:51 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 reserved for ip 10.100.1.94, m_reserved_nodes 000000000000000a.
2005-05-31 17:45:52 [MgmSrvr] INFO -- Mgmt server state: nodeid 5 reserved for ip 10.100.1.94, m_reserved_nodes 000000000000002a.
2005-05-31 17:45:52 [MgmSrvr] INFO -- Node 1: Node 3 Connected
2005-05-31 17:45:53 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 freed, m_reserved_nodes 0000000000000022.
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 1: Node 5 Connected
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 2: Node 3 Connected
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 3: Node 2 Connected
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 5: Start phase 0 completed
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 5: Communication to Node 2 opened
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 5: Communication to Node 3 opened
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 5: Communication to Node 4 opened
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 5: Node 2 Connected
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 5: Node 3 Connected
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 5: Node 4 Connected
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 2: Node 5 Connected
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 3: Node 5 Connected
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 4: Node 5 Connected
2005-05-31 17:45:53 [MgmSrvr] INFO -- Node 2: Start phase 1 completed
2005-05-31 17:45:54 [MgmSrvr] INFO -- Mgmt server state: nodeid 5 freed, m_reserved_nodes 0000000000000002.
2005-05-31 17:45:56 [MgmSrvr] INFO -- Node 3: CM_REGCONF president = 2, own Node = 3, our dynamic id = 2
2005-05-31 17:45:56 [MgmSrvr] INFO -- Node 2: Node 3: API version 5.1.0
2005-05-31 17:45:56 [MgmSrvr] INFO -- Node 3: Node 2: API version 5.1.0
2005-05-31 17:45:56 [MgmSrvr] INFO -- Node 3: Start phase 1 completed
2005-05-31 17:45:56 [MgmSrvr] INFO -- Node 4: CM_REGCONF president = 2, own Node = 4, our dynamic id = 3
2005-05-31 17:46:38 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-05-31 17:46:39 [MgmSrvr] INFO -- Node 1: Node 5 Connected
2005-05-31 17:46:39 [MgmSrvr] INFO -- Node 1: Node 3 Connected
2005-05-31 17:46:39 [MgmSrvr] INFO -- Node 1: Node 4 Connected
2005-05-31 17:47:08 [MgmSrvr] INFO -- Shutting down server...
2005-05-31 17:47:08 [MgmSrvr] INFO -- Shutdown complete
2005-05-31 17:47:11 [MgmSrvr] INFO -- NDB Cluster Management Server. Version 5.1.0 (rowrepl_drop1)
2005-05-31 17:47:11 [MgmSrvr] INFO -- Id: 1, Command port: 14000
2005-05-31 17:47:15 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 reserved for ip 10.100.1.94, m_reserved_nodes 000000000000000a.
2005-05-31 17:47:16 [MgmSrvr] INFO -- Node 1: Node 3 Connected
2005-05-31 17:47:17 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 freed, m_reserved_nodes 0000000000000002.
2005-05-31 17:47:20 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 reserved for ip 10.100.1.93, m_reserved_nodes 0000000000000006.
2005-05-31 17:47:21 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-05-31 17:47:22 [MgmSrvr] INFO -- Node 2: Start phase 0 completed
2005-05-31 17:47:22 [MgmSrvr] INFO -- Node 2: Communication to Node 3 opened
2005-05-31 17:47:22 [MgmSrvr] INFO -- Node 2: Communication to Node 4 opened
2005-05-31 17:47:22 [MgmSrvr] INFO -- Node 2: Communication to Node 5 opened
2005-05-31 17:47:22 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 freed, m_reserved_nodes 0000000000000002.
2005-05-31 17:47:22 [MgmSrvr] INFO -- Mgmt server state: nodeid 5 reserved for ip 10.100.1.94, m_reserved_nodes 0000000000000022.
2005-05-31 17:47:23 [MgmSrvr] INFO -- Node 1: Node 5 Connected
2005-05-31 17:47:24 [MgmSrvr] INFO -- Node 5: Start phase 0 completed
2005-05-31 17:47:24 [MgmSrvr] INFO -- Node 5: Communication to Node 2 opened
2005-05-31 17:47:24 [MgmSrvr] INFO -- Node 5: Communication to Node 3 opened
2005-05-31 17:47:24 [MgmSrvr] INFO -- Node 5: Communication to Node 4 opened
2005-05-31 17:47:24 [MgmSrvr] INFO -- Node 5: Node 2 Connected
2005-05-31 17:47:24 [MgmSrvr] INFO -- Node 5: Node 3 Connected
2005-05-31 17:47:24 [MgmSrvr] INFO -- Node 2: Node 5 Connected
2005-05-31 17:47:24 [MgmSrvr] INFO -- Node 3: Node 5 Connected
2005-05-31 17:47:24 [MgmSrvr] INFO -- Mgmt server state: nodeid 5 freed, m_reserved_nodes 0000000000000002.
2005-05-31 17:47:25 [MgmSrvr] INFO -- Mgmt server state: nodeid 4 reserved for ip 10.100.1.93, m_reserved_nodes 0000000000000012.
2005-05-31 17:47:26 [MgmSrvr] INFO -- Node 1: Node 4 Connected
2005-05-31 17:47:27 [MgmSrvr] INFO -- Node 4: Start phase 0 completed
2005-05-31 17:47:27 [MgmSrvr] INFO -- Node 4: Communication to Node 2 opened
2005-05-31 17:47:27 [MgmSrvr] INFO -- Node 4: Communication to Node 3 opened
2005-05-31 17:47:27 [MgmSrvr] INFO -- Node 4: Communication to Node 5 opened
2005-05-31 17:47:27 [MgmSrvr] INFO -- Node 4: Node 2 Connected
2005-05-31 17:47:27 [MgmSrvr] INFO -- Node 4: Node 3 Connected
2005-05-31 17:47:27 [MgmSrvr] INFO -- Node 2: Node 4 Connected
2005-05-31 17:47:27 [MgmSrvr] INFO -- Node 3: Node 4 Connected
2005-05-31 17:47:27 [MgmSrvr] INFO -- Mgmt server state: nodeid 4 freed, m_reserved_nodes 0000000000000002.
2005-05-31 17:47:52 [MgmSrvr] INFO -- Node 2: Start phase 1 completed
2005-05-31 17:47:54 [MgmSrvr] INFO -- Node 4: CM_REGCONF president = 2, own Node = 4, our dynamic id = 2
2005-05-31 17:47:54 [MgmSrvr] INFO -- Node 2: Node 4: API version 5.1.0
2005-05-31 17:47:54 [MgmSrvr] INFO -- Node 4: Node 2: API version 5.1.0
2005-05-31 17:47:54 [MgmSrvr] INFO -- Node 4: Start phase 1 completed
2005-05-31 17:47:54 [MgmSrvr] INFO -- Node 5: CM_REGCONF president = 2, own Node = 5, our dynamic id = 3
2005-05-31 17:50:14 [MgmSrvr] INFO -- Node 1: Node 3 Connected
2005-05-31 17:50:14 [MgmSrvr] INFO -- Node 1: Node 4 Connected
2005-05-31 17:50:14 [MgmSrvr] INFO -- Node 1: Node 5 Connected
2005-05-31 17:50:15 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-05-31 17:50:15 [MgmSrvr] INFO -- Shutting down server...
2005-05-31 17:50:15 [MgmSrvr] INFO -- Shutdown complete
2005-05-31 17:50:28 [MgmSrvr] INFO -- NDB Cluster Management Server. Version 5.1.0 (rowrepl_drop1)
2005-05-31 17:50:28 [MgmSrvr] INFO -- Id: 1, Command port: 14000
2005-05-31 17:50:43 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 reserved for ip 10.100.1.93, m_reserved_nodes 0000000000000006.
2005-05-31 17:50:44 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-05-31 17:50:45 [MgmSrvr] INFO -- Node 2: Start phase 0 completed
2005-05-31 17:50:45 [MgmSrvr] INFO -- Node 2: Communication to Node 3 opened
2005-05-31 17:50:45 [MgmSrvr] INFO -- Node 2: Communication to Node 4 opened
2005-05-31 17:50:45 [MgmSrvr] INFO -- Node 2: Communication to Node 5 opened
2005-05-31 17:50:45 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 freed, m_reserved_nodes 0000000000000002.
2005-05-31 17:51:05 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 reserved for ip 10.100.1.94, m_reserved_nodes 000000000000000a.
2005-05-31 17:51:06 [MgmSrvr] INFO -- Node 1: Node 3 Connected
2005-05-31 17:51:06 [MgmSrvr] INFO -- Node 3: Start phase 0 completed
2005-05-31 17:51:06 [MgmSrvr] INFO -- Node 3: Communication to Node 2 opened
2005-05-31 17:51:06 [MgmSrvr] INFO -- Node 3: Communication to Node 4 opened
2005-05-31 17:51:06 [MgmSrvr] INFO -- Node 3: Communication to Node 5 opened
2005-05-31 17:51:06 [MgmSrvr] INFO -- Node 2: Node 3 Connected
2005-05-31 17:51:06 [MgmSrvr] INFO -- Node 3: Node 2 Connected
2005-05-31 17:51:07 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 freed, m_reserved_nodes 0000000000000002.
2005-05-31 17:51:18 [MgmSrvr] INFO -- Node 2: Start phase 1 completed
2005-05-31 17:51:18 [MgmSrvr] INFO -- Node 3: CM_REGCONF president = 2, own Node = 3, our dynamic id = 2
2005-05-31 17:51:18 [MgmSrvr] INFO -- Node 2: Node 3: API version 5.1.0
2005-05-31 17:51:18 [MgmSrvr] INFO -- Node 3: Node 2: API version 5.1.0
2005-05-31 17:51:18 [MgmSrvr] INFO -- Node 3: Start phase 1 completed
2005-05-31 17:51:32 [MgmSrvr] INFO -- Mgmt server state: nodeid 4 reserved for ip 10.100.1.93, m_reserved_nodes 0000000000000012.
2005-05-31 17:51:33 [MgmSrvr] INFO -- Node 1: Node 4 Connected
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 4: Start phase 0 completed
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 4: Communication to Node 2 opened
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 4: Communication to Node 3 opened
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 4: Communication to Node 5 opened
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 3: Node 4 Connected
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 2: Node 4 Connected
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 4: Node 2 Connected
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 4: Node 3 Connected
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 4: CM_REGCONF president = 2, own Node = 4, our dynamic id = 3
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 2: Node 4: API version 5.1.0
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 3: Node 4: API version 5.1.0
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 4: Node 2: API version 5.1.0
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 4: Node 3: API version 5.1.0
2005-05-31 17:51:34 [MgmSrvr] INFO -- Node 4: Start phase 1 completed
2005-05-31 17:51:34 [MgmSrvr] INFO -- Mgmt server state: nodeid 4 freed, m_reserved_nodes 0000000000000002.
2005-05-31 17:51:55 [MgmSrvr] INFO -- Mgmt server state: nodeid 5 reserved for ip 10.100.1.94, m_reserved_nodes 0000000000000022.
2005-05-31 17:51:56 [MgmSrvr] INFO -- Node 1: Node 5 Connected
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Start phase 0 completed
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Communication to Node 2 opened
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Communication to Node 3 opened
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Communication to Node 4 opened
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 3: Node 5 Connected
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 2: Node 5 Connected
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Node 2 Connected
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Node 3 Connected
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Node 4 Connected
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 4: Node 5 Connected
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: CM_REGCONF president = 2, own Node = 5, our dynamic id = 4
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 2: Node 5: API version 5.1.0
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 4: Node 5: API version 5.1.0
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 3: Node 5: API version 5.1.0
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Node 2: API version 5.1.0
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Node 3: API version 5.1.0
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Node 4: API version 5.1.0
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Start phase 1 completed
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Start phase 2 completed (initial start)
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 4: Start phase 2 completed (initial start)
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 3: Start phase 2 completed (initial start)
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 2: Start phase 2 completed (initial start)
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 5: Start phase 3 completed (initial start)
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 4: Start phase 3 completed (initial start)
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 3: Start phase 3 completed (initial start)
2005-05-31 17:51:57 [MgmSrvr] INFO -- Node 2: Start phase 3 completed (initial start)
2005-05-31 17:51:57 [MgmSrvr] INFO -- Mgmt server state: nodeid 5 freed, m_reserved_nodes 0000000000000002.
2005-05-31 17:54:27 [MgmSrvr] INFO -- Node 2: Start phase 4 completed (initial start)
2005-05-31 17:54:27 [MgmSrvr] INFO -- Node 4: Start phase 4 completed (initial start)
2005-05-31 17:54:27 [MgmSrvr] INFO -- Node 3: Start phase 4 completed (initial start)
2005-05-31 17:54:27 [MgmSrvr] INFO -- Node 5: Start phase 4 completed (initial start)
2005-05-31 17:54:33 [MgmSrvr] INFO -- Node 2: Local checkpoint 1 started. Keep GCI = 1 oldest restorable GCI = 1
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Start phase 5 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Start phase 6 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: President restarts arbitration thread [state=1]
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Start phase 5 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Start phase 6 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Start phase 7 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Start phase 5 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Start phase 6 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Start phase 7 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Start phase 5 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Start phase 6 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Start phase 7 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Start phase 7 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Communication to Node 6 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Communication to Node 7 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Communication to Node 8 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Communication to Node 9 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Communication to Node 10 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Communication to Node 11 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Communication to Node 12 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Communication to Node 13 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Communication to Node 14 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Communication to Node 14 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Communication to Node 15 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Start phase 8 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Start phase 9 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Communication to Node 6 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Communication to Node 7 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Communication to Node 8 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Communication to Node 9 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Communication to Node 10 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Communication to Node 11 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Communication to Node 12 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Communication to Node 13 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Communication to Node 14 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Start phase 8 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Start phase 9 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Communication to Node 6 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Communication to Node 7 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Communication to Node 8 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Communication to Node 9 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Communication to Node 10 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Communication to Node 11 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Communication to Node 12 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Communication to Node 13 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Communication to Node 14 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Communication to Node 15 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Start phase 100 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Start phase 101 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 2: Started (version 5.1.0)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Start phase 100 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Start phase 101 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 5: Started (version 5.1.0)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 3: Started (version 5.1.0)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Communication to Node 6 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Communication to Node 7 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Communication to Node 8 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Communication to Node 9 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Communication to Node 10 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Communication to Node 11 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Communication to Node 12 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Communication to Node 13 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Communication to Node 14 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Communication to Node 15 opened
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Start phase 8 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Start phase 9 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Start phase 100 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Start phase 101 completed (initial start)
2005-05-31 17:54:53 [MgmSrvr] INFO -- Node 4: Started (version 5.1.0)
2005-05-31 17:54:54 [MgmSrvr] INFO -- Node 2: Node 1: API version 5.1.0
2005-05-31 17:54:54 [MgmSrvr] INFO -- Node 4: Node 1: API version 5.1.0
2005-05-31 17:54:54 [MgmSrvr] INFO -- Node 3: Node 1: API version 5.1.0
2005-05-31 17:54:54 [MgmSrvr] INFO -- Node 5: Node 1: API version 5.1.0
2005-05-31 17:54:55 [MgmSrvr] INFO -- Node 4: Prepare arbitrator node 1 [ticket=282300013376646b]
2005-05-31 17:54:55 [MgmSrvr] INFO -- Node 3: Prepare arbitrator node 1 [ticket=282300013376646b]
2005-05-31 17:54:55 [MgmSrvr] INFO -- Node 2: Started arbitrator node 1 [ticket=282300013376646b]
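For anyone who wants to script the manual workaround described above (watching cluster.log until a node has opened communication to its peers before starting the next one), here is a minimal Python sketch. It is not part of NDB or of this report: the function names `highest_phase`, `opened_links` and `ready_for_next` are hypothetical, and the sketch assumes the log entries have exactly the wording shown in the log above. The sample text is copied from the 17:50:45 entries of that log.

```python
import re

# Patterns assume the cluster.log wording quoted in this report.
PHASE_RE = re.compile(r"Node (\d+): Start phase (\d+) completed")
OPENED_RE = re.compile(r"Node (\d+): Communication to Node (\d+) opened")


def highest_phase(log_text):
    """Map each node id to the highest start phase it reported as completed."""
    phases = {}
    for node, phase in PHASE_RE.findall(log_text):
        node, phase = int(node), int(phase)
        phases[node] = max(phases.get(node, -1), phase)
    return phases


def opened_links(log_text):
    """Map each node id to the set of peers it opened communication to."""
    links = {}
    for node, peer in OPENED_RE.findall(log_text):
        links.setdefault(int(node), set()).add(int(peer))
    return links


def ready_for_next(log_text, started, data_nodes):
    """True if every started node has opened communication to all other data nodes."""
    links = opened_links(log_text)
    return all(
        set(data_nodes) - {node} <= links.get(node, set()) for node in started
    )


# Entries copied from the successful run in the log above.
SAMPLE = (
    "2005-05-31 17:50:45 [MgmSrvr] INFO -- Node 2: Start phase 0 completed\n"
    "2005-05-31 17:50:45 [MgmSrvr] INFO -- Node 2: Communication to Node 3 opened\n"
    "2005-05-31 17:50:45 [MgmSrvr] INFO -- Node 2: Communication to Node 4 opened\n"
    "2005-05-31 17:50:45 [MgmSrvr] INFO -- Node 2: Communication to Node 5 opened\n"
)

if __name__ == "__main__":
    print(highest_phase(SAMPLE))
    print(ready_for_next(SAMPLE, started=[2], data_nodes=[2, 3, 4, 5]))
```

Because the patterns are searched across the whole text rather than line by line, entries that run together in the log are still picked up. The log file accumulates several runs, so only the text written since the last management server start should be passed in.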
[2 Jun 2005 6:11]
Stewart Smith
Cannot duplicate on local machine. Have duplicated on ndb08 and ndb09 however. Building my own clone and will add some debugging output to try and work out what is going on. One of the connections (node 3 to node 4) is not being made. The port is being listened to, the mgm server knows the port number, just the connection isn't made.
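A connection that was never made can also be spotted from cluster.log alone. The Python sketch below is only an illustration, not the debugging output mentioned above: `missing_links` is a hypothetical helper, and it assumes that an entry of the form "Node A: Node B Connected" means node A reported a connection to node B. The sample entries are copied from the 17:45 run in the cluster.log quoted earlier in this report.

```python
import re
from itertools import permutations

# Matches "Node A: Node B Connected"; "Node A: Node B: API version ..." does not match.
CONNECTED_RE = re.compile(r"Node (\d+): Node (\d+) Connected")


def missing_links(log_text, data_nodes):
    """Return ordered pairs (a, b) of data nodes where a never reported b as connected."""
    seen = {(int(a), int(b)) for a, b in CONNECTED_RE.findall(log_text)}
    return sorted(pair for pair in permutations(data_nodes, 2) if pair not in seen)


# "Connected" entries copied from the 17:45 run (timestamps omitted for brevity).
SAMPLE = """
Node 1: Node 2 Connected
Node 1: Node 4 Connected
Node 4: Node 2 Connected
Node 2: Node 4 Connected
Node 1: Node 3 Connected
Node 1: Node 5 Connected
Node 2: Node 3 Connected
Node 3: Node 2 Connected
Node 5: Node 2 Connected
Node 5: Node 3 Connected
Node 5: Node 4 Connected
Node 2: Node 5 Connected
Node 3: Node 5 Connected
Node 4: Node 5 Connected
"""

if __name__ == "__main__":
    # Node 1 is the management server, so only data nodes 2 to 5 are checked.
    print(missing_links(SAMPLE, [2, 3, 4, 5]))
```

For these entries the only pairs reported are between nodes 3 and 4, in both directions.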
[6 Jun 2005 2:51]
Stewart Smith
This looks to be in 5.0 as well. I know where the problem is, creating patch later today.
[6 Jun 2005 15:37]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/25652
[7 Jun 2005 3:38]
Stewart Smith
Has been reported to happen in 4.1 as well. Tried my fix on ndb08 and ndb09. Wasn't a fix. Not convinced it doesn't fix a potential problem though. One of the connections between the nodes is being started, but not moving into 'connected'. Will have to investigate further.
[8 Jun 2005 7:42]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/25744
[8 Jun 2005 8:47]
Stewart Smith
This fix is for 5.0 and above. The 4.1 problem is something else, which we are currently investigating. The workaround for this bug would be to have a BasePort in the config.ini, as this was only a problem when using dynamic ports. Pushed to 5.0.
[14 Jun 2005 4:41]
Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release. If necessary, you can access the source repository and build the latest available version, including the bugfix, yourself. More information about accessing the source trees is available at http://www.mysql.com/doc/en/Installing_source_tree.html Additional info: Documented bugfix in 5.0.8 Change History (Cluster) - marked Closed.
[23 Jul 2005 14:58]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/27523