Bug #57347 | Mysql cluster missed heartbeat | ||
---|---|---|---|
Submitted: | 9 Oct 2010 5:39 | Modified: | 9 Mar 2011 13:43 |
Reporter: | Sebastian Stach | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S2 (Serious) |
Version: | mysql-5.1-telco-7.1 | OS: | Linux (Debian Lenny) |
Assigned to: | | CPU Architecture: | Any |
Tags: | missed heartbeat, MySQL Cluster, mysql-5.1.44 ndb-7.1.9a, ndb |
[9 Oct 2010 5:39]
Sebastian Stach
[9 Oct 2010 5:40]
Sebastian Stach
ndb_3_trace
Attachment: ndb_3_trace.log.rar (application/octet-stream, text), 27.50 KiB.
[9 Mar 2011 13:43]
Sebastian Stach
I have this problem almost every week. Right now I am seeing disconnects and reconnects on the SQL nodes, with no missed heartbeat. These are the entries from ndb_1_cluster.log:

2011-02-24 10:33:21 [MgmtSrvr] ALERT -- Node 3: Node 7 Disconnected
2011-02-24 10:33:21 [MgmtSrvr] INFO -- Node 3: Communication to Node 7 closed
2011-02-24 10:33:21 [MgmtSrvr] INFO -- Node 2: Communication to Node 7 closed
2011-02-24 10:33:21 [MgmtSrvr] ALERT -- Node 2: Node 7 Disconnected
2011-02-24 10:33:23 [MgmtSrvr] INFO -- Node 2: Communication to Node 7 opened
2011-02-24 10:33:23 [MgmtSrvr] INFO -- Node 2: Node 7 Connected
2011-02-24 10:33:23 [MgmtSrvr] INFO -- Node 2: Node 7: API mysql-5.1.51 ndb-7.1.9
2011-02-24 10:33:25 [MgmtSrvr] INFO -- Node 3: Communication to Node 7 opened
2011-02-24 10:33:25 [MgmtSrvr] INFO -- Node 3: Node 7 Connected
2011-02-24 10:33:25 [MgmtSrvr] INFO -- Node 3: Node 7: API mysql-5.1.51 ndb-7.1.9

These are the logs from node 7. The reconnection always takes about 4 s.

110224 10:33:21 [ERROR] Got error 4028 when reading table './ps/aa_nodes'
110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/brds'
110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/brds'
110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/cats'
110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/cats'
110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/actions'
110224 10:33:21 [Note] NDB Binlog: Node: 2, down, Subscriber bitmask 00
110224 10:33:21 [Note] NDB Binlog: Node: 3, down, Subscriber bitmask 00
110224 10:33:21 [Note] NDB Binlog: cluster failure for ./mysql/ndb_schema at epoch 28285206/7.
110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/actions'
110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/actions'
110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/actions'
110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/cats'
110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/cats'
110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/cats'
110224 10:33:21 [ERROR] Got error 157 when reading table './ps/actions'
110224 10:33:21 [ERROR] Got error 157 when reading table './ps/users'
110224 10:33:21 [ERROR] /usr/local/mysql/bin/mysqld: Incorrect information in file: './ps/categories_premium.frm'
110224 10:33:21 [ERROR] Got error 157 when reading table './ps/actions_status'
110224 10:33:21 [ERROR] Got error 157 when reading table './ps/actions'
110224 10:33:21 [ERROR] Got error 157 when reading table './ps/actions'
110224 10:33:21 [ERROR] Got error 157 when reading table './ps/actions'
110224 10:33:21 [ERROR] Got error 157 when reading table './ps/actions'
110224 10:33:21 [ERROR] Got error 157 when reading table './ps/actions'
110224 10:33:21 [ERROR] Got error 157 when reading table './ps/cats'
110224 10:33:21 [Note] NDB Binlog: cluster failure for ./mysql/ndb_apply_status at epoch 28285206/7.
110224 10:33:21 [Note] Restarting Cluster Binlog
110224 10:33:21 [ERROR] /usr/local/mysql/bin/mysqld: Incorrect information in file: './ps/cats.frm'
110224 10:33:21 [ERROR] /usr/local/mysql/bin/mysqld: Incorrect information in file: './ps/users_places.frm'
110224 10:33:21 [ERROR] Got error 157 when reading table './ps/users'
110224 10:33:21 [ERROR] Got error 157 when reading table './ps/actions'
110224 10:33:21 [ERROR] /usr/local/mysql/bin/mysqld: Incorrect information in file: './ps/cats.frm'
110224 10:33:21 [ERROR] /usr/local/mysql/bin/mysqld: Incorrect information in .......
110224 10:33:22 [ERROR] /usr/local/mysql/bin/mysqld: Incorrect information in file: './ps/cats_equipments.frm'
110224 10:33:22 [ERROR] /usr/local/mysql/bin/mysqld: Incorrect information in file: './ps/aa_nodes.frm'
.......
110224 10:33:23 [ERROR] /usr/local/mysql/bin/mysqld: Incorrect information in file: './ps/users.frm'
.......
110224 10:33:24 [Note] table './ps/actions' opened read only
110224 10:33:24 [Note] table './ps/aa_nodes' opened read only
110224 10:33:24 [Note] table './ps/actions' opened read only
110224 10:33:24 [Note] table './ps/banners' opened read only
110224 10:33:24 [Note] table './ps/actions_status' opened read only
110224 10:33:24 [Note] table './ps/actions_status' opened read only
110224 10:33:24 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$mysql/ndb_schema
110224 10:33:25 [Note] NDB Binlog: logging ./mysql/ndb_schema (UPDATED,USE_WRITE)
110224 10:33:25 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$mysql/ndb_apply_status
110224 10:33:25 [Note] NDB Binlog: logging ./mysql/ndb_apply_status (UPDATED,USE_WRITE)
110224 10:33:25 [Note] table './ps/actions_popup' opened read only
110224 10:33:25 [Note] table './ps/users' opened read only
110224 10:33:25 [Note] table './ps/cats' opened read only
110224 10:33:26 [Note] table './ps/company_type_desc' opened read only
110224 10:33:26 [Note] table './ps/company_type_desc' opened read only
110224 10:33:26 [Note] table './ps/actions_status' opened read only
110224 10:33:26 [Note] table './ps/actions_popup' opened read only
2011-02-24 10:33:26 [NdbApi] INFO -- Flushing incomplete GCI:s < 28285211/3
2011-02-24 10:33:26 [NdbApi] INFO -- Flushing incomplete GCI:s < 28285211/3
110224 10:33:26 [Note] NDB Binlog: starting log at epoch 28285211/3
110224 10:33:26 [Note] NDB Binlog: ndb tables writable
110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040
110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040
110224 10:33:26 [Note] NDB Binlog: 
Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 2, subscribe from node 6, Subscriber bitmask 040 110224 10:33:26 [Note] NDB Binlog: Node: 3, subscribe from node 6, Subscriber bitmask 040
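
For anyone comparing incidents week to week, the reconnect time can be measured from the management log instead of read off by eye. The following is only a rough sketch: it assumes the `[MgmtSrvr]` line layout shown in the ndb_1_cluster.log excerpt above, and the function name `reconnect_gaps` is made up for illustration.

```python
import re
from datetime import datetime

# Matches management-server lines such as:
#   2011-02-24 10:33:21 [MgmtSrvr] ALERT -- Node 3: Node 7 Disconnected
# (layout assumed from the excerpt above; real logs may pad with extra spaces)
EVENT = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[MgmtSrvr\] \w+\s+-- "
    r"Node (\d+): Node (\d+) (Disconnected|Connected)\b"
)


def reconnect_gaps(lines):
    """Return (reporting_node, peer_node, seconds) per Disconnected -> Connected pair."""
    down = {}   # (reporter, peer) -> time the disconnect was reported
    gaps = []
    for line in lines:
        m = EVENT.search(line)
        if not m:
            continue  # "Communication ... closed/opened" and version lines are skipped
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        key = (int(m.group(2)), int(m.group(3)))
        if m.group(4) == "Disconnected":
            down[key] = ts
        elif key in down:
            gaps.append((key[0], key[1], (ts - down.pop(key)).total_seconds()))
    return gaps


sample = [
    "2011-02-24 10:33:21 [MgmtSrvr] ALERT -- Node 3: Node 7 Disconnected",
    "2011-02-24 10:33:21 [MgmtSrvr] INFO -- Node 3: Communication to Node 7 closed",
    "2011-02-24 10:33:21 [MgmtSrvr] ALERT -- Node 2: Node 7 Disconnected",
    "2011-02-24 10:33:23 [MgmtSrvr] INFO -- Node 2: Node 7 Connected",
    "2011-02-24 10:33:23 [MgmtSrvr] INFO -- Node 2: Node 7: API mysql-5.1.51 ndb-7.1.9",
    "2011-02-24 10:33:25 [MgmtSrvr] INFO -- Node 3: Node 7 Connected",
]

for reporter, peer, seconds in reconnect_gaps(sample):
    print(f"node {reporter} saw node {peer} back after {seconds:.0f} s")
```

On the excerpt above this reports the gap separately for each data node, which makes it easier to see that node 2 and node 3 did not accept node 7 back at the same moment.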
[17 May 2011 0:38]
Atsushi Terada
Hi, Sebastian. I have exactly the same problem as you and, unfortunately, have not solved it yet. I would like to compare notes with you, and I am sure that doing so will help to resolve it. Could you tell me whether you changed any default settings and, if so, which ones? I ran into the problem with the default settings. I suspect that it is caused by queries with many JOINs, because MySQL Cluster is not good at joining tables. In addition, have you ever seen an "Out of SendBufferMemory" error message? If you have, I would appreciate hearing how you solved it. Regards, Atsushi
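
To answer the "Out of SendBufferMemory" question from the logs themselves, something like the sketch below could be run over the mysqld error log and the cluster log. It is only an illustration: the exact wording of that message in a real log is an assumption (only the phrase quoted above is searched for), the `Got error N when reading table` pattern is taken from the node 7 excerpt earlier in this report, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern taken from the node 7 excerpt, e.g.
#   110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/cats'
GOT_ERROR = re.compile(r"\[ERROR\] Got error (\d+) when reading table")


def summarize(lines, needle="Out of SendBufferMemory"):
    """Count 'Got error N' codes and collect lines containing the needle phrase.

    The needle is matched as a plain substring; how the message is actually
    formatted in the log is an assumption.
    """
    codes = Counter()
    hits = []
    for line in lines:
        m = GOT_ERROR.search(line)
        if m:
            codes[int(m.group(1))] += 1
        if needle in line:
            hits.append(line.rstrip("\n"))
    return codes, hits


sample = [
    "110224 10:33:21 [ERROR] Got error 4028 when reading table './ps/aa_nodes'",
    "110224 10:33:21 [ERROR] Got error 4010 when reading table './ps/brds'",
    "110224 10:33:21 [ERROR] Got error 157 when reading table './ps/actions'",
    "110224 10:33:21 [Note] Restarting Cluster Binlog",
]

codes, hits = summarize(sample)
print(dict(codes))
print(len(hits), "line(s) mention SendBufferMemory")
```

For a real file, the same function can be fed an open file handle, for example `summarize(open(path))`, where `path` is wherever the log lives on that host.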