Bug #21815 | mysqld is not informed of cluster shutdown, making slave thread print errors | ||
---|---|---|---|
Submitted: | 24 Aug 2006 18:04 | Modified: | 26 Dec 2006 16:33 |
Reporter: | Jonathan Miller | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Cluster: Replication | Severity: | S3 (Non-critical) |
Version: | mysql-5.1 | OS: | Linux (Linux) |
Assigned to: | | CPU Architecture: | Any |
Tags: | 5.1.12 |
[24 Aug 2006 18:04]
Jonathan Miller
[24 Aug 2006 18:41]
Jonas Oreland
There is no HA property of ndb_mgmd; they can die/restart/die etc. without affecting ndbapi/mysqld/slave (try for yourself), i.e. this is not a bug. The slave _must_ continue to try to apply until it gets a 4009 back. The fact that ndb_mgmd is "outside" the cluster is a nice feature, as it brings down the number of components that affect total availability. (hmm, was that correctly spelled)
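The retry rule stated in this comment can be sketched roughly as follows. This is an illustrative Python sketch, not mysqld code: `NdbError`, `apply_with_retry`, `apply_event`, and the `max_attempts` cap are hypothetical names introduced here; only the error code 4009 comes from the thread.

```python
CLUSTER_FAILURE = 4009  # NDB error code reported as 'Cluster Failure' in this thread


class NdbError(Exception):
    """Hypothetical exception carrying an NDB error code."""

    def __init__(self, code):
        super().__init__(f"NDB error {code}")
        self.code = code


def apply_with_retry(apply_event, max_attempts=10):
    """Call apply_event until it succeeds or the cluster is reported gone.

    Returns the number of attempts used on success. Re-raises NdbError
    immediately on 4009, or after max_attempts for any other NDB error.
    The attempt cap is an assumption added to keep the sketch finite.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            apply_event()
            return attempt
        except NdbError as err:
            if err.code == CLUSTER_FAILURE:
                raise  # connection to the cluster is lost: stop applying
            last_error = err  # any other error: try again
    raise last_error
```

In this sketch an error such as 4023 leads to another attempt, while 4009 ends the loop at once.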
[24 Aug 2006 19:17]
Jonathan Miller
This has nothing to do with ndb_mgmd; it has to do with mysqld seeing that the cluster is gone and stopping the SQL thread correctly instead of continuing to try to do writes. /jeb
[24 Aug 2006 19:32]
Jonas Oreland
I don't understand (my comments are marked with **)

** these 3 statements do not mean that the cluster is down

Management server closed connection early. It is probably being shut down (or has problems). We will retry the connection.

Management server closed connection early. It is probably being shut down (or has problems). We will retry the connection.

Management server closed connection early. It is probably being shut down (or has problems). We will retry the connection.

But yet the slave process (i.e. the SQL thread) still tries to insert into the database anyway and fails:

** this does not mean that the cluster has shut down

060824 14:26:01 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table dbt2.district, Error_code: 4023

** this does not mean that the cluster has shut down

060824 14:26:10 [ERROR] Slave: Error 'Got temporary error 286 'Node failure caused abort of transaction' from NDBCLUSTER' on query. Default database: ''. Query: 'COMMIT', Error_code: 1297

** this does not mean that the cluster has shut down

060824 16:17:53 [ERROR] Slave: Error in Write_rows event: error during transaction execution on table dbt2.order_line, Error_code: 4023

060824 16:17:53 [ERROR] Slave: Error in Write_rows event: when locking tables, Error_code: -1

** these 4 (at the same time) mean that mysqld (slave) has lost its connection to the cluster

060824 16:17:53 [ERROR] Slave (additional info): Can't lock file (errno: 4009) Error_code: 1015

060824 16:17:53 [Warning] Slave: Got error 4009 'Cluster Failure' from NDB Error_code: 1296

060824 16:17:53 [Warning] Slave: Can't lock file (errno: 4009) Error_code: 1015

060824 16:17:53 [Warning] Slave: Unknown error Error_code: 1105

---

1) Does it continue to issue warnings/errors after that?

2) Or do you want the feature that ndbd should synchronize with whatever mysqlds are out there during graceful shutdown?
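The distinction drawn in this comment (4023 and 1297 alone do not show the cluster is gone, while the lines carrying 4009 do) can be sketched as a small log check. This is an illustrative Python sketch built only from the log lines quoted in the thread; the function names and the regular expression are assumptions introduced here, not part of mysqld or any MySQL tool.

```python
import re

# Matches "errno: 4009" or "error 4009" inside a log line; 4009 is the NDB
# code reported as 'Cluster Failure' in this thread.
CLUSTER_FAILURE_RE = re.compile(r"\b(?:errno: |error )4009\b")


def mentions_cluster_failure(line):
    """Return True if a single error-log line carries NDB error 4009."""
    return CLUSTER_FAILURE_RE.search(line) is not None


def lost_cluster_connection(lines):
    """Return True if any line in the log excerpt reports NDB error 4009."""
    return any(mentions_cluster_failure(line) for line in lines)


# Two lines quoted in the thread: only the second one reports 4009.
excerpt = [
    "060824 14:26:01 [ERROR] Slave: Error in Write_rows event: error during "
    "transaction execution on table dbt2.district, Error_code: 4023",
    "060824 16:17:53 [Warning] Slave: Got error 4009 'Cluster Failure' from "
    "NDB Error_code: 1296",
]
print(lost_cluster_connection(excerpt[:1]))  # False
print(lost_cluster_connection(excerpt))      # True
```

The check deliberately ignores the trailing `Error_code:` value, since in the quoted logs the 4009 appears in the message text rather than in that field.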
[25 Aug 2006 13:23]
Jonathan Miller
Here is another example that I did today:

060825 15:17:26 [ERROR] Slave: Error in Delete_rows event: row application failed, Error_code: 4023

060825 15:17:26 [ERROR] Slave: Error in Delete_rows event: error during transaction execution on table dbt2.stock, Error_code: 4023

060825 15:17:26 [ERROR] Slave: Error in Write_rows event: when locking tables, Error_code: -1

060825 15:17:26 [ERROR] Slave (additional info): Can't lock file (errno: 4009) Error_code: 1015

060825 15:17:26 [Warning] Slave: Got error 4009 'Cluster Failure' from NDB Error_code: 1296

060825 15:17:26 [Warning] Slave: Can't lock file (errno: 4009) Error_code: 1015

060825 15:17:26 [Warning] Slave: Unknown error Error_code: 1105

060825 15:17:26 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'master2.000011' position 4532994

Management server closed connection early. It is probably being shut down (or has problems). We will retry the connection.

Management server closed connection early. It is probably being shut down (or has problems). We will retry the connection.

Management server closed connection early. It is probably being shut down (or has problems). We will retry the connection.

Restarting or even stopping ndb_mgmd does not produce any errors on the mysqld side. The cluster should communicate a graceful shutdown to mysqld so that "failure" messages are not printed in the mysqld error log.
[25 Aug 2006 14:12]
Jonas Oreland
Jonas' questions:

1) Does it continue to issue warnings/errors after that?

2) Or do you want the feature that ndbd should synchronize with whatever mysqlds are out there during graceful shutdown?

I don't know the answer to 1), but I think you mean yes on 2).

---

2) is a sane feature request... Actually, management of the entire replication solution sucks imho, and I think one should compile a full list of new features that will make it easier to handle in general, and for cluster in particular, and then prioritize this list. But I doubt any such thing will happen before tomas is back.

---

A small note: I am changing the title of this bug report, as mysqld itself is not informed of graceful cluster shutdown. I changed it to "mysqld is not informed of cluster shutdown, making slave print errors"

/Jonas
[25 Aug 2006 14:43]
Jonathan Miller
1) does it continue to issue warnings/errors after that ? No, it does not.