Bug #17788 | LCP should start on out of Redo, ndb_restore should retry more | ||
---|---|---|---|
Submitted: | 28 Feb 2006 15:57 | Modified: | 11 Sep 2009 7:33 |
Reporter: | Johan Andersson | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S3 (Non-critical) |
Version: | mysql-4.1 | OS: | Any (*) |
Assigned to: | | CPU Architecture: | Any |
Tags: | 4.1-> |
[28 Feb 2006 15:57]
Johan Andersson
[22 Jan 2007 12:14]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/18534 ChangeSet@1.2539, 2007-01-22 20:11:07+08:00, gni@dev3-221.dev.cn.tlan +7 -0 BUG#17788 ndb_restore has more 'adaptive' functions. When the temporary error 410 occurs, it will send a signal to start an LCP immediately.
[24 Jan 2007 10:25]
Guangbao Ni
How to reproduce it:
1. Create a table and insert records into it. The table must be larger than the redo log. (For example, with the default values of TimeBetweenLocalCheckpoints and NoOfFragmentLogFiles, the table must be larger than 800M.) Alternatively, set a large value for TimeBetweenLocalCheckpoints and a small value for NoOfFragmentLogFiles, and then a small table is enough.
2. Start a backup in ndb_mgm.
3. Restart the NDB cluster with the --initial option.
4. Run ndb_restore with the backup data.
During the ndb_restore run, you will get the error message
[29 Mar 2007 7:44]
Stewart Smith
What's the status with this bug? Last conversation I can find is at the end of January (and I think we had some IRC discussion too). Basically saying that we should be able to trigger the start LCP from kernel on error instead... with me not liking the use of the dump interface here. current status?
[24 Apr 2007 11:20]
Guangbao Ni
Hi Jonas, Stewart suggests that I should either define a new signal to start an LCP (and, if I use the NDB API, add a new interface to the Ndb class), or trigger the LCP start from the kernel on error instead. Whichever method I adopt, new signals must be defined. What do you think? Please give your suggestion. Thanks! /Guangbao Ni
[27 Jun 2007 4:00]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/29661 ChangeSet@1.2473, 2007-06-27 09:36:59+08:00, gni@dev3-221.dev.cn.tlan +9 -0 BUG#17788 ndb_restore is too static in its behavior.
[6 Jul 2007 4:21]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/30420 ChangeSet@1.2473, 2007-07-06 11:49:16+08:00, gni@dev3-221.dev.cn.tlan +9 -0 BUG#17788 ndb_restore is too static in its behavior.
[19 Jul 2007 5:14]
Stewart Smith
Looks okay to me. Since Jonas is on vacation, Pekka - can you have a quick look too? I think we should only apply this to 5.1 though. I would still like a test case.
[27 Jul 2007 5:09]
Stewart Smith
please also check what ndb_restore does in the temporary error situation... does it retry for long enough? or does it give up at some point? if it gives up.... this is a problem with large LCP
[27 Jul 2007 10:06]
Guangbao Ni
Hi Stewart, before the fix, it aborts after 10 retries of the same transaction. The patch solves this problem by making ndb_restore recover by itself from the temporary error.
[6 Aug 2007 2:02]
Stewart Smith
I think we should continue to retry (not limit it to 10). Naturally displaying some kind of warning though.
[14 Aug 2007 3:37]
Stewart Smith
Setting back to In Progress as there is still something to be done.
[15 Aug 2007 8:26]
Guangbao Ni
Hi Stewart, if a test case needs to insert an error into the ndbd kernel, should it use NdbTamper() (NDB API) and the NDB_TAMPER signal? And should the test case be put in the ndb/test/ndbapi directory?
[17 Aug 2007 1:44]
Stewart Smith
There's an mgmapi function to do it:

    /**
     * Provoke an error.
     *
     * @param handle the NDB management handle.
     * @param nodeId the node id.
     * @param errorCode the errorCode.
     * @param reply the reply message.
     * @return 0 if successful or an error code.
     */
    int ndb_mgm_insert_error(NdbMgmHandle handle, int nodeId, int errorCode, struct ndb_mgm_reply* reply);
[6 Sep 2007 1:44]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/33774 ChangeSet@1.2473, 2007-09-06 09:36:12+08:00, gni@dev3-221.dev.cn.tlan +11 -0 BUG#17788 LCP should start on out of Redo, ndb_restore should retry more.
[12 Nov 2007 2:55]
Stewart Smith
I think the test program is missing from the patch. also, this makes retries==100, not "infinite" in restore.
[14 Nov 2007 2:38]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/37720 ChangeSet@1.2473, 2007-11-14 10:24:52+08:00, gni@dev3-221.dev.cn.tlan +12 -0 BUG#17788 LCP should start on out of Redo, ndb_restore should retry more.