Bug #54613 Skipping corrupted tables with ndb_restore
Submitted: 18 Jun 2010 11:05
Modified: 13 Dec 2010 2:52
Reporter: Geert Vanderkelen
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: mysql-5.1-telco-7.0
OS: Any
Assigned to: Jonas Oreland
CPU Architecture: Any

[18 Jun 2010 11:05] Geert Vanderkelen
Description:
It should be possible to restore a MySQL Cluster backup even when a table is broken, for example because the parts tables for its BLOB fields are missing.
Currently, with MySQL Cluster 6.3 and 7.0 (at least 7.0.14), it is possible for a table to be only partially dropped: the parts tables for the BLOB fields are gone, but the table itself remains.
This in itself is probably some kind of bug, but it should be possible to exclude this 'broken' table from a backup so we can at least restore the other tables.

How to repeat:
13    UserTable            Online   Yes     ham          def      ndb_t1
3304  TableEvent           Online   -                             NDB$BLOBEVENT_REPL$ham/ndb_t1_1
3305  TableEvent           Online   -                             NDB$BLOBEVENT_REPL$ham/ndb_t1_2
3306  TableEvent           Online   -                             NDB$BLOBEVENT_REPL$ham/ndb_t1_3

The above (simplified) output of ndb_show_tables shows that the BLOBEVENTs exist but the parts tables do not: NDB$BLOB_13_{1,2,3} are missing.
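
As an illustration of the naming pattern referred to above, here is a minimal sketch that builds and parses names of the form NDB$BLOB_<table id>_<column no> and lists which parts tables are missing. It is not the NDB implementation: blob_parts_name, parse_blob_parts_name and missing_blob_parts are hypothetical helpers, and the pattern is assumed only from the names shown in this report.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: builds the parts table name for one blob column,
// assuming the NDB$BLOB_<table id>_<column no> pattern seen in this report.
std::string blob_parts_name(unsigned tab_id, unsigned col_no)
{
  return "NDB$BLOB_" + std::to_string(tab_id) + "_" + std::to_string(col_no);
}

// Reads a run of decimal digits starting at pos; fails if there are none.
static bool read_number(const std::string& s, std::size_t& pos, unsigned& out)
{
  const std::size_t start = pos;
  unsigned value = 0;
  while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
    value = value * 10 + static_cast<unsigned>(s[pos] - '0');
    ++pos;
  }
  out = value;
  return pos > start;
}

// Hypothetical stand-in for a blob-table name check: returns true and fills
// tab_id/col_no only if name is exactly NDB$BLOB_<digits>_<digits>.
bool parse_blob_parts_name(const std::string& name, unsigned& tab_id, unsigned& col_no)
{
  const std::string prefix = "NDB$BLOB_";
  if (name.compare(0, prefix.size(), prefix) != 0)
    return false;
  std::size_t pos = prefix.size();
  if (!read_number(name, pos, tab_id))
    return false;
  if (pos >= name.size() || name[pos] != '_')
    return false;
  ++pos;
  if (!read_number(name, pos, col_no))
    return false;
  return pos == name.size();  // reject trailing characters
}

// Lists the parts tables expected for the given blob columns that are
// absent from the set of object names known to exist.
std::vector<std::string> missing_blob_parts(unsigned tab_id,
                                            const std::vector<unsigned>& blob_cols,
                                            const std::set<std::string>& existing)
{
  std::vector<std::string> missing;
  for (unsigned col : blob_cols) {
    const std::string name = blob_parts_name(tab_id, col);
    if (existing.count(name) == 0)
      missing.push_back(name);
  }
  return missing;
}
```

For table id 13 with blob columns 1, 2 and 3 and no NDB$BLOB_13_* objects present, missing_blob_parts would report all three names, which is the state shown in the listing.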

To get into this situation, it was necessary to allow ndb_drop_table to drop the parts tables.
This was done by changing the following 2 functions in storage/ndb/src/ndbapi/NdbDictionaryImpl.cpp to return false instead of true:

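bool
is_ndb_blob_table(const char* name, Uint32* ptab_id, Uint32* pcol_no)
{
  // Hack for reproducing the bug: never report a blob parts table,
  // so that ndb_drop_table is allowed to drop it.
  return false;
  return DictTabInfo::isBlobTableName(name, ptab_id, pcol_no);  // original check, now unreachable
}

bool
is_ndb_blob_table(const NdbTableImpl* t)
{
  // Same hack for the table-object overload.
  return false;
  return is_ndb_blob_table(t->m_internalName.c_str());  // original check, now unreachable
}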

Suggested fix:
A very quick but dirty hack is to use the --skip-unknown-objects flag together with the --exclude-tables option for ndb_restore.
The attached patch makes the restore continue, although a message is given:

shell> ndb_restore -b 2 -n 2 -m . --skip-unknown-objects --exclude-tables=ham.ndb_t1
..
Table ham/def/ndb_t1 has blob column 1 (c1) with missing parts table in backup.
Connected to ndb!!
Successfully restored table `ndb_hamspam/def/ndb_t1`
Successfully restored table event REPL$ndb_hamspam/ndb_t1
Successfully restored table `test/def/t1`
Successfully restored table event REPL$test/t1
Successfully restored table `test/def/ndb_t1`
Successfully restored table event REPL$test/ndb_t1
Successfully created index `PRIMARY` on `ndb_t1`
Successfully created index `PRIMARY` on `t1`
Successfully created index `PRIMARY` on `ndb_t1`

It would be much better if --exclude-tables kicked in here, but with the source as it is right now, that's a big change, or just way too much extra dev. Maybe we can add an option --force?
[18 Jun 2010 11:18] Geert Vanderkelen
bzr diff will follow.. when it's done branching..

A diff using MySQL Cluster 7.1.4b:

--- storage/ndb/tools/restore/Restore.cpp	2010-06-18 11:09:01.000000000 +0200
+++ ../mysql-cluster-gpl-7.1.4b/storage/ndb/tools/restore/Restore.cpp	2010-06-09 12:26:00.000000000 +0200
@@ -295,7 +295,7 @@
   }
   if (!markSysTables())
     return 0;
-  if (!fixBlobs() && !ga_skip_unknown_objects)
+  if (!fixBlobs())
     return 0;
   if(!readGCPEntry())
     return 0;
[18 Jun 2010 11:40] Geert Vanderkelen
Dirty patch using an option which is not supposed to be used for this (and a global!)

Attachment: bug54613_dirty.patch (application/octet-stream, text), 424 bytes.

[3 Dec 2010 9:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/125899

3359 Jonas Oreland	2010-12-03
      ndb - bug#54613 - add new option --skip-broken-object that allows ndb_restore to carry on even if finding corrupt tables in backup file (currently it only handles case with missing blob-tables)
[3 Dec 2010 9:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/125904

3359 Jonas Oreland	2010-12-03
      ndb - bug#54613 - add new option --skip-broken-object that allows ndb_restore to carry on even if finding corrupt tables in backup file (currently it only handles case with missing blob-tables)
[3 Dec 2010 9:47] Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.51-ndb-6.3.40 (revid:jonas@mysql.com-20101203092944-gk7c8777qj1gycu0) (version source revid:jonas@mysql.com-20101203092944-gk7c8777qj1gycu0) (merge vers: 5.1.51-ndb-6.3.40) (pib:23)
[3 Dec 2010 9:48] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.21 (revid:jonas@mysql.com-20101203093645-6k4l9nmu50xz7zbv) (version source revid:jonas@mysql.com-20101203093645-6k4l9nmu50xz7zbv) (merge vers: 5.1.51-ndb-7.0.21) (pib:23)
[3 Dec 2010 12:44] Jonas Oreland
pushed to 6.3.40, 7.0.21 and 7.1.10
[13 Dec 2010 2:52] Jon Stephens
Documented in the NDB-6.3.40, 7.0.21, and 7.1.10 changelogs as follows:

        Added the --skip-broken-objects option for ndb_restore. This 
        option causes ndb_restore to ignore tables corrupted due to
        missing blob parts tables, and to continue reading from the 
        backup file and restoring the remaining tables.
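
The behaviour described in this changelog entry can be sketched roughly as follows. This is not the code that was committed for this bug: BackupTable, RestorePlan, has_all_blob_parts and plan_restore are hypothetical names, and the backup metadata is reduced to a table id, a name and a list of blob column numbers purely for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of one table's metadata in a backup.
struct BackupTable {
  unsigned id;
  std::string name;
  std::vector<unsigned> blob_columns;  // column numbers that hold blob data
};

// Hypothetical result of deciding what to restore.
struct RestorePlan {
  bool ok;                              // false means the restore would abort
  std::vector<std::string> to_restore;  // tables that will be restored
  std::vector<std::string> skipped;     // broken tables that were ignored
};

// A table counts as intact when every blob column has its parts table,
// assumed here to be named NDB$BLOB_<table id>_<column no>, in the backup.
bool has_all_blob_parts(const BackupTable& t,
                        const std::set<std::string>& names_in_backup)
{
  for (unsigned col : t.blob_columns) {
    const std::string parts =
        "NDB$BLOB_" + std::to_string(t.id) + "_" + std::to_string(col);
    if (names_in_backup.count(parts) == 0)
      return false;
  }
  return true;
}

// Without skip_broken, the first broken table aborts the plan.
// With skip_broken, broken tables are recorded and the rest are restored.
RestorePlan plan_restore(const std::vector<BackupTable>& tables,
                         const std::set<std::string>& names_in_backup,
                         bool skip_broken)
{
  RestorePlan plan;
  plan.ok = true;
  for (const BackupTable& t : tables) {
    if (has_all_blob_parts(t, names_in_backup)) {
      plan.to_restore.push_back(t.name);
    } else if (skip_broken) {
      plan.skipped.push_back(t.name);
    } else {
      plan.ok = false;
      break;
    }
  }
  return plan;
}
```

With the broken ham/def/ndb_t1 from this report plus one intact table, the sketch aborts when skip_broken is false and restores only the intact table when it is true.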

Also updated ndb_restore description and related portions of Cluster docs with info about the new option.

Closed.