Bug #54423 Ndb : ndb_restore is too strict about schema matches
Submitted: 11 Jun 2010 9:35 Modified: 7 Jul 2010 14:22
Reporter: Frazer Clement
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S3 (Non-critical)
Version: mysql-5.1-telco-6.3 OS: Any
Assigned to: Frazer Clement CPU Architecture: Any

[11 Jun 2010 9:35] Frazer Clement
Description:
ndb_restore can be used to restore either schema metadata, table data or both from an Ndb backup into a running Cluster.

When table data is being restored and a table already exists in the Cluster, ndb_restore verifies that the table's schema as captured in the backup matches the table's schema in the running Cluster.  This check is necessary to avoid inserting junk data into the Cluster.

Currently, two kinds of schema mismatch can be explicitly allowed, using the following options :
  --exclude-missing-columns
    This allows either :
      - Extra Columns in the backup which are not in the running Cluster - they will be ignored during restore.
      - Extra Columns in the running Cluster which are not in the backup.  They must be nullable and will be set to null.

  --attribute-promotion
    This supports a set of type mappings where data of the type in the backup can be safely and correctly represented in the type in the running cluster.  Examples include Int -> BigInt, Varchar(10) -> Varchar(20) etc.
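
As a rough illustration of the two existing exceptions, the checks can be modelled as below. This is only a sketch in Python, not ndb_restore code: the Col class, the INT_RANK table and both function names are hypothetical, and only the two promotion examples named above (Int -> BigInt, and a Varchar widening) are covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Col:
    """Hypothetical, simplified column description (not an NDB API type)."""
    name: str
    type: str               # e.g. "Int", "BigInt", "Varchar"
    length: int = 0         # only meaningful for Varchar in this sketch
    nullable: bool = False


# Assumed ordering of integer types by width; the real promotion set is larger.
INT_RANK = {"Int": 1, "BigInt": 2}


def promotable(src: Col, dst: Col) -> bool:
    """True if every value of the backup column fits the cluster column."""
    if src.type in INT_RANK and dst.type in INT_RANK:
        return INT_RANK[src.type] <= INT_RANK[dst.type]
    if src.type == dst.type == "Varchar":
        return src.length <= dst.length
    # Anything else must match exactly in this sketch.
    return src.type == dst.type and src.length == dst.length


def plan_missing_columns(backup, cluster):
    """Model of --exclude-missing-columns.

    Returns (ignored, nulled): backup-only columns that are skipped, and
    cluster-only columns that are set to NULL. Raises ValueError if a
    cluster-only column is not nullable.
    """
    backup_names = {c.name for c in backup}
    cluster_names = {c.name for c in cluster}
    ignored = [c.name for c in backup if c.name not in cluster_names]
    nulled = []
    for c in cluster:
        if c.name not in backup_names:
            if not c.nullable:
                raise ValueError(f"column {c.name} is not in the backup and is not nullable")
            nulled.append(c.name)
    return ignored, nulled
```

The point of keeping the two checks separate is that one concerns which columns exist and the other concerns whether a column present on both sides can hold the backed-up values.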

When comparing two columns for equality, ndb_restore uses the Column::equal() method which compares members of the column for equality.  Some of these members do not have to be equal for ndb_restore to be meaningful, safe and useful.  For example :
 - Dynamic       : Is the column stored in dynamic format or not
 - Storage type  : Is the column stored on disk or in memory
 - Default value : What is the column's default value

ndb_restore should be modified to ignore these column attributes which do not necessarily have to match between the backup and the running cluster.  To minimise the chance of user error, all 'ignored' differences should be reported.
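
The proposed behaviour could look roughly like the following. It is a minimal sketch in Python rather than the actual C++ change: columns are modelled as plain dicts, and the attribute names in STRICT and IGNORABLE are assumptions chosen for illustration, not members of the NDB API Column class.

```python
# Hypothetical attribute names; they only mirror the list above.
STRICT = ("name", "type", "length", "nullable")
IGNORABLE = ("dynamic", "storage", "default")


def compare_columns(backup, cluster):
    """Compare two column descriptions given as dicts.

    Returns (compatible, fatal, warnings). 'fatal' lists the strict
    attributes that differ; 'warnings' describes each ignorable
    difference so that it can be reported to the user.
    """
    fatal = [a for a in STRICT if backup.get(a) != cluster.get(a)]
    warnings = [
        f"column {backup.get('name')}: {a} differs "
        f"(backup={backup.get(a)!r}, cluster={cluster.get(a)!r}); ignoring"
        for a in IGNORABLE
        if backup.get(a) != cluster.get(a)
    ]
    return not fatal, fatal, warnings


# Example: the same column, held in memory in the backup and on disk in the cluster.
in_backup = {"name": "c1", "type": "Int", "length": 4, "nullable": False,
             "dynamic": False, "storage": "MEMORY", "default": None}
in_cluster = dict(in_backup, storage="DISK")
ok, fatal, warnings = compare_columns(in_backup, in_cluster)
for w in warnings:
    print(w)
```

Comparing attribute by attribute, instead of relying on a single equality test, is what makes it possible both to tolerate the harmless differences and to tell the user exactly which ones were tolerated.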

How to repeat:
1) Create a table with all columns in memory
2) Add some data
3) Perform backup
4) Truncate table
5) Modify the table so that some columns are stored on disk
6) Restore the backup (data only) using --restore-data
7) Observe errors in ndb_restore due to some columns now being stored on disk.

Suggested fix:
Make ndb_restore detect differences specifically, report them, and ignore those that do not matter.
[11 Jun 2010 9:38] Frazer Clement
Related : 

Bug#54242 ndb native default values break ndb_restore
Bug#54279 failing compatibility checks in ndb_restore attribute promotion
Bug#54178 SHOW CREATE TABLE does not show column format FIXED/DYNAMIC if set implicitly
Bug#53810 ndb_restore need same conversion between data types like replication
[11 Jun 2010 10:29] Frazer Clement
Changing affected version to 6.3
[11 Jun 2010 10:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/110805

3214 Frazer Clement	2010-06-11
      Bug#54423 Ndb : ndb_restore is too strict about schema matches
      
      ndb_restore now produces more information about mismatches, and accepts :
       - Different 'Dynamic' setting
       - Different storage type (memory/disk)
       - Different default values
       - Different 'distribution key' setting
[18 Jun 2010 9:06] Frazer Clement
Fix pushed to 6.3.35, 7.0.16 and 7.1.5.
[7 Jul 2010 14:22] Jon Stephens
Documented feature change in the NDB-6.3.35, 7.0.16, and 7.1.5 changelogs, as follows:

        Restrictions on some types of mismatches in column definitions
        when restoring data using ndb_restore have been relaxed. These
        include the following types of mismatches:

            * Different COLUMN_FORMAT settings (FIXED, DYNAMIC, 
            DEFAULT)

            * Different STORAGE settings (MEMORY, DISK)

            * Different default values

            * Different distribution key settings

        Now, when one of these types of mismatches in column definitions
        is encountered, ndb_restore no longer stops with an error;
        instead, it accepts the data and inserts it into the target
        table, but issues a warning to the user.

Closed.