MySQL Bugs: #64302: ndb_restore does not repartition data

Bug #64302	ndb_restore does not repartition data
Submitted:	11 Feb 2012 14:08	Modified:	13 Feb 2012 19:55
Reporter:	Kolbe Kegel
Status:	Open
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S4 (Feature request)
Version:	7.2.2	OS:	Any
Assigned to:	Assigned Account	CPU Architecture:	Any
Tags:	Backup, cluster, ndb_restore, partition

Description:
My understanding of the behavior of ndb_restore has always been that it "inserts" data into the destination cluster in such a way that the data is (re-)partitioned across the nodes & nodegroups that exist in the destination cluster without regard for the configuration of the source cluster. It seems that this is not the case in 7.2.2.

Perhaps this was changed as part of the implementation of online add node?

Is this intended behavior?

A possible workaround is for users to perform ALTER TABLE ... REORGANIZE PARTITION after restoring a table using ndb_restore, but that is cumbersome and should be unnecessary.

How to repeat:
* Set up 2-node cluster A
* Create table, insert data
* Take NDB backup of cluster A
* Create 4-node cluster B
* Restore backup of cluster A into cluster B
* Confirm that your restored table has only 2 partitions
* Create a new table on cluster B
* Confirm that new table has 4 partitions
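
The restore and check steps above could be scripted roughly as follows. This is only a sketch, not the session from the attached scrollback: the node IDs (2 and 3), the backup ID (12), the backup path, and the database and table names are placeholders, and the helper function names are made up for illustration. It assumes the restore was done with ndb_restore -m (metadata from the backup), which is the case discussed later in this report.

```python
# Sketch only: every concrete value below (node IDs, backup ID, path,
# database/table names) is a placeholder, not taken from the real session.

def restore_commands(node_ids, backup_id, backup_path):
    """Build one ndb_restore argv per data node of the source cluster.

    Metadata (-m) is restored once, together with the first node's files;
    data (-r) is restored from every node's backup files.
    """
    commands = []
    for i, node_id in enumerate(node_ids):
        argv = ["ndb_restore", "-n", str(node_id), "-b", str(backup_id)]
        if i == 0:
            argv.append("-m")  # restore table definitions only once
        argv += ["-r", f"--backup_path={backup_path}"]
        commands.append(argv)
    return commands


def describe_command(database, table):
    """ndb_desc argv; -p prints the per-partition information."""
    return ["ndb_desc", "-d", database, table, "-p"]


if __name__ == "__main__":
    # Backup on cluster A is taken beforehand with: ndb_mgm -e "START BACKUP"
    for argv in restore_commands([2, 3], 12, "/path/to/BACKUP-12"):
        print(" ".join(argv))
    print(" ".join(describe_command("test", "t1")))
```

Running the printed ndb_desc command against the restored table and against a newly created table on cluster B is one way to compare their partition counts.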

I'll attach a log of my work to reproduce this problem.

Suggested fix:
ndb_restore should cause data to be partitioned according to the partitioning scheme of the destination cluster.

scrollback of session to reproduce bug 64302

Attachment: bug_64302_kolbe_scrollback.txt (text/plain), 17.17 KiB.

Current workaround is to ALTER TABLE `tbl_name` REORGANIZE PARTITION; for each table after it has been re-loaded.
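
For a restore with many tables, the per-table statements could be generated rather than typed by hand. A minimal sketch, assuming the list of restored table names is already known; the function name and the example table names are hypothetical:

```python
def reorganize_statements(table_names):
    """Return one ALTER TABLE ... REORGANIZE PARTITION statement per table."""
    statements = []
    for name in table_names:
        # Quote the identifier with backticks, doubling any embedded backtick.
        quoted = "`" + name.replace("`", "``") + "`"
        statements.append(f"ALTER TABLE {quoted} REORGANIZE PARTITION;")
    return statements


if __name__ == "__main__":
    for stmt in reorganize_statements(["t1", "t2"]):
        print(stmt)
```

The output can be fed to the mysql client connected to an SQL node of the destination cluster.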

Reproduced in 7.1.7; the bug likely appears in 7.0 as well.

Hi guys,

This is not a bug. The behaviour has always been like this.

The missing piece is that *schema* should be restored using mysqldump.

If you restore the schema using the .sql file from mysqldump
and the data with ndb_restore, it will repartition.

There are many things that make us recommend using mysqldump for schema,
e.g. triggers.

Closing as not a bug.
/Jonas
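
The split described in the comment above (schema from mysqldump, data from ndb_restore) might look roughly like this. It is a sketch under assumptions: the database name, node IDs, backup ID and path are placeholders, and the helper names are invented for illustration. The important difference from a plain restore is that ndb_restore is run without -m, so the tables are created on the destination cluster from the .sql file first.

```python
def schema_dump_command(database):
    """mysqldump argv that writes definitions only, no rows."""
    # --no-data skips row data; --routines adds stored procedures/functions.
    # Triggers are dumped by default.
    return ["mysqldump", "--no-data", "--routines", "--databases", database]


def data_restore_command(node_id, backup_id, backup_path):
    """ndb_restore argv that restores rows only (-r), without -m."""
    return [
        "ndb_restore",
        "-n", str(node_id),
        "-b", str(backup_id),
        "-r",
        f"--backup_path={backup_path}",
    ]


if __name__ == "__main__":
    # 1. On cluster A: dump the schema to a file.
    print(" ".join(schema_dump_command("test")), "> schema.sql")
    # 2. On cluster B: load schema.sql with the mysql client, then restore data
    #    once per data node of the source cluster.
    for node_id in (2, 3):
        print(" ".join(data_restore_command(node_id, 12, "/path/to/BACKUP-12")))
```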

Jonas, I clearly remember the opposite behaviour previously. My guess is that this behaviour was introduced in 7.0 with the add data node online feature.

Jonas, thanks for responding. I have several concerns, though.

1) ndb_restore has most certainly *not* "always been like this". I just did the same test using 6.2.15 and the behavior was "as expected" (for myself and Matthew Montgomery and others I've talked to, anyway!).

Before backup:
-- Per partition info --
Partition  Row count  Commit count  Frag fixed memory  Frag varsized memory
0          56         56            32768              0
1          44         44            32768              0

After restore:
-- Per partition info --
Partition  Row count  Commit count  Frag fixed memory  Frag varsized memory
0          26         26            32768              0
2          30         30            32768              0
3          20         20            32768              0
1          24         24            32768              0

So, ndb_restore behaved quite differently in 6.2.15 than it does now in 7.2.2 (or in 7.1.7, which Matt tested). Is this change intentional? If so, why? Is it documented somewhere?
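
The comparison above can be checked mechanically. The following sketch parses the two "Per partition info" listings quoted in this comment and reports the partition count and total row count for each; the function name is hypothetical, and the parser only assumes that data lines consist of five integer columns.

```python
def parse_partition_rows(text):
    """Return {partition_id: row_count} from a '-- Per partition info --' block."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Data lines have exactly five integer columns; header lines do not.
        if len(fields) == 5 and all(f.isdigit() for f in fields):
            rows[int(fields[0])] = int(fields[1])
    return rows


BEFORE = """\
-- Per partition info --
Partition Row count Commit count Frag fixed memory Frag varsized memory
0 56 56 32768 0
1 44 44 32768 0
"""

AFTER = """\
-- Per partition info --
Partition Row count Commit count Frag fixed memory Frag varsized memory
0 26 26 32768 0
2 30 30 32768 0
3 20 20 32768 0
1 24 24 32768 0
"""

if __name__ == "__main__":
    for label, text in (("before", BEFORE), ("after", AFTER)):
        rows = parse_partition_rows(text)
        print(label, len(rows), sum(rows.values()))
    # prints:
    # before 2 100
    # after 4 100
```

In the 6.2.15 test, the same 100 rows end up spread over four partitions instead of two, which is the repartitioning behavior being described.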

2) You claim that you "recommend using mysqldump for schema". Even if that is true, I find no mention of this recommendation in the MySQL Reference Manual. Surely if the use of mysqldump for Cluster backups is recommended, that should be discussed in detail somewhere at/under http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-backup.html ?

3) The restore source code clearly seems to offer a provision for repartitioning on restore when the table uses the "default number of partitions", followed by a separate block of code that handles the case when the "Table was defined with specific number of partitions". In my test, I didn't specify a "specific number of partitions", so presumably the "default number of partitions" would have been used, and my table should be repartitioned when it's restored into the new cluster. Perhaps there is some problem with setting/reading getDefaultNoPartitionsFlag?

See http://bazaar.launchpad.net/~mysql/mysql-server/cluster-7.2/view/head:/storage/ndb/tools/r... :

    else if (copy.getDefaultNoPartitionsFlag())
    {
      /*
        Table was defined with default number of partitions. We can restore
        it with whatever is the default in this cluster.
        We use the max_rows parameter in calculating the default number.
      */

Change bug to feature request.
Asking for "ndb_restore -m --redistribute" (or similar)

Previous behaviour was undocumented...

/Jonas

Perhaps it is worth noting that the "original" behavior (re-partition) does seem to be at least implied in current versions of the MySQL Reference Manual.

From http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-programs-ndb-restore.html ...

It is possible to restore a backup to a database with a different configuration than it was created from. For example, suppose that a backup with backup ID 12, created in a cluster with two database nodes having the node IDs 2 and 3, is to be restored to a cluster with four nodes.

Also, from http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-replication-backups.html ...

"It is not necessary that the slave cluster have the same number of ndbd processes (data nodes) as the master"

That page also explicitly uses the "-m" feature of ndb_restore, which you say ought to be avoided for some reasons that simply do not exist in the documentation anywhere I have been able to find.

Also by the principle of least surprise (which includes a strong bias towards backwards compatibility) a --redistribute option should be the default behavior and the current new behavior should only emerge when explicitly requested by --skip-redistribute ...

(And like Kolbe, I'd also like to see where in the manual it actually says "ndb_restore -m bad, mysqldump good". I'm not able to find that anywhere, while the section on ndb_restore strongly implies that the old behavior can be expected, so I don't agree with "Previous behaviour was undocumented..." that much.)

I agree with the "principle of least surprise" reasoning.