Bug #25918 ndb_restore fails when restoring a backup of a disk-data cluster
Submitted: 29 Jan 2007 15:48 Modified: 13 Mar 2009 8:50
Reporter: Rob Kinyon
Status: Patch approved
Category: MySQL Cluster: Disk Data    Severity: S1 (Critical)
Version: mysql-5.1    OS: Linux (Ubuntu 6.06 ETS)
Assigned to: Assigned Account    CPU Architecture: Any
Tags: 5.1.14, 5.1.22
Triage: Triaged: D2 (Serious) / R1 (None/Negligible) / E1 (None/Negligible)

[29 Jan 2007 15:48] Rob Kinyon
Description:
When running ndb_restore as root, the undo*.dat files cannot be overwritten, so ndb_restore fails partway through.

How to repeat:
I took a backup of a disk-data cluster, then destroyed the data in the cluster by taking the cluster down and bringing it back up with "ndbd --initial". I then ran the following command (output included).

root@bed1:/usr/local/mysql/bin# ./ndb_restore -m -b 1 --backup_path /var/lib/mysql-cluster/BACKUP/BACKUP-1/ -n 2
Backup Id = 1
Nodeid = 2
backup path = /var/lib/mysql-cluster/BACKUP/BACKUP-1/
Ndb version in backup files: Version 5.1.14
Connected to ndb!!
Creating logfile group: lg_2...done
Creating tablespace: ts_2...done
Creating undofile "undo_10.dat"...FAILED
Create undofile failed: undo_10.dat: 1509: File system error, check if path,permissions etc
Restore: Failed to restore table: sys/def/8/username$unique ... Exiting

When I manually deleted the undo files and re-ran ndb_restore, it complained about the tablespace already being there. I had to remove the ndb_2_fs directory and restart the node.
[14 May 2007 18:59] Tomas Ulin
Rob,

The permissions of ndb_restore have no impact on this. It is the ndbd processes that create the files.

Can you provide us with information about:
1. the user running "ndbd", and what permissions it has

I'm assuming you are getting this problem with any disk-based table. If not:
2. please provide an appropriate schema to reproduce.

Furthermore, a failing ndb_restore _will_ leave the cluster in a "half applied" state.

Is this the bug that you want to file?

Or is it that there are some unwanted permission requirements for the ndbd's to work?

I.e., I am unsure what actual bug you are filing.

So please provide:
3. the observed behavior
4. what you see as the expected behavior

BR,

Tomas
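
A minimal sketch of how the information for item 1 could be gathered. It assumes the data node's files live in DATADIR/ndb_NODEID_fs; the function name check_ndbd_dir is made up for illustration and is not part of MySQL Cluster. Run it as the same user that starts ndbd.

```shell
#!/bin/sh
# Hypothetical helper (not a MySQL Cluster tool): report who owns a data
# node's file system directory and whether the current user can create
# files in it. Assumes the layout DATADIR/ndb_NODEID_fs.
# Usage: check_ndbd_dir DATADIR NODEID
check_ndbd_dir() {
    fs_dir="$1/ndb_${2}_fs"
    echo "current user: $(id -un)"
    if [ ! -d "$fs_dir" ]; then
        echo "missing: $fs_dir"
        return 1
    fi
    # Owner, group and mode of the directory itself.
    ls -ld "$fs_dir"
    # Creating a file needs both write and search permission on the directory.
    if [ -w "$fs_dir" ] && [ -x "$fs_dir" ]; then
        echo "writable: $fs_dir"
    else
        echo "NOT writable: $fs_dir"
        return 1
    fi
}
```

For the cluster in the original report this would be called as `check_ndbd_dir /var/lib/mysql-cluster 2`, assuming that path is the DataDir of node 2.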
[14 Jun 2007 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[13 Feb 2008 9:59] Tobias Asplund
I got hit by this too, here's a reproducible test case.

$ cat config.ini 
===============================================
[NDB_MGMD]
NodeId = 1
HostName = 127.0.0.1
DataDir  = /usr/local/ndb
LogDestination = FILE:filename=my-cluster.log

[NDBD DEFAULT]
HostName = 127.0.0.1
DataDir  = /usr/local/ndb
NoOfReplicas = 2
DataMemory = 20M
IndexMemory = 10M

[NDBD]
NodeId = 5

[NDBD]
NodeId = 6

[MYSQLD]
NodeId = 9
[MYSQLD]
NodeId = 10
[MYSQLD]
NodeId = 11
[MYSQLD]
===============================================

$ ndb_mgmd -f config.ini 
$ ndbd --initial
$ ndbd --initial

$ ndb_mgm -e show
Connected to Management Server at: 127.0.0.1:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=5    @127.0.0.1  (Version: 5.1.22, Nodegroup: 0, Master)
id=6    @127.0.0.1  (Version: 5.1.22, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @127.0.0.1  (Version: 5.1.22)

[mysqld(API)]   4 node(s)
id=9    @127.0.0.1  (Version: 5.1.22)
id=10   @127.0.0.1  (Version: 5.1.22)
id=11   @127.0.0.1  (Version: 5.1.22)
id=12 (not connected, accepting connect from any host)

mysql> CREATE DATABASE IF NOT EXISTS Test;
Query OK, 1 row affected (0.00 sec)

mysql> USE Test;
Database changed

mysql> CREATE LOGFILE GROUP lg ADD UNDOFILE 'undo' ENGINE ndb;
Query OK, 0 rows affected (10.26 sec)

mysql> CREATE TABLESPACE ts ADD DATAFILE 'data' USE LOGFILE GROUP lg ENGINE ndb;
Query OK, 0 rows affected (9.06 sec)

mysql> CREATE TABLE t1 ( id INT PRIMARY KEY, a INT, b CHAR(5) ) TABLESPACE ts STORAGE DISK ENGINE=NDB;
Query OK, 0 rows affected (1.28 sec)

mysql> INSERT INTO t1 VALUES (1, 1, 'abc');
Query OK, 1 row affected (0.08 sec)

$ ndb_mgm -e "start backup"
Connected to Management Server at: 127.0.0.1:1186
Waiting for completed, this may take several minutes
Node 5: Backup 1 started from node 1
Node 5: Backup 1 started from node 1 completed
 StartGCP: 192 StopGCP: 195
 #Records: 2056 #LogRecords: 0
 Data: 34612 bytes Log: 0 bytes

$ ndb_mgm -e shutdown
Connected to Management Server at: 127.0.0.1:1186
2 NDB Cluster node(s) have shutdown.
Disconnecting to allow management server to shutdown.

# Modify the config.ini to add two more ndbd nodes (one additional node group):

+[NDBD]
+NodeId =7
+[NDBD]
+NodeId =8

$ ndb_mgmd -f config.ini 
$ ndbd --initial
$ ndbd --initial
$ ndbd --initial
$ ndbd --initial

$ ndb_mgm -e show
Connected to Management Server at: 127.0.0.1:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=5    @127.0.0.1  (Version: 5.1.22, Nodegroup: 0, Master)
id=6    @127.0.0.1  (Version: 5.1.22, Nodegroup: 0)
id=7    @127.0.0.1  (Version: 5.1.22, Nodegroup: 1)
id=8    @127.0.0.1  (Version: 5.1.22, Nodegroup: 1)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @127.0.0.1  (Version: 5.1.22)

[mysqld(API)]   4 node(s)
id=9    @127.0.0.1  (Version: 5.1.22)
id=10   @127.0.0.1  (Version: 5.1.22)
id=11   @127.0.0.1  (Version: 5.1.22)
id=12 (not connected, accepting connect from any host)

..BACKUP/BACKUP-1$ ndb_restore -b 1 -r -n 5 -m
Backup Id = 1
Nodeid = 5
backup path = ./
Ndb version in backup files: Version 5.1.22
Connected to ndb!!
Creating logfile group: lg...done
Creating tablespace: ts...done
Creating undofile "undo"...FAILED
Create undofile failed: undo: 1509: File system error, check if path,permissions etc
Restore: Failed to restore table: sys/def/9/PRIMARY ... Exiting 

NDBT_ProgramExit: 1 - Failed
[13 Feb 2008 10:00] Tobias Asplund
My tests were on Mac OS 10.5 with 5.1.22
[22 Mar 2008 10:18] Sveta Smirnova
Thank you for the report.

Verified as Tobias Asplund described.
[16 Apr 2008 7:43] li zhou
The cause is that the disk data files were not removed before the cluster was restarted.
The restore works if the undo files and data files in the ndb_N_fs directory are removed first.

Should the restore overwrite the old disk data in any case?
Or should we give a warning about it?
Or should we document this (remove old disk data before restoring) in the manual?
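
The manual workaround above could be scripted roughly as follows. This is only a sketch: it assumes the undo and data files sit directly in DATADIR/ndb_NODEID_fs, as described above, and that the data nodes are stopped while it runs. The function name remove_disk_data_files is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not a MySQL Cluster tool): remove the named disk
# data files left behind in one data node's file system directory.
# Assumes the layout DATADIR/ndb_NODEID_fs/FILE.
# Usage: remove_disk_data_files DATADIR NODEID FILE...
remove_disk_data_files() {
    fs_dir="$1/ndb_${2}_fs"
    shift 2
    for name in "$@"; do
        path="$fs_dir/$name"
        if [ -f "$path" ]; then
            rm -f -- "$path" && echo "removed $path"
        else
            echo "not found: $path"
        fi
    done
}

# Example for the test case in this report (file names 'undo' and 'data',
# DataDir /usr/local/ndb, data nodes 5 and 6):
# remove_disk_data_files /usr/local/ndb 5 undo data
# remove_disk_data_files /usr/local/ndb 6 undo data
```

Only the files named on the command line are touched, so the rest of the node's file system directory is left alone.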
[16 Apr 2008 7:52] Jonas Oreland
how about adding a flag "--overwrite-files"
(with unknown default)
[16 Apr 2008 12:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/45491

ChangeSet@1.2574, 2008-04-16 19:57:46+00:00, lzhou@dev3-63.(none) +5 -0
  BUG#25918 add new flag '-o' to over write disk files
[21 Apr 2008 2:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/45737

ChangeSet@1.2574, 2008-04-21 10:23:05+00:00, lzhou@dev3-63.(none) +5 -0
  BUG#25918 add new flag '-o' to over write disk files
[13 Mar 2009 8:50] Jonas Oreland
test cases probably need cleanup now...
[13 Mar 2009 8:50] Jonas Oreland
6.2 is a reasonable target