Bug #63822 Failed to create UNDOFILE
Submitted: 21 Dec 2011 11:59 Modified: 9 Jan 2012 10:38
Reporter: Lucas Brandstaetter
Status: Not a Bug
Category: MySQL Cluster: Disk Data Severity: S1 (Critical)
Version: ndb-7.1.17 OS: Linux (RedHat EL6)
Assigned to: CPU Architecture: Any

[21 Dec 2011 11:59] Lucas Brandstaetter
Description:
When trying to create an UNDOFILE in our cluster environment with the following command:

CREATE LOGFILE GROUP lg_1
ADD UNDOFILE 'undo_1.dat'
INITIAL_SIZE 16M
UNDO_BUFFER_SIZE 2M
ENGINE NDB;

We receive the following error:
ERROR 1528 (HY000): Failed to create UNDOFILE

show warnings:
*************************** 1. row ***************************
  Level: Error
   Code: 1296
Message: Got error 1509 'File system error, check if path,permissions etc' from NDB
*************************** 2. row ***************************
  Level: Error
   Code: 1528
Message: Failed to create UNDOFILE
2 rows in set (0.00 sec)

Running ls -la in the ndb_#_fs directory shows that the file is present:
[root@hostname /mnt/data/ndb_10_fs]# ls -la
total 16420
drwxr-x---. 9 root root     4096 Dec 21 12:45 .
drwxr-xr-x. 5 root root     4096 Dec 21 12:04 ..
drwxr-x---. 4 root root     4096 Dec 21 12:05 D1
drwxr-x---. 3 root root     4096 Dec 21 12:04 D10
drwxr-x---. 3 root root     4096 Dec 21 12:04 D11
drwxr-x---. 4 root root     4096 Dec 21 12:05 D2
drwxr-x---. 3 root root     4096 Dec 21 12:04 D8
drwxr-x---. 3 root root     4096 Dec 21 12:04 D9
drwxr-x---. 4 root root     4096 Dec 21 12:24 LCP
-rw-r--r--. 1 root root 16777216 Dec 21 12:45 undo_1.dat

The behavior doesn't change after running chmod 777 on the ndb_#_fs directory.

Here is the config.ini for our cluster:

[NDBD DEFAULT]

DataDir=/mnt/log/								# Directory for this data node's log files
FileSystemPath=/mnt/data							# Directory for this data node's data files
BackupDataDir=/mnt/backup/							# Directory for this data node's backup files

[NDB_MGMD DEFAULT]
datadir=/var/lib/mysql-cluster		# Directory for MGM node log files

[ndb_mgmd]
NodeId=1
hostname=<IP Address>

[ndb_mgmd]
NodeId=2
hostname=<IP Address>

[ndb_mgmd]
NodeId=3
hostname=<IP Address>

[ndb_mgmd]
NodeId=4
hostname=<IP Address>

[ndb_mgmd]
NodeId=5
hostname=<IP Address>

[mysqld]
NodeId=6
hostname=<IP Address>

[mysqld]
NodeId=7
hostname=<IP Address>

[mysqld]
NodeId=8
hostname=<IP Address>

[mysqld]
NodeId=9
hostname=<IP Address>

[ndbd]
NodeId=10
hostname=<IP Address>

[ndbd]
NodeId=11
hostname=<IP Address>

[ndbd]
NodeId=12
hostname=<IP Address>

[ndbd]
NodeId=13
hostname=<IP Address>
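
As a rough aid for reading a config.ini like the one above, the following Python sketch tallies the NodeId values per section. It is only an illustration: the function name node_ids_by_section and the sample text are hypothetical, and the parsing is a simplified assumption about the file layout (one key=value per line, "#" starting a comment), not the parser that ndb_mgmd itself uses.

```python
from collections import defaultdict


def node_ids_by_section(config_text):
    """Group NodeId values by section name, allowing repeated sections."""
    nodes = defaultdict(list)
    section = None
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
        elif "=" in line and section is not None:
            key, value = (part.strip() for part in line.split("=", 1))
            if key.lower() == "nodeid":
                nodes[section].append(int(value))
    return dict(nodes)


if __name__ == "__main__":
    # Hypothetical, shortened sample in the same layout as the config above.
    sample = """
[NDBD DEFAULT]
FileSystemPath=/mnt/data   # data files

[ndb_mgmd]
NodeId=1

[ndb_mgmd]
NodeId=2

[ndbd]
NodeId=10
"""
    print(node_ids_by_section(sample))
```

Applied to the full config.ini above, this would list NodeIds 1 to 5 under ndb_mgmd, 6 to 9 under mysqld and 10 to 13 under ndbd.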

SELinux is disabled on all servers.
We've tried reducing the number of mgm and ndbd nodes, but it didn't change the outcome.
We've also tried using a different directory or no defined directory at all; neither changed the behavior.
The filesystem on all nodes is ext4.

How to repeat:
This is a fresh cluster setup with a bare minimum configuration.

Starting the cluster with the following commands:
ndb_mgmd -f /var/lib/mysql-cluster/config.ini --nowait-nodes 3,4,5 --config-cache --initial
ndbmtd --nostart
ndb_mgm> ALL START
/etc/init.d/mysql start

Then issue the following statements on any mysqld node in the cluster:
create database ndb;
use ndb;

CREATE LOGFILE GROUP lg_1
ADD UNDOFILE 'undo_1.dat'
INITIAL_SIZE 16M
UNDO_BUFFER_SIZE 2M
ENGINE NDB;

[22 Dec 2011 10:00] Lucas Brandstaetter
We've debugged the disk access with the strace command and found that the storage nodes get a Permission denied (EACCES) error when reopening the newly created file:

strace -e trace=file -p 27155 -f 2>&1 | grep undo
[pid 27176] open("/mnt/data/ndb_10_fs/undo_1.dat", O_RDWR) = -1 ENOENT (No such file or directory)
[pid 27176] open("/mnt/data/ndb_10_fs/undo_1.dat", O_RDWR|O_CREAT|O_DIRECT, 0666) = 31
[pid 27176] open("/mnt/data/ndb_10_fs/undo_1.dat", O_RDWR|O_SYNC|O_DIRECT) = -1 EACCES (Permission denied)
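
The three open() calls in the strace output can be replayed outside NDB as a rough check of whether the EACCES comes from the environment rather than from the data node. The Python sketch below is only an illustration under stated assumptions: the function name replay_open_sequence and the file name undo_probe.dat are hypothetical, each descriptor is closed before the next call (the trace does not show whether ndbd does this), and O_DIRECT is taken from the os module where the platform provides it.

```python
import errno
import os
import tempfile


def replay_open_sequence(directory, name="undo_probe.dat"):
    """Replay the three open() calls from the trace; return (flags, outcome) pairs."""
    path = os.path.join(directory, name)
    o_direct = getattr(os, "O_DIRECT", 0)  # Linux-only flag; 0 where unavailable
    steps = [
        ("O_RDWR", os.O_RDWR, None),
        ("O_RDWR|O_CREAT|O_DIRECT", os.O_RDWR | os.O_CREAT | o_direct, 0o666),
        ("O_RDWR|O_SYNC|O_DIRECT", os.O_RDWR | os.O_SYNC | o_direct, None),
    ]
    results = []
    try:
        for label, flags, mode in steps:
            try:
                if mode is None:
                    fd = os.open(path, flags)
                else:
                    fd = os.open(path, flags, mode)
            except OSError as exc:
                # Record the symbolic errno name, e.g. ENOENT or EACCES.
                results.append((label, errno.errorcode.get(exc.errno, str(exc.errno))))
            else:
                os.close(fd)  # assumption: close before the next open
                results.append((label, "ok"))
    finally:
        # Remove the probe file so the directory is left as it was found.
        if os.path.exists(path):
            os.remove(path)
    return results


if __name__ == "__main__":
    # A scratch directory is used here; point it at the directory under
    # suspicion (and run as the same user as the data node) for a real check.
    with tempfile.TemporaryDirectory() as scratch:
        for label, outcome in replay_open_sequence(scratch):
            print(f"{label}: {outcome}")
```

On a fresh directory the first call is expected to fail with ENOENT, as in the trace. The outcome of the other two calls depends on the filesystem and on anything else that intercepts file access on the host.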

[9 Jan 2012 10:38] Lucas Brandstaetter
The problem was caused by a virus scanner and has been resolved by a system administrator.