Bug #95023 NDB Node won't start with tablespace (Windows)
Submitted: 15 Apr 2019 12:02 Modified: 6 May 2019 11:31
Reporter: Vincent Guillemette Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S2 (Serious)
Version:8.0.14-dmr/8.0.16-dmr OS:Windows (Server 2016)
Assigned to: CPU Architecture:Other (x64)
Tags: ndb, Tablespace, windows

[15 Apr 2019 12:02] Vincent Guillemette
Description:
Hello, 

As title suggests, after I created a tablespace and associated it to a database table, my ndbmtd processes no longer start. 

I followed this piece of documentation to set up the disk data table : https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster-disk-data-objects.html. 

When trying to (re)start any ndb process, it exits, and the following informations are logged : 

---------
ndb_mgm : 
---------
2019-04-15 13:29:48 [MgmtSrvr] INFO     -- Node 25: Node 149 Connected
2019-04-15 13:29:48 [MgmtSrvr] INFO     -- Node 25: Buffering maximum epochs 100
2019-04-15 13:29:48 [MgmtSrvr] INFO     -- Node 25: Start phase 0 completed (system restart)
2019-04-15 13:29:48 [MgmtSrvr] INFO     -- Node 25: Waiting 30 sec for nodes 1 to connect, nodes [ all: 1 and 25 connected: 25 no-wait:  ]
2019-04-15 13:29:48 [MgmtSrvr] INFO     -- Node 25: Communication to Node 1 opened
2019-04-15 13:29:50 [MgmtSrvr] INFO     -- Node 25: Node 1 Connected
2019-04-15 13:29:50 [MgmtSrvr] INFO     -- Node 25: CM_REGCONF president = 1, own Node = 25, our dynamic id = 0/36
2019-04-15 13:29:50 [MgmtSrvr] INFO     -- Node 25: Node 1: API mysql-8.0.14 ndb-8.0.14
2019-04-15 13:29:50 [MgmtSrvr] INFO     -- Node 25: Start phase 1 completed (system restart)
2019-04-15 13:29:50 [MgmtSrvr] INFO     -- Node 25: Start phase 2 completed (node restart)
2019-04-15 13:29:50 [MgmtSrvr] INFO     -- Node 25: Start phase 3 completed (node restart)
2019-04-15 13:29:50 [MgmtSrvr] INFO     -- Node 25: Start phase 4 completed (node restart)
2019-04-15 13:29:50 [MgmtSrvr] INFO     -- Node 25: Starting to restore schema
2019-04-15 13:29:51 [MgmtSrvr] ALERT    -- Node 25: Forced node shutdown completed. Occured during startphase 5. Caused by error 2812: 'Invalid parameter for file(Configuration error). Permanent error, external action needed'.

--------
ndbmtd : 
--------
Time: Monday 15 April 2019 - 13:29:50
Status: Permanent error, external action needed
Message: Invalid parameter for file (Configuration error)
Error: 2812
Error data: TSMAN: File system close failed. OS errno: 6
Error object: TSMAN (Line: 3914) 0x00000002
Program: ndbmtd.exe
Pid: 4568 thr: 1
Version: mysql-8.0.14 ndb-8.0.14-dmr
Trace file name: ndb_25_trace.log.2_t1
Trace file path: C:/MySQL/Data/cluster-data\ndb_25_trace.log.2 [t1..t4]
***EOM***

------------
config.ini : 
------------
[ndbd default]
NoOfReplicas=2
DataDir=C:/MySQL/Data/cluster-data
DataMemory=1G
ServerPort=50501

[ndb_mgmd default]
ArbitrationRank=0
DataDir=C:/MySQL/Mgmt/cluster-logs

[mysqld default]
ArbitrationRank=0

[ndbd]
NodeId=1
HostName=datab1-vip.xxx

[ndbd]
NodeId=25
HostName=datap1-vip.xxx

[ndb_mgmd]
ArbitrationRank=2
NodeId=101
HostName=mgmtb1-vip.xxx

[ndb_mgmd]
NodeId=125
HostName=mgmtp1-vip.xxx

[ndb_mgmd]
ArbitrationRank=1
NodeId=149
HostName=mgmto1.xxx

[mysqld]
NodeId=201
HostName=datab1-vip.xxx

[mysqld]
NodeId=225
HostName=datap1-vip.xxx

Then, the only way to get the node started is to start from scratch (delete cluster-data contents), which takes a huge amount of time... 

Let me know should you need any more information. 

Regards, 
Vincent. 

How to repeat:
Once a tablespace has been created and associated to a table, try and restart any ndb node.
[15 Apr 2019 12:05] Vincent Guillemette
ndb_trace.log

Attachment: ndb_25_trace.log.2 (application/octet-stream, text), 634.70 KiB.

[15 Apr 2019 12:05] Vincent Guillemette
ndb_trace.log

Attachment: ndb_25_trace.log.2_t1 (application/octet-stream, text), 58.44 KiB.

[15 Apr 2019 12:05] Vincent Guillemette
ndb_trace.log

Attachment: ndb_25_trace.log.2_t2 (application/octet-stream, text), 952.61 KiB.

[15 Apr 2019 12:06] Vincent Guillemette
ndb_trace.log

Attachment: ndb_25_trace.log.2_t3 (application/octet-stream, text), 9.79 KiB.

[16 Apr 2019 22:23] MySQL Verification Team
Hi,

This looks like some windows glitch. Are you sure there's no antivirus running and messing up with disk data?

all best
Bogdan
[17 Apr 2019 6:00] Vincent Guillemette
Hello, 

No antivirus software has been installed on this server. 

I tried disabling Windows Defender's realtime protection, but the problem persists.

Regards, 
Vincent.
[17 Apr 2019 13:14] MySQL Verification Team
Hi,
Weird, I'm not reproducing this. Have you tried starting a node initial?

all best
Bogdan
[17 Apr 2019 13:18] Vincent Guillemette
Hi, 

As stated in my initial post, the node will start with the "initial" flag. 

However, it is not a viable solution as it needs to fetch the whole data set from the other group's node. 

Regards, 
Vincent.
[17 Apr 2019 13:39] MySQL Verification Team
Hi,

> As stated in my initial post, the node will start with the "initial" flag. 

I apologize, I understood that this:

> Then, the only way to get the node started is to start from scratch (delete cluster-data contents), which takes a huge amount of time... 

meant that you are deleting all cluster data, my bad, apologies.

So with --initial you get the node up. 

Let me get this straight, as I'm having issues reproducing this,

- you get cluster up, set it up with data
- you create log file group
- you create tablespace
- you then create new tables with disk data or you alter existing table?

at this point everything works
next time you restart data node it fails to start so you have to start it --initial in order for it to boot up ?

thanks
Bogdan
[17 Apr 2019 13:53] Vincent Guillemette
Hi, 

I'm sorry, you were right. 
The node starts when clearing cluster-data directory. 

You also got the events order right : the table is a newly created one. 

At this point : 
- When I restart data node without any flag, it fails to start. 
- When I restart data node with --initial flag, it fails to start with the following error : 
  Couldn't start as daemon, error: 'Failed to open logfile 'D:/MySQL/Data/cluster-data\ndb_1_out.log' for write, errno: 13'

Regards, 
Vincent.
[17 Apr 2019 14:00] Vincent Guillemette
Hi, 

Nevermind, I started the daemon with the wrong account on my last attempt. 

Starting with the --initial flag produces the same result as clearing the cluster-data directory. 
The data node starts, but fetch the whole data set from the other data node in the group. 

Regards, 
Vincent.
[17 Apr 2019 14:08] MySQL Verification Team
Hi,

Once you start it with initial, will it restart properly next time (no flags) or you need to do initial every time you restart it?

thanks
Bogdan
[17 Apr 2019 14:13] Vincent Guillemette
Hi, 

I need to start the data node with the --initial flag every time. 

Regards, 
Vincent.
[3 May 2019 6:34] Vincent Guillemette
Hi, 

If it's of any help, issue persists in version 8.0.16-dmr. 

Regards, 
Vincent.
[3 May 2019 6:34] Vincent Guillemette
8.0.16-dmr trace logs

Attachment: mysql-bug-8-0-16.zip (application/x-zip-compressed, text), 79.91 KiB.

[6 May 2019 11:31] MySQL Verification Team
Verified on windows only

thanks for submitting the bug

all best
Bogdan
[31 Oct 2019 13:48] Mauritz Sundell
Posted by developer:
 
Should be same bug as fixed  Bug #30484272 	USING DISK DATA ON WINDOWS FAILS NODE RESTART: TSMAN: FILE SYSTEM CLOSE FAILED.