Bug #115744 UseSHM=true fails NDB Cluster
Submitted: 1 Aug 2024 12:26	Modified: 11 Aug 2024 2:31
Reporter: Maaz Khaleeq	Status: Verified
Category: MySQL Cluster: Cluster (NDB) storage engine	Severity: S2 (Serious)
Version: 8.0.39	OS: Ubuntu 24.04
Assigned to:	CPU Architecture: x86

[1 Aug 2024 12:26] Maaz Khaleeq
Description:
Setup: one management node on the first EC2 instance and one data node + SQL node on a second EC2 instance. The problem is also reproducible with more data nodes, but we are keeping the setup simple to explain. It occurs only when UseShm is set to true in config.ini; in all other cases the cluster works correctly.

config.ini
[ndbd default]
# Options affecting ndbd processes on all data nodes:
# https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster-params-ndbd.html
#DataMemory=384G
UseShm=true 
NoOfReplicas=1
#LockPagesInMainMemory=1
AutomaticThreadConfig=1
#NumCPUs=32
[ndb_mgmd]
# Management process options:
hostname=10.90.252.99 # Hostname of the manager
datadir=/var/lib/mysql-cluster  # Directory for the log files
[ndbd]
hostname=10.90.252.122 # Hostname/IP of the first data node
NodeId=2                        # Node ID for this data node
datadir=/usr/local/mysql/data   # Remote directory for the data files
[mysqld]
# SQL node options:
hostname=10.90.252.122
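
Note that UseShm=true only creates shared-memory transporters between nodes that share a host, which here is the data node + SQL node pair on 10.90.252.122. As a hedged sketch (not from our setup), an explicit transporter section between those two nodes would look roughly like this, using parameter names from the NDB shared-memory connection documentation with illustrative values:

```ini
# Hypothetical explicit SHM transporter between data node 2 and API node 3;
# with UseShm=true, NDB sets up an equivalent transporter automatically
# for node pairs on the same host.
[shm]
NodeId1=2        # data node on 10.90.252.122
NodeId2=3        # SQL/API node on the same host
ShmSize=4M       # illustrative segment size, not a recommendation
```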

ubuntu@ip-10-90-252-99:/var/lib/mysql-cluster$ ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> SHOW
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]	1 node(s)
id=2 (not connected, accepting connect from 10.90.252.122)
[ndb_mgmd(MGM)]	1 node(s)
id=1	@10.90.252.99  (mysql-8.0.39 ndb-8.0.39)
[mysqld(API)]	1 node(s)
id=3 (not connected, accepting connect from 10.90.252.122)

We observe the following in the logs on the data node (signal 11 indicates the ndbd process received a SIGSEGV):
2024-08-01 07:11:53 [MgmtSrvr] ALERT    -- Node 2: Forced node shutdown completed. Initiated by signal 11. Caused by error 6000: 'Error OS signal received(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

Detailed logs: https://docs.google.com/document/d/1HrNFt8ElbrzjqCBw5kzzKMHUTTXzUFMGKGuJ-gA2MWI/edit

How to repeat:
Set up the management node on one machine and the data node + SQL node on a second machine with the config.ini below; this is reproducible for us every time.

[ndbd default]
# Options affecting ndbd processes on all data nodes:
# https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster-params-ndbd.html
#DataMemory=384G
UseShm=true 
NoOfReplicas=1
#LockPagesInMainMemory=1
AutomaticThreadConfig=1
#NumCPUs=32
[ndb_mgmd]
# Management process options:
hostname=10.90.252.99 # Hostname of the manager
datadir=/var/lib/mysql-cluster  # Directory for the log files
[ndbd]
hostname=10.90.252.122 # Hostname/IP of the first data node
NodeId=2                        # Node ID for this data node
datadir=/usr/local/mysql/data   # Remote directory for the data files
[mysqld]
# SQL node options:
hostname=10.90.252.122
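
The setup above can be sketched as follows (a hedged sketch: the binaries are the standard NDB Cluster tools, but our exact start sequence is in the linked doc; the paths and IPs are the ones from this report):

```shell
# Write the config.ini from this report to a scratch directory.
mkdir -p /tmp/ndb-repro
cat > /tmp/ndb-repro/config.ini <<'EOF'
[ndbd default]
UseShm=true
NoOfReplicas=1
AutomaticThreadConfig=1
[ndb_mgmd]
hostname=10.90.252.99
datadir=/var/lib/mysql-cluster
[ndbd]
hostname=10.90.252.122
NodeId=2
datadir=/usr/local/mysql/data
[mysqld]
hostname=10.90.252.122
EOF

# On the management host (10.90.252.99):
#   ndb_mgmd -f /tmp/ndb-repro/config.ini --initial
# On the data-node host (10.90.252.122):
#   ndbd --ndb-connectstring=10.90.252.99
#   mysqld_safe --ndbcluster --ndb-connectstring=10.90.252.99 &
# The ndbd process then crashes with signal 11 as shown in the logs.
```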

The detailed steps that I follow are here: https://docs.google.com/document/d/1HrNFt8ElbrzjqCBw5kzzKMHUTTXzUFMGKGuJ-gA2MWI/edit
[11 Aug 2024 2:31] MySQL Verification Team
Thanks for the report, verified.