Bug #114516 ndbmtd with useSHM failing with Signal 6 received; Aborted
Submitted: 30 Mar 20:23
Modified: 3 Apr 8:36
Reporter: Rafal Hryniow
Status: Verified
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S1 (Critical)
Version: 8.0.31
OS: Any
Assigned to:
CPU Architecture: x86
Tags: sharedmemory, useshm

[30 Mar 20:23] Rafal Hryniow
Description:
Whenever we start ndbmtd with UseShm=1, on different OSes, it aborts with signal 6 and the following trace:

2024-02-23 07:32:23 [ndbd] WARNING  -- TR : line: 242 : connect_server_impl failed
/var/lib/pb2/sb_1-8448725-1663108524.53/release/mysql-cluster-gpl-8.0.31/storage/ndb/src/common/transporter/SHM_Transporter.cpp:362: require((!setupBuffersDone)) failed
For help with below stacktrace consult:
https://dev.mysql.com/doc/refman/en/using-stack-trace.html
Also note that stack_bottom and thread_stack will always show up as zero.
Base address/slide: 0x5631e924a000
With use of addr2line, llvm-symbolizer, or, atos, subtract the addresses in
stacktrace with the base address before passing them to tool.
For tools that have options for slide use that, e.g.:
llvm-symbolizer --adjust-vma=0x5631e924a000 ...
atos -s 0x5631e924a000 ...
2024-02-23 07:32:23 [ndbd] INFO     -- Received signal 6. Running error handler.
stack_bottom = 0 thread_stack 0x0
ndbmtd(my_print_stacktrace(unsigned char const*, unsigned long)+0x41) [0x5631e97836e1]
ndbmtd(ndb_print_stacktrace()+0x56) [0x5631e9744a06]
ndbmtd(handler_error+0xb3) [0x5631e9391d83]
/lib/x86_64-linux-gnu/libc.so.6(+0x43090) [0x7fc1d7970090]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb) [0x7fc1d797000b]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b) [0x7fc1d794f859]
ndbmtd(+0x13fec7) [0x5631e9389ec7]
ndbmtd(SHM_Transporter::connect_server_impl(ndb_socket_t)+0x21a) [0x5631e97573aa]
ndbmtd(Transporter::connect_server(ndb_socket_t, BaseString&)+0x69) [0x5631e9758849]
ndbmtd(TransporterRegistry::connect_server(ndb_socket_t, BaseString&, bool&, bool&)+0x3a8) [0x5631e97506a8]
ndbmtd(TransporterService::newSession(ndb_socket_t)+0x68) [0x5631e9750998]
ndbmtd(SocketServer::doAccept()+0xf0) [0x5631e96f3810]
ndbmtd(SocketServer::doRun()+0x38) [0x5631e96f3ad8]
ndbmtd(socketServerThread_C+0xd) [0x5631e96f3b5d]
ndbmtd(+0x4f4edc) [0x5631e973eedc]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7fc1d81de609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7fc1d7a4c133]
For help with below stacktrace consult:
https://dev.mysql.com/doc/refman/en/using-stack-trace.html
Also note that stack_bottom and thread_stack will always show up as zero.
Base address/slide: 0x5631e924a000
With use of addr2line, llvm-symbolizer, or, atos, subtract the addresses in
stacktrace with the base address before passing them to tool.
For tools that have options for slide use that, e.g.:
llvm-symbolizer --adjust-vma=0x5631e924a000 ...
atos -s 0x5631e924a000 ...
stack_bottom = 0 thread_stack 0x0
ndbmtd(my_print_stacktrace(unsigned char const*, unsigned long)+0x41) [0x5631e97836e1]
ndbmtd(ndb_print_stacktrace()+0x56) [0x5631e9744a06]
ndbmtd(ErrorReporter::handleError(int, char const*, char const*, NdbShutdownType)+0x33) [0x5631e96db5e3]
ndbmtd(handler_error+0x108) [0x5631e9391dd8]
/lib/x86_64-linux-gnu/libc.so.6(+0x43090) [0x7fc1d7970090]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb) [0x7fc1d797000b]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b) [0x7fc1d794f859]
ndbmtd(+0x13fec7) [0x5631e9389ec7]
ndbmtd(SHM_Transporter::connect_server_impl(ndb_socket_t)+0x21a) [0x5631e97573aa]
ndbmtd(Transporter::connect_server(ndb_socket_t, BaseString&)+0x69) [0x5631e9758849]
ndbmtd(TransporterRegistry::connect_server(ndb_socket_t, BaseString&, bool&, bool&)+0x3a8) [0x5631e97506a8]
ndbmtd(TransporterService::newSession(ndb_socket_t)+0x68) [0x5631e9750998]
ndbmtd(SocketServer::doAccept()+0xf0) [0x5631e96f3810]
ndbmtd(SocketServer::doRun()+0x38) [0x5631e96f3ad8]
ndbmtd(socketServerThread_C+0xd) [0x5631e96f3b5d]
ndbmtd(+0x4f4edc) [0x5631e973eedc]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7fc1d81de609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7fc1d7a4c133]
2024-02-23 07:32:23 [ndbd] INFO     -- Signal 6 received; Aborted
2024-02-23 07:32:23 [ndbd] INFO     -- /var/lib/pb2/sb_1-8448725-1663108524.53/release/mysql-cluster-gpl-8.0.31/storage/ndb/src/kernel/ndbd.cpp
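
The trace asks for the base address to be subtracted from each frame address before symbolizing. A minimal sketch of that arithmetic follows, using two frame addresses from the trace above. The helper name frame_offset is hypothetical, and the binary path in the closing comment is a placeholder rather than a path taken from this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: subtract the base address (slide) printed in the log
# from an absolute frame address and print the offset in hex.
frame_offset() {
  local base="$1" addr="$2"
  printf '0x%x\n' "$(( addr - base ))"
}

# Base address/slide as reported in the log above.
base=0x5631e924a000

# Two frames from the trace: the unnamed frame just below abort() and
# SHM_Transporter::connect_server_impl.
for addr in 0x5631e9389ec7 0x5631e97573aa; do
  frame_offset "$base" "$addr"
done

# The offsets can then be passed to a symbolizer, for example:
#   addr2line -C -f -e /path/to/ndbmtd 0x13fec7
# (/path/to/ndbmtd is a placeholder for the installed binary.)
```

The first offset, 0x13fec7, matches the ndbmtd(+0x13fec7) frame in the trace, which is a quick check that the base address is being applied correctly.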

How to repeat:
Install the MySQL NDB Cluster software
(The following script was tested on an Ubuntu 22.04 EC2 instance)
sudo apt-get update
mkdir installer && cd installer
wget "https://dev.mysql.com/get/Downloads/MySQL-Cluster-8.0/mysql-cluster_8.0.36-1ubuntu22.04_am..."
tar -xf *.tar
sudo dpkg -i *.deb
# Fix missing dependencies
sudo apt-get -f install

# Do not auto start default mysqld
sudo systemctl disable mysql

# Remove apparmor
sudo systemctl stop apparmor
sudo systemctl disable apparmor
sudo apt-get remove --purge apparmor apparmor-utils
# Reboot is required to disable apparmor completely
sudo reboot

Prepare the minimal NDB Cluster config.ini
The following minimal example config.ini assumes:
1 management node with ID 1
2 data nodes with IDs 11 and 12
2 SQL nodes with IDs 51 and 52 (the file below also defines [mysqld] slots 53 to 58)
All configuration and data will be stored on the /ssd NVMe disk (to be mounted later)
The EC2 instance has more than 16 GB of RAM; each data node allocates 8 GB, so startup will fail with less than 16 GB of RAM.
[ndbd default]
NoOfReplicas=2
UseShm=1
DataMemory=8G
AutomaticThreadConfig=1
NumCPUs=2

[ndb_mgmd]
NodeId=1
HostName=localhost
DataDir=/ssd/ndb_mgmd.1

[ndbd]
NodeId=11
HostName=localhost
DataDir=/ssd/ndbd.11

[ndbd]
NodeId=12
HostName=localhost
DataDir=/ssd/ndbd.12

[mysqld]
NodeId=51
HostName=localhost

[mysqld]
NodeId=52
HostName=localhost

[mysqld]
NodeId=53
HostName=localhost

[mysqld]
NodeId=54 
HostName=localhost

[mysqld]
NodeId=55
HostName=localhost

[mysqld]
NodeId=56
HostName=localhost

[mysqld]
NodeId=57
HostName=localhost

[mysqld]
NodeId=58
HostName=localhost
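
Since both data nodes run on the same host with DataMemory=8G each, the RAM assumption stated before the config can be checked up front. The sketch below is only a lower-bound check under that assumption: the helper name has_enough_ram is hypothetical, and real data nodes need more memory than DataMemory alone.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if total_kib covers nodes * per_node_gib.
# This is a necessary condition only; it ignores all other NDB memory use.
has_enough_ram() {
  local total_kib="$1" nodes="$2" per_node_gib="$3"
  local need_kib=$(( nodes * per_node_gib * 1024 * 1024 ))
  [ "$total_kib" -ge "$need_kib" ]
}

# On Linux, MemTotal in /proc/meminfo is reported in kB.
if [ -r /proc/meminfo ]; then
  total_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  if has_enough_ram "$total_kib" 2 8; then
    echo "RAM covers 2 x DataMemory=8G"
  else
    echo "Less than 16 GiB RAM: the data nodes are likely to fail to start"
  fi
fi
```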

Start management node
We start ndb_mgmd as the ubuntu (non-root) user, which works because the directory /ssd/ndb_mgmd.1 is also created and owned by the ubuntu user.
# Kill any existing ndb_mgmd process
pkill -9 ndb_mgmd
ndb_mgmd --configdir="/ssd/ndb_mgmd.1" -f /home/ubuntu/config.ini --reload
Start data nodes
We start ndbmtd as the ubuntu (non-root) user.
pkill -9 ndbmtd
ndbmtd --ndb-nodeid=11
ndbmtd --ndb-nodeid=12
Then run the show command in ndb_mgm; the data nodes should appear as connected.
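
The status check can also be scripted with ndb_mgm -e show. The sketch below counts how many of the two data nodes (IDs 11 and 12) are reported as connected. The helper name is hypothetical, and it assumes that connected nodes are listed as "id=<n> @<address> ..." while unconnected ones are listed as "id=<n> (not connected, ...)".

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "ndb_mgm -e show" output on stdin and print the
# number of data nodes (IDs 11 and 12) that are listed with an @<address>,
# which is assumed here to mean "connected".
count_connected_data_nodes() {
  grep -Ec '^id=(11|12)[[:space:]]+@'
}

# Usage against a running management node (not executed here):
#   ndb_mgm -e show | count_connected_data_nodes
# A result of 2 means both data nodes are connected.
```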
[2 Apr 10:26] MySQL Verification Team
Hi,

Thanks for the report, verified as described.

Do you see the same issue with ndbd? I could not reproduce it with ndbd, only with ndbmtd.

Thank you for using MySQL Cluster
[3 Apr 8:36] Rafal Hryniow
I believe we also had issues with ndbd, but I would need to rerun the test and check the logs to confirm.
[3 Apr 13:57] MySQL Verification Team
If you can test with ndbd too, I'd appreciate it, as I was not able to reproduce the issue using ndbd.

Thanks