Bug #90908 Error: 2341 - Internal error, DBACC (Line: 1445) in ndbmtd
Submitted: 17 May 2018 9:16
Modified: 27 May 2018 9:27
Reporter: Denis Jdanov
Status: Can't repeat
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: ndb-7.5.9
OS: CentOS (7 64 bit)
Assigned to: MySQL Verification Team
CPU Architecture: x86
Tags: ndb, ndb_restore, ndbmtd

[17 May 2018 9:16] Denis Jdanov
Description:
During ndb_restore --rebuild-indexes, all cluster data nodes crashed with the following error:

Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbaccMain.cpp
Error object: DBACC (Line: 1445) 0x00000002 Check len failed
Program: ndbmtd
Pid: 40247 thr: 4
Version: mysql-5.7.21 ndb-7.5.9
Trace file name: ndb_12_trace.log.2_t4
Trace file path: /opt/ndbdata/ndb_12_trace.log.2 [t1..t11]
***EOM***

The ndb_restore output was as follows:
Failed to create foreign key groups_operator_id_group_id_FK parent sdr_v3.groups.PK child sdr_v3.sdr_sim_cards.groups_operator_id_group_id_FK: 4009: Cluster Failure

How to repeat:
Create the required databases on the MySQL node:
DROP DATABASE IF EXISTS `auth`;
CREATE DATABASE IF NOT EXISTS `auth` /*!40100 DEFAULT CHARACTER SET utf8 */;
DROP DATABASE IF EXISTS `cg`;
CREATE DATABASE IF NOT EXISTS `cg` /*!40100 DEFAULT CHARACTER SET utf8 */;
DROP DATABASE IF EXISTS `phlr`;
CREATE DATABASE IF NOT EXISTS `phlr` /*!40100 DEFAULT CHARACTER SET latin1 */;
DROP DATABASE IF EXISTS `policies`;
CREATE DATABASE IF NOT EXISTS `policies` /*!40100 DEFAULT CHARACTER SET latin1 */;
DROP DATABASE IF EXISTS `roaming_peers`;
CREATE DATABASE IF NOT EXISTS `roaming_peers` /*!40100 DEFAULT CHARACTER SET utf8 */;
DROP DATABASE IF EXISTS `sdr_v3`;
CREATE DATABASE IF NOT EXISTS `sdr_v3` /*!40100 DEFAULT CHARACTER SET utf8 */;

Restore from the native NDB backup:
ndb_restore -n 11 -b 101 --restore_meta --backup_path=/opt/BACKUP/BACKUP-101 --disable-indexes --include-databases=auth,cg,phlr,policies,sdr_v3,roaming_peers
ndb_restore -n 11 -b 101 --restore_data --restore-epoch --backup_path=/opt/BACKUP/BACKUP-101 --disable-indexes --include-databases=auth,cg,phlr,policies,sdr_v3,roaming_peers
ndb_restore -n 12 -b 101 --restore_data --restore-epoch --backup_path=/opt/BACKUP/BACKUP-101 --disable-indexes --include-databases=auth,cg,phlr,policies,sdr_v3,roaming_peers

Rebuild indexes:
ndb_restore -n 12 -b 101 --rebuild-indexes --backup_path=/opt/BACKUP/BACKUP-101 --include-databases=auth,cg,phlr,policies,sdr_v3,roaming_peers
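
The sequence above could be wrapped in a small script that stops at the first failing step, which makes it easier to see which ndb_restore call takes the cluster down. This is only a sketch: the backup ID, path, database list, and node IDs are copied from the commands above, while the DRY_RUN switch and the function names are hypothetical additions for illustration. With DRY_RUN=1 (the default here) the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch of the restore sequence from this report, stopping at the first failure.
# BACKUP_ID, BACKUP_PATH, DBS and the node IDs are taken from the report.
# DRY_RUN is a hypothetical switch: 1 prints the commands, 0 executes them.
BACKUP_ID=101
BACKUP_PATH=/opt/BACKUP/BACKUP-101
DBS=auth,cg,phlr,policies,sdr_v3,roaming_peers
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    # Abort the whole script as soon as one step fails.
    "$@" || { echo "FAILED: $*" >&2; exit 1; }
  fi
}

restore() {
  # $1 = data node ID; remaining arguments = step-specific options
  node=$1; shift
  run ndb_restore -n "$node" -b "$BACKUP_ID" "$@" \
      --backup_path="$BACKUP_PATH" --include-databases="$DBS"
}

restore_all() {
  restore 11 --restore_meta --disable-indexes
  restore 11 --restore_data --restore-epoch --disable-indexes
  restore 12 --restore_data --restore-epoch --disable-indexes
  restore 12 --rebuild-indexes
}

restore_all
```

Running it with DRY_RUN=0 executes the four steps in the same order as listed above.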
[17 May 2018 9:28] Denis Jdanov
Traces from the first NDB data node

Attachment: ndb1.error.tar.gz (application/x-gzip, text), 662.56 KiB.

[17 May 2018 9:29] Denis Jdanov
Traces from the second NDB data node

Attachment: ndb2.error.tar.gz (application/x-gzip, text), 682.30 KiB.

[21 May 2018 21:44] MySQL Verification Team
[root@localhost mysql]# bin/ndb_restore -n 11 -b 101 --restore_meta --backup_path=backup --disable-indexes --include-databases=auth,cg,phlr,policies,sdr_v3,roaming_peers
Nodeid = 11
Backup Id = 101
backup path = backup
Including Databases: auth cg phlr policies sdr_v3 roaming_peers
2018-05-21 23:36:21 [restore_metadata] Read meta data file header
Opening file 'backup/BACKUP-101.11.ctl'
File size 119808 bytes
Backup version in files: ndb-6.3.11 ndb version: mysql-5.7.21 ndb-7.5.9
2018-05-21 23:36:21 [restore_metadata] Load content
Stop GCP of Backup: 2802
2018-05-21 23:36:21 [restore_metadata] Get number of Tables
...
Successfully restored table event REPL$roaming_peers/operators_list
2018-05-21 23:36:33 [restore_metadata] Save foreign key info
Save FK 134/73/brokers_operators_FK
...
2018-05-21 23:36:33 [restore_data] Start restoring table data

NDBT_ProgramExit: 0 - OK

[root@localhost mysql]# bin/ndb_restore -n 11 -b 101 --restore_epoch --backup_path=backup --disable-indexes --include-databases=auth,cg,phlr,policies,sdr_v3,roaming_peers
Nodeid = 11
Backup Id = 101
backup path = backup
Including Databases: auth cg phlr policies sdr_v3 roaming_peers
2018-05-21 23:37:06 [restore_metadata] Read meta data file header
Opening file 'backup/BACKUP-101.11.ctl'
File size 119808 bytes
Backup version in files: ndb-6.3.11 ndb version: mysql-5.7.21 ndb-7.5.9
2018-05-21 23:37:06 [restore_metadata] Load content
Stop GCP of Backup: 2802
2018-05-21 23:37:06 [restore_metadata] Get number of Tables
2018-05-21 23:37:06 [restore_metadata] Validate Footer
Connected to ndb!!
2018-05-21 23:37:06 [restore_metadata] Restore objects (tablespaces, ..)
2018-05-21 23:37:06 [restore_metadata] Restoring tables
2018-05-21 23:37:06 [restore_metadata] Save foreign key info
Save FK 134/73/brokers_operators_FK
Save FK 134/90/cg_environments_cg_operators_FK
...
2018-05-21 23:37:06 [restore_data] Start restoring table data
2018-05-21 23:37:06 [restore_epoch] Restoring epoch

NDBT_ProgramExit: 0 - OK

[root@localhost mysql]# bin/ndb_restore -n 12 -b 101 --restore_epoch --backup_path=backup --disable-indexes --include-databases=auth,cg,phlr,policies,sdr_v3,roaming_peers
Nodeid = 12
Backup Id = 101
backup path = backup
Including Databases: auth cg phlr policies sdr_v3 roaming_peers
2018-05-21 23:37:23 [restore_metadata] Read meta data file header
Opening file 'backup/BACKUP-101.12.ctl'
File size 119808 bytes
Backup version in files: ndb-6.3.11 ndb version: mysql-5.7.21 ndb-7.5.9
...
Save FK 10/51/user_authority_authorities_id_fk
2018-05-21 23:37:24 [restore_data] Start restoring table data
2018-05-21 23:37:24 [restore_epoch] Restoring epoch

NDBT_ProgramExit: 0 - OK
[21 May 2018 22:03] MySQL Verification Team
[root@localhost mysql]# bin/ndb_restore -n 12 -b 101 --rebuild-indexes --backup_path=backup --include-databases=auth,cg,phlr,policies,sdr_v3,roaming_peers
Nodeid = 12
Backup Id = 101
backup path = backup
Including Databases: auth cg phlr policies sdr_v3 roaming_peers
2018-05-22 00:01:56 [restore_metadata] Read meta data file header
Opening file 'backup/BACKUP-101.12.ctl'
File size 119808 bytes
Backup version in files: ndb-6.3.11 ndb version: mysql-5.7.21 ndb-7.5.9
2018-05-22 00:01:56 [restore_metadata] Load content
Stop GCP of Backup: 2802
2018-05-22 00:01:56 [restore_metadata] Get number of Tables
2018-05-22 00:01:56 [restore_metadata] Validate Footer
Connected to ndb!!
2018-05-22 00:01:57 [restore_metadata] Restore objects (tablespaces, ..)
2018-05-22 00:01:57 [restore_metadata] Restoring tables
2018-05-22 00:01:57 [restore_metadata] Save foreign key info
Save FK 134/73/brokers_operators_FK
...
auth.authorities.PK child auth.user_authority.user_authority_authorities_id_fk
Create foreign keys done

NDBT_ProgramExit: 0 - OK
[21 May 2018 22:07] MySQL Verification Team
Hi Denis,

I tried this with 7.5.10, with your data, and as you can see it works flawlessly.

Either this was fixed between 7.5.9 and 7.5.10, or there is a problem with your hardware. I tested this with different config variables and could not reproduce the problem with 7.5.10.

Best regards
Bogdan

P.S. This is the simplest config I used:

[root@localhost mysql]# cat /etc/my.cnf
[mysqld]
datadir=/usr/local/mysql/data
ndbcluster
user=mysql
explicit_defaults_for_timestamp
[root@localhost mysql]# cat config.ini
[ndbd default]
NoOfReplicas= 2
DataDir= /usr/local/mysql/clusterdata
DataMemory = 1G
IndexMemory = 356M
MaxNoOfConcurrentOperations=500000
MaxNoOfAttributes=50000
MaxNoOfOrderedIndexes=10000

[ndb_mgmd]
Hostname= localhost
DataDir= /usr/local/mysql/clusterdata
NodeId=10

[ndbd]
HostName= localhost
NodeId=11

[ndbd]
HostName= localhost
NodeId=12

[mysqld]
NodeId=21
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[root@localhost mysql]#
[23 May 2018 11:09] Denis Jdanov
Did you try it with ndbmtd or regular ndbd?
I will upgrade my environment to 7.5.10 and retest.
[23 May 2018 11:35] MySQL Verification Team
Hi Denis,

With ndbmtd; your log shows that you are using ndbmtd.

all best
Bogdan
[24 May 2018 12:05] Denis Jdanov
Hi Bogdan,

I upgraded to version 7.5.10
and see exactly the same problem: the cluster crashes during index restoration.

Maybe it is connected to the number of threads.
As I see it, you ran ndbmtd with the default number of threads, which is 2,
while my config has MaxNoOfExecutionThreads=12.
Could you please run the simulation with my config.ini, if possible?

[ndbd default]
NoOfReplicas=2
LockPagesInMainMemory=1
ServerPort=2202
ODirect=1
DataMemory=64G
IndexMemory=16G
NoOfFragmentLogFiles=300
DataDir=/opt/ndbdata
MaxNoOfConcurrentOperations=1000000
SchedulerSpinTimer=400
SchedulerExecutionTimer=100
RealTimeScheduler=1
TimeBetweenGlobalCheckpoints=1000
TimeBetweenEpochs=200
RedoBuffer=32M
# CompressedLCP=1
CompressedBackup=1
# MaxNoOfLocalScans=64
MaxNoOfTables=10000
MaxNoOfOrderedIndexes=10000
MaxNoOfAttributes=50000000
MaxNoOfUniqueHashIndexes=10000
TransactionDeadlockDetectionTimeout=3000
MaxNoOfExecutionThreads=12
NoOfFragmentLogParts=6

regards,
Denis
[24 May 2018 14:48] MySQL Verification Team
Hi Denis,

> Maybe it is connected to the number of threads.

Can you test with a lower number?

> As I see it, you ran ndbmtd with the default number of threads, which is 2,
> while my config has MaxNoOfExecutionThreads=12.
> Could you please run the simulation with my config.ini, if possible?

I will try a higher number of execution threads myself, but I can't use your entire config because my test box is not that big. In the meantime, please try a lower number of threads yourself; maybe together we can figure out faster where the issue is. The config I sent you works, as you can see; please confirm that first :)

all best
Bogdan
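
One way to test with a lower thread count, as suggested above, is to generate a few variants of config.ini that differ only in MaxNoOfExecutionThreads and retry the restore with each. The sketch below assumes a config.ini that already contains a MaxNoOfExecutionThreads line; the function names, the output file names, and the trial values 2, 4 and 8 are hypothetical choices for illustration, not something taken from this report.

```shell
# Hypothetical helper (not part of NDB): print a copy of a config.ini with
# MaxNoOfExecutionThreads set to a new value, leaving all other lines untouched.
set_exec_threads() {
  # $1 = path to config.ini, $2 = new thread count
  sed -E "s/^[[:space:]]*MaxNoOfExecutionThreads[[:space:]]*=.*/MaxNoOfExecutionThreads=$2/" "$1"
}

# Write one candidate config per thread count to try.
# The values 2, 4 and 8 are arbitrary illustration points below the reported 12.
make_candidates() {
  # $1 = path to config.ini, $2 = output directory
  for n in 2 4 8; do
    set_exec_threads "$1" "$n" > "$2/config.threads-$n.ini"
  done
}
```

For example, `make_candidates /path/to/config.ini /tmp` would write three files to /tmp. A changed config.ini only takes effect after the management server re-reads it (for example, ndb_mgmd started with --reload) and the data nodes are restarted.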
[27 May 2018 9:27] Denis Jdanov
Hi Bogdan,

I tested the restore procedure again today, with the following config.ini:

[ndbd default]
NoOfReplicas= 2
ServerPort=2202
DataDir= /opt/ndbdata
DataMemory = 1G
IndexMemory = 356M
MaxNoOfConcurrentOperations=500000
MaxNoOfAttributes=50000
MaxNoOfOrderedIndexes=10000

[ndb_mgmd]
DataDir= /var/log/ndb_cluster
hostname=10.252.136.70
NodeId=10

[ndbd]
hostname=10.252.136.71
NodeId=11

[ndbd]
hostname=10.252.136.72
NodeId=12

[mysqld]
NodeId=21
[mysqld]
[mysqld]
[mysqld]
[mysqld]

and again unsuccessfully:

During index restore, specifically during foreign key restoration, all cluster data nodes crash:

Failed to create foreign key groups_operator_id_group_id_FK parent sdr_v3.groups.PK child sdr_v3.sdr_sim_cards.groups_operator_id_group_id_FK: 4009: Cluster Failure

regards,
Denis

P.S. what platform/OS do you use for simulation?
[28 May 2018 16:24] MySQL Verification Team
Hi,

So with your data (the same restore files I'm using) and the same config file I'm using, I'm able to load your restore without a problem and you are not, using the same NDB Cluster version?!

> P.S. what platform/OS do you use for simulation?

[root@localhost ~]# cat /etc/redhat-release
Fedora release 23 (Twenty Three)
[root@localhost ~]# uname -a
Linux localhost.localdomain 4.8.13-100.fc23.x86_64 #1 SMP Fri Dec 9 14:51:40 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]#