Description:
While running a simple insert test using NDBAtomics, one of the data nodes failed with the following error:
Time: Tuesday 2 December 2008 - 08:07:21
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: lgman.cpp
Error object: LGMAN (Line: 1468) 0x00000006
Program: /data0/cr_autotest/libexec/ndbd
Pid: 19788
Trace: ./ndb_2_trace.log.1
Version: mysql-5.1.30 ndb-6.3.20-GA
***EOM***
2008-12-02 08:07:22 [MgmSrvr] ALERT -- Node 1: Node 2 Disconnected
2008-12-02 08:07:22 [MgmSrvr] ALERT -- Node 3: Node 2 Disconnected
2008-12-02 08:07:22 [MgmSrvr] INFO -- Node 3: Communication to Node 2 closed
2008-12-02 08:07:22 [MgmSrvr] ALERT -- Node 3: Network partitioning - arbitration required
2008-12-02 08:07:22 [MgmSrvr] INFO -- Node 3: President restarts arbitration thread [state=7]
2008-12-02 08:07:22 [MgmSrvr] ALERT -- Node 3: Arbitration won - positive reply from node 1
2008-12-02 08:07:22 [MgmSrvr] INFO -- Node 3: GCP Take over started
2008-12-02 08:07:22 [MgmSrvr] INFO -- Node 3: GCP Take over completed
2008-12-02 08:07:22 [MgmSrvr] INFO -- Node 3: kk: 3214/61 2 0
2008-12-02 08:07:22 [MgmSrvr] ALERT -- Node 3: Node 5 Disconnected
2008-12-02 08:07:22 [MgmSrvr] INFO -- Node 3: Communication to Node 5 closed
2008-12-02 08:07:22 [MgmSrvr] INFO -- Mgmt server state: nodeid 5 freed, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000000012.
2008-12-02 08:07:22 [MgmSrvr] ALERT -- Node 2: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
2008-12-02 08:07:22 [MgmSrvr] INFO -- Node 3: Started arbitrator node 1 [ticket=75200002db419c05]
Note: Each transaction was passing 900 insert operations.
How to repeat:
I have not reproduced it yet.
ACRT
In one terminal:
/space/cluster_rep_auto>sh -x scripts/boot.sh --clone=mysql-5.1-telco-6.3 --CONF=/space/cluster_rep_auto/cr-autotest.conf --start-and-exit 2-dn
In another terminal:
/space/cluster_rep_auto>sh -x drivers/ndbatomics-dd-tester.sh ./cr-autotest.conf
[atrt]
basedir=CHOOSE_dir
baseport=15000
clusters= .master
[ndb_mgmd]
[mysqld]
skip-grant-tables
skip-innodb
ndb_use_exact_count=0
loose-join_cache_level=6
[cluster_config]
MaxNoOfSavedMessages = 1000
[cluster_config.master]
NoOfReplicas = 2
DataMemory = 4000M
IndexMemory = 400M
RedoBuffer=200M
NoOfFragmentLogFiles=10
FragmentLogFileSize=256M
MaxNoOfConcurrentOperations = 250000
MaxNoOfLocalOperations = 275000
MaxNoOfConcurrentIndexOperations = 20000
MaxNoOfAttributes=2048
MaxNoOfOrderedIndexes=512
MaxNoOfUniqueHashIndexes=512
DiskPageBufferMemory=1048MB
LockPagesInMainMemory=1
DiskCheckpointSpeed=16M
ndb_mgmd = CHOOSE_host2
ndbd = CHOOSE_host2,CHOOSE_host3
mysqld = CHOOSE_host1
ndbapi= CHOOSE_host1,CHOOSE_host1
[cluster_config.ndbd.1.master]
FileSystemPath=/data1/
[cluster_config.ndbd.2.master]
FileSystemPath=/data1/
CREATE LOGFILE GROUP $our_lfg_name
ADD UNDOFILE 'undofile.dat'
INITIAL_SIZE 2000M
UNDO_BUFFER_SIZE = 4M
ENGINE=NDB;
ALTER LOGFILE GROUP $our_lfg_name
ADD UNDOFILE '$file'
INITIAL_SIZE 2500M
ENGINE=NDB;
CREATE TABLESPACE $our_ts_name
ADD DATAFILE 'datafile.dat'
USE LOGFILE GROUP $our_lfg_name
INITIAL_SIZE 50M
ENGINE=NDB;
followed by 39 of:
ALTER TABLESPACE $our_ts_name
ADD DATAFILE '$file'
INITIAL_SIZE 50M
ENGINE=NDB;
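
For anyone trying to recreate the tablespace layout, the 39 repeated ALTER TABLESPACE statements could be generated with a small shell loop along these lines. This is only a sketch: the default tablespace name `ts1`, the `datafile_N.dat` naming scheme, and the function name `gen_alter_statements` are assumptions for illustration and are not taken from ndbatomics-dd-tester.sh.

```shell
#!/bin/sh
# Sketch only. The tablespace default and the data file naming scheme
# below are hypothetical; the real test driver may differ.
our_ts_name="${our_ts_name:-ts1}"

gen_alter_statements() {
    i=1
    # Emit one ALTER TABLESPACE statement per additional data file (39 total).
    while [ "$i" -le 39 ]; do
        file="datafile_${i}.dat"   # hypothetical file name
        printf "ALTER TABLESPACE %s\nADD DATAFILE '%s'\nINITIAL_SIZE 50M\nENGINE=NDB;\n" \
            "$our_ts_name" "$file"
        i=$((i + 1))
    done
}

gen_alter_statements
```

The output is plain SQL on stdout, so it can be reviewed first and then fed to a mysql client connected to the cluster's SQL node.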