Description:
mysql-cluster-gpl-7.2.7-solaris10-x86_64
Hello,
Error 2341 is reported as fixed in 7.2.7, but it is not, at least not for me.
I hit it several times this week on 7.2.6.
After reading all the available information, I decided to upgrade to 7.2.7 (because the bug was reported as fixed there).
On Thursday, I even tried a very, very limited setup:
1 NodeGroup with one Datanode (NoOfReplicas = 1)
1 MGM Node
1 MySQL node connected (about 200 configured)
Very plain schema (a RADIUS server):
CREATE TABLE `sessions` (
`RadAcctId` bigint(21) unsigned NOT NULL AUTO_INCREMENT,
`AcctUniqueId` char(32) NOT NULL,
`AcctStatusType` tinyint(3) unsigned NOT NULL DEFAULT '1',
`AcctSessionId` char(32) NOT NULL,
`VolumeBucket` char(15) NOT NULL DEFAULT 'common',
`UserName` char(60) NOT NULL DEFAULT '',
`ContractInfo` char(250) NOT NULL,
`CallingStationId` bigint(20) unsigned NOT NULL DEFAULT '0',
`CalledStationId` char(30) NOT NULL DEFAULT '',
`DeviceTAC` char(8) NOT NULL DEFAULT '',
`mccmnc` char(8) NOT NULL,
`PacketSrcIpAddress` char(15) NOT NULL DEFAULT '',
`NASIPAddress` char(15) NOT NULL DEFAULT '',
`FramedIPAddress` char(15) NOT NULL DEFAULT '',
`AcctInputOctets` bigint(20) unsigned NOT NULL DEFAULT '0',
`AcctOutputOctets` bigint(20) unsigned NOT NULL DEFAULT '0',
`n_id` bigint(21) unsigned DEFAULT NULL,
`n_Volume` bigint(20) unsigned NOT NULL DEFAULT '0',
`n_Time` mediumint(8) unsigned NOT NULL,
`voldbchecked` smallint(6) unsigned NOT NULL DEFAULT '1',
`claimed_by` varchar(15) NOT NULL,
`claimed_at` int(11) unsigned NOT NULL DEFAULT '0',
`AcctStartTime` int(11) unsigned NOT NULL DEFAULT '0',
`AcctUpdateTime` int(11) unsigned NOT NULL DEFAULT '0',
`AcctStopTime` int(11) unsigned NOT NULL DEFAULT '0',
`AcctTerminateCause` char(32) NOT NULL DEFAULT '',
PRIMARY KEY (`CallingStationId`,`AcctUniqueId`),
UNIQUE KEY `RadAcctId` (`RadAcctId`),
KEY `CallingStationId` (`CallingStationId`),
KEY `voldbchecked` (`voldbchecked`),
KEY `VolumeBucket` (`VolumeBucket`),
KEY `AcctStatusType` (`AcctStatusType`),
KEY `AcctStartTime` (`AcctStartTime`),
KEY `claimed_by` (`claimed_by`)
) ENGINE=ndbcluster AUTO_INCREMENT=0 DEFAULT CHARSET=latin1;
The load is one (1!) insert/s plus 2-3 updates/s, but I share Arlo Gilbert's observation about the high-load situation: we loaded the fully configured cluster (10 data nodes, 10 API nodes) with 1200 inserts/s and about 9000 updates/s, reaching 239 million records after a few days and using up all of the configured memory, without problems.
After a few hours at 1 insert/s:
ndb_mgm> Node 3: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
If there is no load at all, the cluster has no problem.
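Because the shutdown only appears after hours of running, it can help to scan the cluster log for the message quoted above. The following is a minimal sketch, not part of our setup: the pattern is built only from the ndb_mgm line shown here, and the layout of any other log line (timestamps, prefixes) is an assumption.

```python
import re

# Pattern derived from the ndb_mgm message quoted in this report.
# Anything before "Node <id>:" (prompt, timestamp) is ignored via search().
SHUTDOWN_RE = re.compile(
    r"Node (?P<node>\d+): Forced node shutdown completed\."
    r" Caused by error (?P<code>\d+): '(?P<text>.*)'"
)


def find_forced_shutdowns(lines):
    """Yield (node_id, error_code, message) for each forced-shutdown line."""
    for line in lines:
        match = SHUTDOWN_RE.search(line)
        if match:
            yield int(match["node"]), int(match["code"]), match["text"]


# Hypothetical usage against the log named in LogDestination:
#
#   with open("ndb_1_cluster.log", errors="replace") as log:
#       for node, code, text in find_forced_shutdowns(log):
#           print(f"node {node} shut down with error {code}: {text}")
```

Filtering the results for error code 2341 would then separate this failure from ordinary restarts.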
As this will be the final configuration, we tried both with and without logging (SET @@ndb_table_no_logging = 1;)
We are running mysql-cluster-gpl-7.1.14-solaris10-x86_64.tar.gz on our current production system, and wanted to upgrade to 7.2 for the comfort of having load-balanced transaction coordination (it seems to be single-threaded in 7.1).
7.1.14 is stable (from this point of view), even though the Cluster team stated that the 2341 bug has been there since 7.0 ...
Will there be a solution?
Any ideas?
Worth mentioning:
- We connect to mysqld locally, explicitly using the socket file
- We have no replication settings configured in mysqld, not even a replication ID
Our Files:
###############################
My config.ini
[TCP DEFAULT]
SendBufferMemory=8M
ReceiveBufferMemory=8M
[NDB_MGMD DEFAULT]
PortNumber=11861
Datadir=/DB/mysql/sessions
[NDB_MGMD]
NodeId=1
Hostname=dbmgr1
LogDestination=FILE:filename=ndb_1_cluster.log,maxsize=10000000,maxfiles=6
ArbitrationRank=1
#[NDB_MGMD]
#NodeId=2
#Hostname=dbmgr2
#LogDestination=FILE:filename=ndb_2_cluster.log,maxsize=10000000,maxfiles=6
#ArbitrationRank=1
[NDBD DEFAULT]
NoOfReplicas=1
Datadir=/DB/mysql/sessions
FileSystemPathDD=/DB/mysql/sessions
BackupDataDir=/DB/mysql/backup/sessions
#FileSystemPathUndoFiles=/DB/mysql/sessions
#FileSystemPathDataFiles=/DB/mysql/sessions
DataMemory=40000M
IndexMemory=6000M
LockPagesInMainMemory=1
MaxNoOfConcurrentOperations=100000
MaxNoOfConcurrentTransactions=16384
StringMemory=25
MaxNoOfTables=4096
MaxNoOfOrderedIndexes=2048
MaxNoOfUniqueHashIndexes=512
MaxNoOfAttributes=24576
MaxNoOfTriggers=14336
### Params for REDO LOG
FragmentLogFileSize=256M
InitFragmentLogFiles=SPARSE
NoOfFragmentLogFiles=170
RedoBuffer=64M
TimeBetweenGlobalCheckpoints=1000
TimeBetweenEpochs=100
TimeBetweenEpochsTimeout=32000
### Params for LCP
DiskCheckpointSpeedInRestart=100M
DiskCheckpointSpeed=10M
TimeBetweenLocalCheckpoints=20
### Heartbeating
HeartbeatIntervalDbDb=1500
HeartbeatIntervalDbApi=1500
### Params for setting logging
MemReportFrequency=30
BackupReportFrequency=10
LogLevelStartup=15
LogLevelShutdown=15
LogLevelCheckpoint=8
LogLevelNodeRestart=15
### Params for BACKUP
BackupMaxWriteSize=1M
BackupDataBufferSize=16M
BackupLogBufferSize=4M
BackupMemory=20M
### Params for ODIRECT
#Reports indicate that ODirect=1 can cause I/O errors (OS error code 5) on some systems. You must test.
ODirect=1
### Watchdog
TimeBetweenWatchdogCheckInitial=60000
### TransactionInactiveTimeout - should be enabled in Production
TransactionInactiveTimeout=60000
### REALTIME EXTENSIONS
#RealTimeScheduler=1
### REALTIME EXTENSIONS FOR 6.3 ONLY
#SchedulerExecutionTimer=80
#SchedulerSpinTimer=40
### DISK DATA
SharedGlobalMemory=20M
DiskPageBufferMemory=64M
### Multithreading
MaxNoOfExecutionThreads=8
### Increasing the LongMessageBuffer b/c of a bug (20090903)
LongMessageBuffer=32M
BatchSizePerLocalScan=512
[NDBD]
NodeId=3
Hostname=dbsession1
#[NDBD]
#NodeId=4
#Hostname=dbsession2
#[NDBD]
#NodeId=5
#Hostname=dbsession3
#[NDBD]
#NodeId=6
#Hostname=dbsession4
#[NDBD]
#NodeId=7
#Hostname=dbsession5
#[NDBD]
#NodeId=8
#Hostname=dbsession6
#[NDBD]
#NodeId=9
#Hostname=dbsession7
#[NDBD]
#NodeId=10
#Hostname=dbsession8
#[NDBD]
#NodeId=11
#Hostname=dbsession9
#[NDBD]
#NodeId=12
#Hostname=dbsession10
[MYSQLD DEFAULT]
BatchSize=512
#BatchByteSize=2048K
#MaxScanBatchSize=2048K
[MYSQLD]
NodeId=21
Hostname=dbapife1
[MYSQLD]
NodeId=22
Hostname=dbapife1
.... until
NodeId=220
Hostname=dbmgr2
###############################
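For anyone reviewing the sizing, the disk and memory footprint implied by the [NDBD DEFAULT] values above can be worked out with a short sketch. It assumes the usual NDB rule that redo log files are allocated in sets of four, so the redo log per data node is NoOfFragmentLogFiles x 4 x FragmentLogFileSize; the helper name `to_mib` is made up for this example.

```python
def to_mib(value):
    """Convert a config.ini size such as '256M' or '1G' to MiB."""
    units = {"K": 1 / 1024, "M": 1, "G": 1024}
    suffix = value[-1].upper()
    if suffix in units:
        return float(value[:-1]) * units[suffix]
    return float(value) / (1024 * 1024)  # no suffix: plain bytes


# Values copied from the [NDBD DEFAULT] section of the config.ini above.
fragment_log_file_size = to_mib("256M")
no_of_fragment_log_files = 170
data_memory = to_mib("40000M")
index_memory = to_mib("6000M")

# Redo log files come in sets of four per data node.
redo_log_mib = no_of_fragment_log_files * 4 * fragment_log_file_size
memory_mib = data_memory + index_memory

print(f"redo log per data node:   {redo_log_mib / 1024:.0f} GiB")
print(f"DataMemory + IndexMemory: {memory_mib / 1024:.1f} GiB")
```

With these values the redo log comes to 170 GiB per data node and the two memory pools to about 44.9 GiB. Whether either figure has anything to do with error 2341 is not something this report can confirm.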
My my.cnf files
[MYSQLD]
user=mysql
basedir=/usr/local/mysqlbin
datadir=/DB/mysql/sessions
pid-file=mysqld.pid
socket=/var/lib/mysql/mysqlsessions.sock
port=33061
ndb-cluster-connection-pool=4
ndbcluster=1
ndb-connectstring="dbmgr1:11861,dbmgr2:11861"
ndb-force-send=1
ndb-use-exact-count=0
ndb-extra-logging=1
ndb-batch-size=31536000
ndb-autoincrement-prefetch-sz=1024
engine-condition-pushdown=1
default-storage-engine=myisam
#LOGS
log-error=/var/log/acctopus/mysql/sessions.error.log
#log
slow-query-log=1
slow-query-log-file=/var/log/acctopus/mysql/sessions.slow.log
key_buffer_size = 256M
max_allowed_packet = 16M
sort_buffer_size = 512K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
#thread_cache_size=1024
myisam_sort_buffer_size = 8M
memlock=0
sysdate_is_now=1
max-connections=300
thread-cache-size=128
thread_concurrency = 16
query_cache_type = 0
query_cache_size = 0
table_open_cache=1024
lower-case-table-names=0
group_concat_max_len=10M
[MYSQL]
socket=/var/lib/mysql/mysqlsessions.sock
[client]
socket=/var/lib/mysql/mysqlsessions.sock
###############################
Will upload ndb_reporter output...
Thanks, Stefan
How to repeat:
Nothing special. Start the cluster, apply the light load described above, and wait a few hours ...
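To make the light load easier to reproduce, here is a rough sketch of a statement generator for the `sessions` table. It is not the real RADIUS traffic: the column names come from the schema in this report, but every value (station numbers, 'test', '00000', 'repro') and the exact statement shapes are invented for illustration. It only produces SQL text and does not connect to anything.

```python
import itertools
import random
import time


def session_statements(seed=0):
    """Yield an endless stream of 1 INSERT followed by 2-3 UPDATEs per round.

    All values are made up; only the column names are taken from the schema.
    """
    rng = random.Random(seed)
    for n in itertools.count(1):
        station = 491700000000 + n   # hypothetical CallingStationId
        unique_id = f"{n:032x}"      # 32 hex characters, fits char(32)
        now = int(time.time())
        # Columns without a default value are supplied explicitly.
        yield (
            "INSERT INTO sessions (AcctUniqueId, AcctSessionId, ContractInfo,"
            " CallingStationId, mccmnc, n_Time, claimed_by, AcctStartTime)"
            f" VALUES ('{unique_id}', '{unique_id}', 'test', {station},"
            f" '00000', 0, 'repro', {now})"
        )
        for _ in range(rng.randint(2, 3)):
            octets = rng.randint(0, 10**6)
            # The WHERE clause addresses the row by its full primary key.
            yield (
                "UPDATE sessions SET"
                f" AcctInputOctets = AcctInputOctets + {octets},"
                f" AcctUpdateTime = {now}"
                f" WHERE CallingStationId = {station}"
                f" AND AcctUniqueId = '{unique_id}'"
            )


# Hypothetical usage: print about three statements per second and pipe the
# output into the mysql client connected through the socket file.
#
#   for stmt in session_statements():
#       print(stmt + ";", flush=True)
#       time.sleep(0.3)
```

Pacing the output at roughly three statements per second approximates the 1 insert/s plus 2-3 updates/s described in this report.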