Bug #45971 | Optimize on NDB table causes full cluster crash (All API/NDB Storage nodes). | ||
---|---|---|---|
Submitted: | 6 Jul 2009 13:49 | Modified: | 5 Aug 2009 7:50 |
Reporter: | John Sabo | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S2 (Serious) |
Version: | mysql-5.1-telco-6.3 | OS: | Linux (Gentoo 2008.0 AMD64) |
Assigned to: | Jonas Oreland | CPU Architecture: | Any |
Tags: | crash, ndbd, ndbmtd, Optimize |
[6 Jul 2009 13:49]
John Sabo
[6 Jul 2009 13:51]
John Sabo
Sample of error from one of the data nodes.

Time: Monday 6 July 2009 - 04:51:01
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: dbtup/DbtupVarAlloc.cpp
Error object: DBTUP (Line: 482) 0x0000000a
Program: /usr/sbin/ndbmtd
Pid: 15280
Trace: /db/datadir/ndb_4_trace.log.1 /db/datadir/ndb_4_trace.log.1_t1 /db/datadir/ndb_4_trace.log.1_t2 /db/datadir/ndb_4_trace.log.1_t3 /db/datadir/ndb_4_trace.lo
[6 Jul 2009 13:53]
John Sabo
ndb_mgmd config file
Attachment: my-mgmd.cnf (application/octet-stream, text), 3.79 KiB.
[6 Jul 2009 13:54]
John Sabo
Data node 2(of 4) error and trace logs.
Attachment: node_2_trace.tar.bz2 (application/octet-stream, text), 189.66 KiB.
[6 Jul 2009 13:54]
John Sabo
Data node 1(of 4) error and trace logs.
Attachment: node_3_trace.tar.bz2 (application/octet-stream, text), 183.15 KiB.
[6 Jul 2009 13:54]
John Sabo
Data node 3(of 4) error and trace logs.
Attachment: node_4_trace.tar.bz2 (application/octet-stream, text), 218.34 KiB.
[6 Jul 2009 13:55]
John Sabo
Data node 4(of 4) error and trace logs.
Attachment: node_5_trace.tar.bz2 (application/octet-stream, text), 186.87 KiB.
[20 Jul 2009 9:32]
Hindisvik Reykjavik
Hi, I think I've got the same problem. NDBCluster 7.0.5. Only one table:

CREATE TABLE `foo` (
  `idAutoOT` int(10) unsigned NOT NULL auto_increment,
  `numInt` varchar(20) collate latin1_general_ci NOT NULL,
  `codeBase` varchar(5) collate latin1_general_ci NOT NULL,
  `dateEnvoi` datetime default NULL,
  `blocXML` text collate latin1_general_ci,
  `etat` int(2) unsigned default '0',
  `pjAttendue` int(1) unsigned default '0',
  `dateT` datetime default NULL,
  `agence` varchar(15) collate latin1_general_ci default NULL,
  `codeETL` varchar(20) collate latin1_general_ci default '',
  `codeUI` varchar(3) collate latin1_general_ci default '',
  `otGroupe` int(1) default NULL,
  `pix` int(1) default '0',
  `dateReception` datetime default NULL,
  PRIMARY KEY (`idAutoOT`),
  KEY `cuOT` (`numInt`,`codeBase`),
  KEY `otcle1` (`dateT`),
  KEY `otcle2` (`dateT`,`etat`),
  KEY `o3` (`codeETL`,`dateEnvoi`),
  KEY `o4` (`codeUI`,`codeETL`)
) ENGINE=MyISAM AUTO_INCREMENT=1179679 DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci ;

There are 100,000 records in this table. When I truncate it and run "OPTIMIZE TABLE foo", I have no problem. With the records in place, I get:

2009-07-20 11:16:02 [MgmSrvr] ALERT -- Node 3: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

And all the nodes of the cluster crash.
Here is my config.ini:

###############################
# MYSQL CLUSTER CONFIGURATION #
###############################

[TCP DEFAULT]
SendBufferMemory=2M
ReceiveBufferMemory=2M

[NDBD DEFAULT]
NoOfReplicas=2
Datadir=/servers/mysql/cluster
DataMemory=2560MB
IndexMemory=512MB
MaxNoOfExecutionThreads=4 # MySQL Cluster 7 -> multithread
NoOfFragmentLogFiles=32
FragmentLogFileSize=128M
Diskcheckpointspeed=10M
Diskcheckpointspeedinrestart=100M
TimeBetweenLocalCheckpoints=20
LockPagesInMainMemory=0
RedoBuffer=32M
LogLevelStartup=15
LogLevelShutdown=15
LogLevelCheckpoint=8
LogLevelNodeRestart=15

[NDB_MGMD]
HostName=xxxxxxxxxxx
Id=1
DataDir=/var/lib/mysql-cluster
ArbitrationRank=1
LogDestination=FILE:filename=/var/log/mgmd.log,maxsize=25000000,maxfiles=4

##############
# DATA NODES #
##############

[NDBD]
HostName=xxxxxxxxxxx
MaxNoOfAttributes=100000
MaxNoOfConcurrentOperations=1000000
MaxNoOfLocalOperations=1000000
MaxNoOfConcurrentTransactions=20480
MaxNoOfConcurrentScans=400
MaxNoOfTables=5000
MaxNoOfTriggers=5000
MaxNoOfOrderedIndexes=5000
MaxNoOfUniqueHashIndexes=1000
Id=2

[NDBD]
HostName=xxxxxxxxxxx
MaxNoOfAttributes=100000
MaxNoOfConcurrentOperations=250000
MaxNoOfConcurrentTransactions=20480
MaxNoOfConcurrentScans=400
MaxNoOfTables=5000
MaxNoOfTriggers=5000
MaxNoOfUniqueHashIndexes=1000
MaxNoOfOrderedIndexes=5000
Id=3

#############
# API NODES #
#############

[MYSQLD]
HostName=xxxxxxxxxxx
Id=5

[MYSQLD]
HostName=xxxxxxxxxxx
Id=6

[MYSQLD]
HostName=xxxxxxxxxxx
Id=4

[MYSQLD]
# Test node
Id=10
[20 Jul 2009 12:23]
Hindisvik Reykjavik
With this trace:

2009-07-20 09:58:36 [ndbd] INFO -- dbtup/DbtupVarAlloc.cpp
2009-07-20 09:58:36 [ndbd] INFO -- DBTUP (Line: 482) 0x0000000e
2009-07-20 09:58:36 [ndbd] INFO -- Error handler shutting down system
2009-07-20 09:58:36 [ndbd] INFO -- Error handler shutdown completed - exiting
2009-07-20 09:58:36 [ndbd] ALERT -- Node 2: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

I'm on CentOS 5.2. Thank you.
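For anyone triaging similar reports, the forced-shutdown alerts quoted in this thread can be pulled out of a log with a short script. The following is only a minimal Python sketch, assuming the log lines look like the ones pasted above; the function name `find_forced_shutdowns` and the regular expression are illustrative and are not part of any MySQL Cluster tool.

```python
import re

# Pattern derived only from the ALERT lines quoted in this report.
# Other log layouts are not covered (assumption).
SHUTDOWN_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"\[(?P<source>[^\]]+)\]\s+ALERT\s+--\s+"
    r"Node (?P<node>\d+): Forced node shutdown completed\.\s+"
    r"Caused by error (?P<error>\d+)"
)


def find_forced_shutdowns(lines):
    """Return (timestamp, source, node_id, error_code) for each matching line."""
    events = []
    for line in lines:
        m = SHUTDOWN_RE.match(line.strip())
        if m:
            events.append(
                (m["ts"], m["source"], int(m["node"]), int(m["error"]))
            )
    return events


# Message text copied from the comments in this thread.
MESSAGE = (
    "'Internal program error (failed ndbrequire)(Internal error, "
    "programming error or missing error message, please report a bug). "
    "Temporary error, restart node'."
)

sample = [
    "2009-07-20 09:58:36 [ndbd] INFO -- Error handler shutdown completed - exiting",
    "2009-07-20 09:58:36 [ndbd] ALERT -- Node 2: Forced node shutdown "
    "completed. Caused by error 2341: " + MESSAGE,
    "2009-07-20 11:16:02 [MgmSrvr] ALERT -- Node 3: Forced node shutdown "
    "completed. Caused by error 2341: " + MESSAGE,
]

for event in find_forced_shutdowns(sample):
    print(event)
```

Grouping the resulting tuples by error code makes it easy to see whether every data node went down with the same error, as happened here with 2341.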
[4 Aug 2009 10:06]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/79981

3003 Jonas Oreland 2009-08-04
ndb -
  bug#45971 - crash during optimize table,
    make sure free list are searched correctly
    reuse get_alloc_page (with size + 1) since index might grow
  bug#43683 - optimize table does not return memory,
    try to move to page with *least* amount of free space
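The idea in the commit message (look for a page that can hold the row plus one extra unit, and prefer the page with the least free space) can be illustrated with a short sketch. This is not the NDB code: the `Page` class, the `pick_target_page` function, and the use of a plain list instead of real free lists are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Page:
    """Hypothetical stand-in for a variable-size-row page."""
    page_id: int
    free_words: int


def pick_target_page(
    pages: Iterable[Page], size: int, exclude_id: Optional[int] = None
) -> Optional[Page]:
    """Pick the page with the least free space that still fits size + 1.

    The "+ 1" mirrors the headroom mentioned in the commit message
    ("since index might grow"); its exact meaning inside NDB is not
    modelled here.
    """
    needed = size + 1
    candidates = [
        p for p in pages
        if p.page_id != exclude_id and p.free_words >= needed
    ]
    if not candidates:
        return None  # caller would have to allocate a fresh page
    # Tightest fit first; page_id breaks ties deterministically.
    return min(candidates, key=lambda p: (p.free_words, p.page_id))


pages = [Page(1, 10), Page(2, 4), Page(3, 5), Page(4, 100)]
print(pick_target_page(pages, 4))                # needs 5 words
print(pick_target_page(pages, 4, exclude_id=3))  # skip the row's current page
print(pick_target_page(pages, 200))              # nothing fits
```

Preferring the fullest page that still fits presumably lets the emptier pages drain during optimization so their memory can be returned, which is the behavior bug#43683 asks for.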
[4 Aug 2009 12:41]
Jon Stephens
Versions? 6.2+?
[4 Aug 2009 13:12]
Jonas Oreland
Not 6.2; optimize is not implemented in 6.2. Fixed in 6.3.26 and 7.0.7 (i.e. the next version). Weird that triggers didn't do this...
[5 Aug 2009 7:50]
Jon Stephens
Documented bugfix in the NDB-6.3.26 and 7.0.7 changelogs as follows: OPTIMIZE TABLE on an NDB table could in some cases cause SQL and data nodes to crash. This issue was observed with both ndbd and ndbmtd.