Bug #47782 Node restart allocates 1 page for empty table (fragment)
Submitted: 2 Oct 10:26
Modified: 6 Oct 14:31
Reporter: Johan Andersson
Status: Closed
Category: Server: Cluster
Severity: S3 (Non-critical)
Version: mysql-5.1-telco-6.2
OS: Any
Assigned to: Jonas Oreland
Target Version:
Tags: many tables, node restart
Triage: Triaged: D2 (Serious) / R6 (Needs Assessment) / E6 (Needs Assessment)

[2 Oct 10:26] Johan Andersson
Description:
Cluster with two data nodes (id=3 and id=4).
I created as many tables as possible (table definition under "How to repeat" below) until CREATE TABLE failed with error 306.

I could create 16249 tables before getting error 306.

Shut down node 3.
Start node 3.

Node 3 fails to start with: 

Time: Friday 2 October 2009 - 10:06:04
Status: Temporary error, restart node
Message: Assertion (Internal error, programming error or missing error message, please
report a bug)
Error: 2301
Error data: ArrayPool<T>::getPtr
Error object: ArrayPool.hpp line: 514 (block: DBLQH)
Program: /usr/local/mysql//mysql/bin//ndbmtd
Pid: 26453 thr: 2
Version: mysql-5.1.37 ndb-7.0.8
Trace: /data1/mysqlcluster//ndb_3_trace.log.3 /data1/mysqlcluster//ndb_3_trace.log.3_t1
/data1/mysqlcluster//ndb_3_trace.log.3_

(ndb_error_report included).

How to repeat:
* Start a two node cluster
* I used the following [NDBD DEFAULT] section:
[NDBD DEFAULT]
NoOfReplicas=2
Datadir=/data1/mysqlcluster/
FileSystemPathDD=/data1/mysqlcluster/
DataMemory=2048M
IndexMemory=1024M
LockPagesInMainMemory=0
MaxNoOfConcurrentOperations=100000
StringMemory=25
MaxNoOfTables=20000
MaxNoOfOrderedIndexes=10000
MaxNoOfUniqueHashIndexes=2500
MaxNoOfAttributes=2600960
DiskCheckpointSpeedInRestart=100M
FragmentLogFileSize=256M
InitFragmentLogFiles=FULL
NoOfFragmentLogFiles=12
RedoBuffer=32M
TimeBetweenLocalCheckpoints=20
TimeBetweenGlobalCheckpoints=1000
TimeBetweenEpochs=100
MemReportFrequency=30
BackupReportFrequency=10
LogLevelStartup=15
LogLevelShutdown=15
LogLevelCheckpoint=8
LogLevelNodeRestart=15
BackupMaxWriteSize=1M
BackupDataBufferSize=16M
BackupLogBufferSize=4M
BackupMemory=20M
TimeBetweenWatchdogCheckInitial=60000
MaxNoOfExecutionThreads=8
BatchSizePerLocalScan=512

* Create tables (I created 16249 tables) that look like this:

CREATE TABLE `tN` (
  `a0` int(11) NOT NULL DEFAULT '0',
  `a1` int(11) DEFAULT NULL,
... (in total 128 columns)
  `a127` int(11) DEFAULT NULL,
  PRIMARY KEY (`a0`) USING HASH
) ENGINE=ndbcluster DEFAULT CHARSET=latin1;

* Stop node 3
* Start node 3
--> Node 3 fails
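
The table-creation step above is tedious by hand, so here is a minimal sketch of a generator for the DDL. It follows the CREATE TABLE template shown in the steps (128 int columns a0..a127, hash primary key on a0). The function names, the output file name, and numbering the tables from 0 are assumptions made for illustration; the report does not say how the tables were actually generated.

```python
def create_table_sql(n, num_columns=128):
    """Build the CREATE TABLE statement for table t<n>.

    Mirrors the template in the report: a0 is NOT NULL with default '0',
    every other column is a nullable int, and a0 is a hash primary key.
    """
    lines = ["  `a0` int(11) NOT NULL DEFAULT '0',"]
    for i in range(1, num_columns):
        lines.append(f"  `a{i}` int(11) DEFAULT NULL,")
    lines.append("  PRIMARY KEY (`a0`) USING HASH")
    body = "\n".join(lines)
    return (
        f"CREATE TABLE `t{n}` (\n{body}\n"
        ") ENGINE=ndbcluster DEFAULT CHARSET=latin1;"
    )


def write_ddl(path, table_count):
    """Write table_count CREATE TABLE statements to a file (hypothetical helper)."""
    with open(path, "w") as out:
        for n in range(table_count):  # starting at t0 is an assumption
            out.write(create_table_sql(n) + "\n")
```

For example, write_ddl("create_tables.sql", 16249) would produce a script that can be fed to the mysql client. The number of tables that succeed before error 306 will presumably depend on the configuration, so 16249 is only the figure observed in this report.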

Suggested fix:
-
[2 Oct 10:29] Johan Andersson
Uploaded trace files here:

ftp.mysql.com/pub/mysql/upload/bug47782_ndb_error_report_20091002101821.tar.bz2
[6 Oct 13:56] Jonas Oreland
When a node is starting and a table is synchronized to it,
1 page is allocated on the starting node, even for an empty fragment.

---

This caused out-of-memory trouble in unusual situations (here, a very large number of tables).
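
To get a feel for the scale, here is a rough back-of-the-envelope sketch of how one page per empty fragment adds up. It assumes 32 KB data memory pages and that the pages come out of DataMemory; the number of fragments each table has on the starting node is not stated in this report, so it is left as a parameter with purely hypothetical values. This is arithmetic under those assumptions, not a diagnosis of the trace.

```python
PAGE_SIZE_BYTES = 32 * 1024          # assumed NDB data memory page size (32 KB)
DATA_MEMORY_BYTES = 2048 * 1024**2   # DataMemory=2048M from the config in this report
DATA_MEMORY_PAGES = DATA_MEMORY_BYTES // PAGE_SIZE_BYTES


def pages_for_empty_fragments(tables, fragments_per_table_on_node):
    """Pages consumed if every empty fragment gets exactly 1 page."""
    return tables * fragments_per_table_on_node


def pages_to_mib(pages):
    """Convert a page count to MiB using the assumed page size."""
    return pages * PAGE_SIZE_BYTES / 1024**2


if __name__ == "__main__":
    tables = 16249  # number of tables created in this report
    for frags in (1, 2, 4):  # hypothetical fragments per table on the starting node
        pages = pages_for_empty_fragments(tables, frags)
        print(f"{frags} fragment(s)/table: {pages} pages, "
              f"{pages_to_mib(pages):.1f} MiB of {DATA_MEMORY_PAGES} pages available")
```

With the hypothetical value of 4 fragments per table on the node, the empty tables alone would take 64996 of the 65536 pages under these assumptions, which would be consistent with running out of memory during node restart.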
[6 Oct 14:00] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85871

3013 Jonas Oreland	2009-10-06
      ndb - bug#47782 - don't copy any pages for empty fragments, if starting node also
don't have any pages
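
Read literally, the commit message describes a simple condition. The sketch below only illustrates that reading; the function and parameter names are hypothetical and it is not the actual NDB kernel code, which is in the linked patch.

```python
def should_copy_fragment_pages(source_page_count, starting_node_page_count):
    """Hypothetical predicate for the rule in the commit message.

    Skip the copy (and so the page allocation) only when the fragment is
    empty on the node being copied from AND the starting node also has
    no pages for it. In every other case the copy still happens.
    """
    if source_page_count == 0 and starting_node_page_count == 0:
        return False
    return True
```

Under this reading, a fragment that is empty on the live node but still has pages on the starting node would still be synchronized.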
[6 Oct 14:13] Bugs System
Pushed into 5.1.39-ndb-7.0.9 (revid:jonas@mysql.com-20091006120830-2hefo7tw6hbv61uc)
(version source revid:jonas@mysql.com-20091006120830-2hefo7tw6hbv61uc) (merge vers:
5.1.39-ndb-7.0.9) (pib:11)
[6 Oct 14:14] Bugs System
Pushed into 5.1.39-ndb-7.1.0 (revid:jonas@mysql.com-20091006121228-a6kwym6j622pj1gn)
(version source revid:jonas@mysql.com-20091006121228-a6kwym6j622pj1gn) (merge vers:
5.1.39-ndb-7.1.0) (pib:11)
[6 Oct 14:23] Jonas Oreland
pushed to 6.2.19, 6.3.28, 7.0.9 and 7.1
[6 Oct 14:31] Jon Stephens
Documented bugfix in the 6.2.19, 6.3.28, and 7.0.9 changelogs, as follows:

        When starting a node and synchronizing tables, memory pages were
        allocated even for empty fragments. In certain situations, this
        could lead to insufficient memory.

Closed.