Bug #50196 ndb_mgm REPORT MEMORY shows 2 Index memory status lines when using ndbmtd
Submitted: 8 Jan 2010 21:31 Modified: 5 Mar 2010 11:57
Reporter: Paxos Paxos Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:mysql-5.1-telco-7.0 OS:Linux (2.6.29-vs2.3.0.36.14-gentoo)
Assigned to: Magnus Blåudd CPU Architecture:Any
Tags: 7.0.9, NDB_MGM, ndbmtd, report memory

[8 Jan 2010 21:31] Paxos Paxos
Description:
I noticed this change of behavior when our nagios scripts that monitor cluster memory usage started to report erroneous stats.

Our scripts were developed against 7.0.6, I believe.  At that time, only one Index usage line was reported.

ndb_mgm> 1 report memory
Node 1: Index usage is 49%(15709 8K pages of total 32032)
Node 1: Index usage is 0%(174 8K pages of total 32032)
Node 1: Data usage is 36%(59068 32K pages of total 163840)

ndb_mgm> all report memory
Node 1: Index usage is 49%(15709 8K pages of total 32032)
Node 1: Index usage is 0%(174 8K pages of total 32032)
Node 1: Data usage is 36%(59068 32K pages of total 163840)
Node 2: Index usage is 49%(15709 8K pages of total 32032)
Node 2: Index usage is 0%(174 8K pages of total 32032)
Node 2: Data usage is 36%(59068 32K pages of total 163840)
Node 3: Index usage is 48%(15388 8K pages of total 32032)
Node 3: Index usage is 0%(176 8K pages of total 32032)
Node 3: Data usage is 35%(57674 32K pages of total 163840)
Node 4: Index usage is 48%(15388 8K pages of total 32032)
Node 4: Index usage is 0%(176 8K pages of total 32032)
Node 4: Data usage is 35%(57674 32K pages of total 163840)

How to repeat:
From our cluster, do a REPORT MEMORY through ndb_mgm for any node.

We are running 4 data nodes in 2 node groups using ndbmtd with MaxNoOfExecutionThreads=4

Suggested fix:
Only report the correct Index usage statistic.
[8 Jan 2010 21:44] Paxos Paxos
Our config.ini:

[TCP DEFAULT]
#SendBufferMemory=2M
#ReceiveBufferMemory=2M
#OverloadLimit=128000

[NDB_MGMD DEFAULT]
DataDir=/var/lib/mysql-cluster

[NDB_MGMD]
Id=201
HostName=172.27.149.10
ArbitrationRank=1
# db-mgmt-01 vserver on host-db-01.phx1.good.net 

[NDB_MGMD]
Id=202
HostName=172.27.149.30
ArbitrationRank=2
# db-mgmt-02 vserver on host-db-02.phx1.good.net

[NDBD DEFAULT]
NoOfReplicas=2

MaxNoOfExecutionThreads=4

TotalSendBufferMemory=50M

DataDir=/var/lib/mysql-cluster
FileSystemPath=/store1
FileSystemPathDD=/store0

MaxNoOfConcurrentOperations = 500000
MaxNoOfConcurrentTransactions = 64000

LockPagesInMainMemory=1
ODirect=1

DataMemory=5GB
IndexMemory=500MB

FragmentLogFileSize=256M

NoOfFragmentLogFiles=12

RedoBuffer=32M

SchedulerSpinTimer=400
SchedulerExecutionTimer=100
RealTimeScheduler=1

HeartbeatIntervalDbDb=3000

CompressedBackup=1

[NDBD]
HostName=172.27.149.31
Id=1
NodeGroup=0
# db-data-01 vserver on host-db-01.phx1.good.net

[NDBD]
HostName=172.27.149.32
Id=2
NodeGroup=0
# db-data-02 vserver on host-db-02.phx1.good.net

[NDBD]
HostName=172.27.149.33
Id=3
NodeGroup=1
# db-data-03 vserver on host-db-03.phx1.good.net

[NDBD]
HostName=172.27.149.34
Id=4
NodeGroup=1
# db-data-03 vserver on host-db-04.phx1.good.net

[MYSQLD]
Id=101
HostName=172.27.149.51
# db-sql-01 vserver on host-db-01.phx1.good.net

[MYSQLD]
HostName=172.27.149.52
Id=102
# db-sql-02 vserver on host-db-02.phx1.good.net

[MYSQLD]
HostName=172.27.149.53
Id=103
# db-sql-03 vserver on host-db-03.phx1.good.net

[MYSQLD]
HostName=172.27.149.54
Id=104
# db-sql-04 vserver on host-db-04.phx1.good.net

[MYSQLD]
# slot reserved for api connects

[MYSQLD]
# slot reserved for api connects

# 4*3 = 12 extra mysqld entries needed for connection pooling

[MYSQLD]
[MYSQLD]
[MYSQLD]

[MYSQLD]
[MYSQLD]
[MYSQLD]

[MYSQLD]
[MYSQLD]
[MYSQLD]

[MYSQLD]
[MYSQLD]
[MYSQLD]
[8 Jan 2010 22:18] Paxos Paxos
DOES NOT happen when using ndbd, only ndbmtd.
[11 Jan 2010 22:10] Gustaf Thorslund
Paxos,

Thanks for your bug report. I've verified it using the current source from the development tree. Used a stripped down version of the provided config.ini running on localhost.

hillbilly% bzr version-info
revision-id: magnus.blaudd@sun.com-20100107192300-72w6dhp4m20mlva0
date: 2010-01-07 20:23:00 +0100
build-date: 2010-01-11 23:06:23 +0100
revno: 3327
branch-nick: mysql-cluster-7.0

Node 1: Index usage is 1%(12 8K pages of total 800)
Node 1: Index usage is 1%(13 8K pages of total 800)
Node 1: Data usage is 1%(25 32K pages of total 2048)
Node 2: Index usage is 1%(12 8K pages of total 800)
Node 2: Index usage is 1%(13 8K pages of total 800)
Node 2: Data usage is 1%(25 32K pages of total 2048)
Node 3: Index usage is 1%(13 8K pages of total 800)
Node 3: Index usage is 1%(12 8K pages of total 800)
Node 3: Data usage is 1%(25 32K pages of total 2048)
Node 4: Index usage is 1%(13 8K pages of total 800)
Node 4: Index usage is 1%(12 8K pages of total 800)
Node 4: Data usage is 1%(25 32K pages of total 2048)

/Gustaf
[12 Jan 2010 14:18] Martin Skold
This is expected behaviour in the multi-threaded implementation. Maybe
printout should be more clear what thread each line belongs to.
[12 Jan 2010 16:53] Jon Stephens
I have updated the documentation with a note about the multiple "Index usage" entries: http://lists.mysql.com/commits/96646

However, I'm also returning this bug to Verified status.

Reason:

Single-threaded and multi-threaded data nodes report memory usage in different order, as observed with recent 7.0 build from tree.

This is when using 2 ndbd's:

ndb_mgm> all report memory

ndb_mgm> Node 1: Data usage is 0%(4 32K pages of total 3200)
Node 1: Index usage is 0%(8 8K pages of total 12832)
Node 2: Data usage is 0%(4 32K pages of total 3200)
Node 2: Index usage is 0%(8 8K pages of total 12832)

ndb_mgm>

-----

This is when using 2 ndbmtd's (and MaxNoOfExecutionThreads = 8):

ndb_mgm> all report memory

Node 1: Index usage is 0%(6 8K pages of total 3232)
Node 1: Index usage is 0%(6 8K pages of total 3232)
Node 1: Index usage is 0%(6 8K pages of total 3232)
Node 1: Index usage is 0%(6 8K pages of total 3232)
ndb_mgm> Node 1: Data usage is 0%(16 32K pages of total 3200)
Node 2: Index usage is 0%(6 8K pages of total 3232)
Node 2: Index usage is 0%(6 8K pages of total 3232)
Node 2: Index usage is 0%(6 8K pages of total 3232)
Node 2: Index usage is 0%(6 8K pages of total 3232)
Node 2: Data usage is 0%(16 32K pages of total 3200)

(You then have to hit [ENTER] to return to the 'ndb_mgm>' prompt.)

-----

There are 3 problems/issues here that warrant attention:

1. Spurious insertion of 'ndb_mgm>' prompt in the middle of the report. This happens whether ndbd or ndbmtd is used. This is just sloppy IMO and shouldn't happen.

2. When using ndbd, Data memory usage is reported first; when using ndbmtd, index memory is reported first. There's no reason I can see for this inconsistency. Furthermore, since the behaviour has always been to show the data memory usage first, we should continue to do so whether data nodes are single-threaded or multi-threaded.

3. In addition, as Martin has already noted, we should add the thread ID in the output.
[3 Mar 2010 10:32] Magnus Blåudd
Fixed by patch for bug#31542

Pushed to 6.3.33, 7.0.14 and 7.1.2-beta
[3 Mar 2010 10:36] Magnus Blåudd
Clarification
 - there is now only _one_ line per memory type per node.
 - the report is requested synchronously so there should be no ndb_mgm> prompt in the middle.
 - the order of index/data memory is now deterministic - the data memory report is  sent before index memory from Cmvmi.
[3 Mar 2010 13:49] Jon Stephens
Documented as follows in the NDB-6.3.33 changelog:

      The ndb_mgm client sometimes inserted extra ndb_mgm> prompts 
      within the output of the REPORT MEMORYUSAGE command.

Documented as follows in the NDB-7.0.14 and 7.1.2 changelogs:

      The following issues were fixed in the ndb_mgm client REPORT MEMORYUSAGE
      command:

        · The client sometimes inserted extra ndb_mgm> prompts within 
        the output.

        · For data nodes running ndbmtd, IndexMemory was reported 
        before DataMemory.

        · Also for data nodes running ndbmtd, there were multiple 
        IndexMemory entries in the output.

Reverted note in mysql-cluster-mgm-client-commands mentioned in a previous comment.

Closed.
[4 Mar 2010 14:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/102325
[4 Mar 2010 15:14] Bugs System
Pushed into 5.1.41-ndb-6.3.33 (revid:magnus.blaudd@sun.com-20100304143116-0ckqk1jief823upu) (version source revid:magnus.blaudd@sun.com-20100304143116-0ckqk1jief823upu) (merge vers: 5.1.41-ndb-6.3.33) (pib:16)
[4 Mar 2010 15:15] Bugs System
Pushed into 5.1.41-ndb-7.0.14 (revid:magnus.blaudd@sun.com-20100304143500-s7ofn4cdyh6d7wg0) (version source revid:magnus.blaudd@sun.com-20100304143500-s7ofn4cdyh6d7wg0) (merge vers: 5.1.41-ndb-7.0.14) (pib:16)
[5 Mar 2010 11:57] Jon Stephens
No additional changelog entry required; re-closed.