MySQL Bugs: #55611: GCP stop was detected NDBCNTR (Line: 249) Solaris10

Bug #55611	GCP stop was detected NDBCNTR (Line: 249) Solaris10
Submitted:	28 Jul 2010 16:34	Modified:	26 Sep 2010 14:16
Reporter:	Ciccio R
Status:	No Feedback
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	mysql-5.1-telco-6.3	OS:	Solaris
Assigned to:	Assigned Account	CPU Architecture:	Any
Tags:	cluster, GCP, mysql-5.1.30 ndb-6.3.20-GA, ndb

Description:
I have a single cluster running on one Solaris 10 server.

The config.ini follows:

[tcp default]
SendBufferMemory=2M
ReceiveBufferMemory=2M

[ndbd default]
NoOfReplicas= 1
DataDir= /var/mysqlcluster/ndb
HeartbeatIntervalDbDb=5000
HeartbeatIntervalDbApi=5000
DataMemory=3500M
IndexMemory=500M
DiskPageBufferMemory=1G
TransactionBufferMemory=2G
MaxNoOfConcurrentOperations=1000000
TimeBetweenGlobalCheckpoints=5000
DiskCheckpointSpeed=10M
DiskCheckpointSpeedInRestart=100M
NoOfFragmentLogFiles=100
TransactionDeadlockDetectionTimeout=5000
MaxNoOfTables=1024

[ndb_mgmd]
Hostname= 10.20.67.215
datadir=/var/mysqlcluster/mgmd/
ArbitrationRank=1
[ndbd]
Hostname= 10.20.67.215
[mysqld]
Hostname= 10.20.67.215

My table contains the following numbers of rows:

mysql> select Result, count(*) from Clients group by Result;
+-----------------+----------+
| Result          | count(*) |
+-----------------+----------+
|               1 |    61010 |
|               2 |     5317 |
|               5 |   236990 |
+-----------------+----------+
3 rows in set (12.20 sec)

If I run 

mysql> update Clients set Result = 10 where Result = 5;

I get the error:

ERROR 1297 (HY000): Got temporary error 4010 'Node failure caused abort of transaction' from NDBCLUSTER

I noticed that the same update succeeds when it affects fewer than 100K rows, so the number of rows involved in the update seems to be what causes the error.

The error reported by ndb follows:

Time: Wednesday 28 July 2010 - 16:12:52
Status: Temporary error, restart node
Message: System error, node killed during node restart by other node (Internal error, programming error or missing error message, please report a bug)
Error: 2303
Error data: Node 2 killed this node because GCP stop was detected
Error object: NDBCNTR (Line: 249) 0x0000000a
Program: ndbd
Pid: 15526
Trace: /home/mysqlcluster/ndb/ndb_2_trace.log.7
Version: mysql-5.1.30 ndb-6.3.20-GA
***EOM***

How to repeat:

Load a table with many rows (more than 100K) and try to update more than 100K rows using a single UPDATE command.

Global checkpoint stop errors can be caused by two different situations:

* disk subsystem too slow so that writing a global checkpoint takes too long
* large transaction touching many rows takes too long to commit, thus delaying the start of the next global checkpoint for too long.

This looks like the 2nd case, but to be sure we'd need to see the ndb_2_out.log file.

Modifying that many rows in a single transaction is not a good idea with MySQL Cluster anyway, see also the last item on http://dev.mysql.com/doc/refman/5.0/en/mysql-cluster-limitations-transactions.html

You should run your update as

  repeat
    UPDATE Clients SET Result = 10 WHERE Result = 5 LIMIT 10000;
  until no more rows affected;
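
The repeat/until loop above could be driven from a client script. The following is only a minimal sketch in Python, assuming a DB-API 2.0 connection from a MySQL driver that uses the %s parameter style (for example PyMySQL or mysqlclient); the function name update_in_batches and the batch_size parameter are hypothetical and not part of MySQL or any driver. It commits after every batch so that no single transaction touches more than batch_size rows.

```python
def update_in_batches(conn, batch_size=10000):
    """Run the UPDATE in chunks until no more rows are affected.

    conn is assumed to be a DB-API 2.0 connection to the MySQL server.
    Returns the total number of rows updated.
    """
    total = 0
    cur = conn.cursor()
    try:
        while True:
            # Each statement changes at most batch_size rows.
            cur.execute(
                "UPDATE Clients SET Result = 10 WHERE Result = 5 LIMIT %s",
                (batch_size,),
            )
            affected = cur.rowcount
            # Commit every batch so each transaction stays small.
            conn.commit()
            # rowcount is 0 when nothing matched; some drivers may report -1.
            if affected <= 0:
                break
            total += affected
    finally:
        cur.close()
    return total
```

The loop ends because updated rows no longer match WHERE Result = 5, so each pass works on the remaining rows only. The batch size of 10000 is taken from the suggestion above; a suitable value for a given cluster is an open question.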

Thanks for the useful information.

I have attached the file you requested.

Regards
Francesco

Requested file

Attachment: ndb_2_out.log (application/octet-stream, text), 58.18 KiB.

This is quite likely http://bugs.mysql.com/bug.php?id=51512
Can you please test a version in which that bug has been fixed?

Setting status to "Need feedback"

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".