MySQL Bugs: #46560: Using ndbmt slow mysqld requests

Bug #46560	Using ndbmt slow mysqld requests
Submitted:	5 Aug 2009 5:17	Modified:	15 Oct 2009 12:23
Reporter:	Cyril SCETBON	Email Updates:
Status:	No Feedback	Impact on me:	None
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	mysql-5.1-telco-7.0	OS:	Linux (debian etch)
Assigned to:	Assigned Account	CPU Architecture:	Any
Tags:	mysql-5.1.34 ndb-7.0.6, mysqld, ndb, ndbmt, slow

Description:
Hi,

After upgrading from ndb-6.3.23 to ndb-7.0.6 we decided to use ndbmtd in place of ndbd. We hit a real issue after loading our data nodes: no mysqld could run even a trivial request, such as a create database, as fast as before. For example, after the load on our data node had dropped to zero, we tried to create a test database, just to check whether the load was the source of our problems, and we saw:

mysql> create database test;
Query OK, 1 row affected (1 min 18.61 sec)

FYI, we had the same issue with show tables.

How to repeat:
n/a

Suggested fix:
rolling restart of the data nodes and using ndbd in place of ndbmtd

I don't know if it's important information, but our architecture has 2 data nodes and 2 managers (one per data node). We also run mysqld on the data nodes to start bash scripts (the load is generated by these scripts).

config.ini, log files...
please give us something to work with

Log and configuration files (operations were started today after 4 AM)

Attachment: logNconfFiles01.7z (application/x-7z-compressed, text), 412.74 KiB.

Any news about it?

I can't unpack .7z,
so it stalled waiting for me to download a .7z unpacker.

Would it be possible to resend the configuration files packed as .tar.gz or .zip?

Thanks for the re-upload of the files, Cyril.

Hi,

Looking at your uploaded logs, I see nothing.
But it might be an upgrade issue. There are fixes for 6.3-to-7.0 upgrade bugs in
the upcoming 7.0 release.
I'll also repeat the upgrade procedure:
1) upgrade each ndb_mgmd (either all at once, or one at a time)
2) upgrade each ndbd (either...)
3) upgrade each mysqld/ndbapi (either...)

Also note:
1) that one cannot upgrade from 6.3 to ndbmtd directly.
   One must first upgrade to ndbd, and then one can rolling restart into ndbmtd.
2) not conforming to the procedure outlined above will lead to strange problems
   (i.e. we don't handle it very well)
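
The ordering above can be sketched as a small check over a planned restart sequence. This is only an illustration, assuming a plan is written as a list of node types; the helper name first_out_of_order and the ORDER table are hypothetical and not part of any MySQL Cluster tool.

```python
# Hypothetical helper: checks that a planned upgrade sequence follows
# the order ndb_mgmd -> ndbd -> mysqld/ndbapi described above.
ORDER = {"ndb_mgmd": 0, "ndbd": 1, "mysqld": 2}


def first_out_of_order(plan):
    """Return the index of the first step that violates the order, or None."""
    highest = 0
    for i, kind in enumerate(plan):
        rank = ORDER[kind]
        if rank < highest:
            # A node type from an earlier phase appears after a later phase
            # has already started, e.g. an ndbd after a mysqld.
            return i
        highest = rank
    return None


if __name__ == "__main__":
    good = ["ndb_mgmd", "ndb_mgmd", "ndbd", "ndbd", "mysqld"]
    bad = ["ndb_mgmd", "mysqld", "ndbd"]
    print(first_out_of_order(good))
    print(first_out_of_order(bad))
```

Within each phase the nodes may be upgraded all at once or one at a time, so the check only looks at the order of the phases.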

---

Setting this to "Need Feedback", because I didn't find anything in the logs
and don't have any clues.

/Jonas

But we started each ndbd with --initial!
Does that mean that even if we start each node with --initial, some data upgraded from 6.3 remains, maybe in our mysqld or somewhere else in our cluster, and can cause these errors?

I don't understand your comment.
Did you do 
1) a rolling restart, starting them "--initial"
2) or shutdown all ndbd nodes, restarting them "--initial"

If you did 1) then you must follow the procedure outlined.
If you did 2) then ndb_mgmd should still have been upgraded first.

Furthermore, you must make sure that *no* mysqld is upgraded to 7.0
before all datanodes have been upgraded.

---

We did 2), but I don't understand why you say our issue may be related to the upgrade. For me, starting with --initial is like rebuilding the whole cluster, that is to say like a new installation and not *really* an upgrade.

Hi,

The reason I was stuck on the upgrade problem was that you mentioned "upgrade" in your initial description, and that we have (had) quite a few upgrade problems (for online upgrades). Sorry about this, now I think I understand.

With this new insight, I looked at your config + outfiles and have the following guess:

You use "LockExecuteThreadToCPU=2".
When using ndbmtd, this means that all 6 threads will be bound to the same CPU, which will cause heavy contention...

So if you really have lots of CPUs, you can change the value to
LockExecuteThreadToCPU=1,4,5,6,7,8

or you can try to unset those values in your config...
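
To make the contention argument concrete, here is a minimal sketch that counts how many threads end up sharing a CPU for a given LockExecuteThreadToCPU value. The thread count of 6 comes from the comment above; the helper names and the assumption that threads are spread evenly over the distinct CPU IDs listed are mine, not confirmed ndbmtd behaviour.

```python
def parse_cpu_ids(value):
    """Parse a comma-separated CPU ID list such as '1,4,5,6,7,8'."""
    return [int(part) for part in value.split(",") if part.strip()]


def threads_per_cpu(cpu_ids, thread_count):
    """Worst-case number of threads sharing one CPU.

    Assumes (hypothetically) that threads are spread evenly over the
    distinct CPU IDs in the list.
    """
    distinct = len(set(cpu_ids))
    return -(-thread_count // distinct)  # ceiling division


if __name__ == "__main__":
    for value in ("2", "1,4,5,6,7,8"):
        ids = parse_cpu_ids(value)
        print(value, "->", threads_per_cpu(ids, 6), "thread(s) per CPU")
```

With a single CPU ID every thread competes for the same core, while a list with one ID per thread gives each thread its own core under this assumption.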

---

I'm not sure if this really is the problem, but I would certainly test it.

/Jonas

Thank you for the answer, I'll test it soon!

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".