Bug #46560 Using ndbmt slow mysqld requests
Submitted: 5 Aug 2009 5:17 Modified: 15 Oct 2009 12:23
Reporter: Cyril SCETBON
Status: No Feedback
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S2 (Serious)
Version:mysql-5.1-telco-7.0 OS:Linux (debian etch)
Assigned to: Assigned Account CPU Architecture:Any
Tags: mysql-5.1.34 ndb-7.0.6, mysqld, ndb, ndbmt, slow

[5 Aug 2009 5:17] Cyril SCETBON
Description:
Hi,

After upgrading from ndb-6.3.23 to ndb-7.0.6, we decided to use ndbmtd in place of ndbd. We hit a real issue after loading our data nodes: no mysqld could run even a trivial request, such as CREATE DATABASE, as fast as it did before. For example, after the load on our data nodes had dropped to zero, we tried to create a test database, just to check whether the load was the source of our problems, and we saw:

mysql> create database test;
Query OK, 1 row affected (1 min 18.61 sec)

FYI, we had the same issue with SHOW TABLES.

How to repeat:
n/a

Suggested fix:
rolling restart of the data nodes and using ndbd in place of ndbmtd
[5 Aug 2009 5:40] Cyril SCETBON
I don't know if this is important information, but in our architecture we have 2 data nodes with 2 managers (one per data node). We also run mysqld on the data nodes to start bash scripts (the load is generated by these scripts).
[5 Aug 2009 6:37] Jonas Oreland
config.ini, log files...
please give us something to work with
[5 Aug 2009 6:53] Cyril SCETBON
Log and configuration files (operations were started today after 4 AM)

Attachment: logNconfFiles01.7z (application/x-7z-compressed, text), 412.74 KiB.

[10 Aug 2009 13:05] Cyril SCETBON
Any news about this?
[10 Aug 2009 13:25] Jonas Oreland
I can't unpack .7z,
so this stalled waiting for me to download a .7z unpacker.
[17 Aug 2009 13:34] Jørgen Austvik
Would it be possible to resend the configuration files packed as .tar.gz or .zip?
[17 Aug 2009 14:12] Jørgen Austvik
Thanks for the re-upload of the files, Cyril.
[20 Aug 2009 6:08] Jonas Oreland
Hi,

Looking at your uploaded logs, I see nothing.
But it might be an upgrade issue. Several 6.3-to-7.0 upgrade bugs are fixed
in the upcoming 7.0 release.
Also, I'll repeat the upgrade procedure:
1) upgrade each ndb_mgmd (either all at once, or one at a time)
2) upgrade each ndbd (either...)
3) upgrade each mysqld/ndbapi (either...)

Also note:
1) One cannot upgrade from 6.3 to ndbmtd directly.
   One must upgrade to ndbd first, and then one can rolling restart into ndbmtd.
2) Not conforming to the procedure outlined above will lead to strange problems
   (i.e. we don't handle it very well).
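
As a rough illustration of the ordering described above, the following Python sketch checks a planned sequence of upgrade steps. It is a hypothetical helper, not part of any MySQL Cluster tooling: the function name and the list-of-strings input are assumptions made for this example, and it only encodes the order ndb_mgmd, then ndbd, then mysqld/ndbapi.

```python
# Hypothetical helper (not a MySQL Cluster tool): verifies that a planned
# upgrade sequence follows the order ndb_mgmd -> ndbd -> mysqld/ndbapi.
ORDER = {"ndb_mgmd": 0, "ndbd": 1, "mysqld": 2}


def first_out_of_order(steps):
    """Return the index of the first step that breaks the order, or None."""
    highest = 0  # highest rank seen so far
    for i, kind in enumerate(steps):
        rank = ORDER[kind]
        if rank < highest:
            # An earlier-stage process is upgraded after a later-stage one.
            return i
        highest = rank
    return None


good = ["ndb_mgmd", "ndb_mgmd", "ndbd", "ndbd", "mysqld"]
bad = ["ndb_mgmd", "mysqld", "ndbd", "ndbd"]  # a mysqld upgraded too early

print(first_out_of_order(good))
print(first_out_of_order(bad))
```

The check is deliberately flat: upgrading all nodes of one type at once or one at a time both pass, as long as no later-stage process is upgraded before an earlier-stage one.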

---

Setting this to "Need Feedback", because I didn't find anything in the logs
and don't have any clues.

/Jonas
[20 Aug 2009 7:20] Cyril SCETBON
But we started each ndbd with --initial!
Does this mean that, even if we start each node with --initial, some data carried over from 6.3, maybe in our mysqld or somewhere else in our cluster, can cause these errors?
[20 Aug 2009 7:42] Jonas Oreland
I don't understand your comment.
Did you do 
1) a rolling restart, starting them "--initial"
2) or shutdown all ndbd nodes, restarting them "--initial"

If you did 1), then you must follow the procedure outlined.
If you did 2), then ndb_mgmd should still have been upgraded first.

Furthermore, you must make sure that *no* mysqld is upgraded to 7.0
before all datanodes have been upgraded.

---
[20 Aug 2009 7:59] Cyril SCETBON
We did 2), but I don't understand why you say our issue may be related to the upgrade. To me, starting with --initial is like rebuilding the whole cluster, that is, like a new installation and not *really* an upgrade.
[4 Sep 2009 13:10] Jonas Oreland
Hi,

The reason I was stuck on the upgrade problem is that you mentioned "upgrade" in your initial description, and that we have (had) quite a few upgrade problems (for online upgrades). Sorry about this; now I think I understand.

With this new insight, I looked at your config + outfiles and have the following guess:

You use "LockExecuteThreadToCPU=2".
With ndbmtd, this means that all 6 threads will be bound to the same CPU, which will cause heavy contention...

So if you really have lots of CPUs, you can change the value to
LockExecuteThreadToCPU=1,4,5,6,7,8

or you can try to unset those values in your config...
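
To make the contention argument concrete, here is a small Python sketch that counts how many execution threads would end up on each CPU for a given LockExecuteThreadToCPU value. The function name is hypothetical, and the round-robin spread over the listed CPU IDs is an assumption made purely for illustration; the actual ndbmtd assignment policy may differ.

```python
from collections import Counter


def threads_per_cpu(lock_value, n_threads):
    """Map each CPU ID to the number of execution threads bound to it.

    Assumption for illustration only: threads are spread round-robin over
    the listed CPU IDs. The real ndbmtd assignment policy may differ.
    """
    cpus = [int(c) for c in lock_value.split(",") if c.strip()]
    if not cpus:
        raise ValueError("no CPU IDs given")
    return Counter(cpus[i % len(cpus)] for i in range(n_threads))


# A single CPU ID: all six threads compete for CPU 2.
print(threads_per_cpu("2", 6))

# Six CPU IDs: each thread gets its own CPU.
print(threads_per_cpu("1,4,5,6,7,8", 6))
```

Under that assumption, any CPU with a count above 1 is a candidate for the kind of contention guessed at here.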

---

I'm not sure if this really is the problem, but I would certainly test it.

/Jonas
[4 Sep 2009 14:44] Cyril SCETBON
Thank you for the answer. I'll test it soon!
[15 Oct 2009 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".