Bug #26449 | Deadlock with ndbcluster engine and subqueries | ||
---|---|---|---|
Submitted: | 16 Feb 2007 15:49 | Modified: | 22 Jun 2007 15:16 |
Reporter: | Matteo Brusa | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
Version: | 5.0.33 | OS: | Linux (Debian etch (testing)) |
Assigned to: | | CPU Architecture: | Any |
Tags: | cluster, deadlock |
[16 Feb 2007 15:49]
Matteo Brusa
[16 Feb 2007 18:00]
Hartmut Holzgraefe
Yes, please provide an example; we won't be able to handle this bug without one.
[19 Feb 2007 13:28]
Matteo Brusa
I created a set of data which causes the system to crash; it is in the crash8.sql file. To replicate the crash, execute the following query in 2 shells:

while echo "select *, (select count(*) from tasksqueue as tq where tq.jobs_id=jobs.id and status=0) as queued from jobs left join tasksqueue on tasksqueue.jobs_id=jobs.id" | mysql test ; do true;done

I tried to remove some more uninteresting fields from the table "tasksqueue", but as soon as I remove some, the problem disappears. Weird.

As soon as the system hangs, "mysqladmin processlist" shows the "Time" field of the 2 queries increasing. I had to kill the mysqld processes with -9; they don't respond to a normal kill signal.
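The loop above runs until someone notices the hang by hand. A minimal sketch of the same two-shell reproduction with a per-query deadline follows. The helper names (run_query_once, worker, repro) and the 30-second deadline are illustrative assumptions, not part of the report, and the script assumes GNU coreutils `timeout`, which exits with status 124 when the deadline is exceeded.

```shell
#!/bin/sh
# Query copied verbatim from the report above.
QUERY='select *, (select count(*) from tasksqueue as tq where tq.jobs_id=jobs.id and status=0) as queued from jobs left join tasksqueue on tasksqueue.jobs_id=jobs.id'

# Hypothetical helper: run one command under a deadline and classify the
# outcome as ok, hang, or error. Assumes GNU coreutils `timeout`
# (exit status 124 means the deadline was exceeded).
run_query_once() {
    deadline=$1
    shift
    status=0
    timeout "$deadline" "$@" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        echo ok
    elif [ "$status" -eq 124 ]; then
        echo hang
    else
        echo error
    fi
}

# Hypothetical worker: repeat the query until it fails or hangs.
# $1 = worker label, $2 = deadline in seconds (an arbitrary choice).
worker() {
    while :; do
        result=$(echo "$QUERY" | run_query_once "$2" mysql test)
        if [ "$result" != ok ]; then
            echo "worker $1: $result"
            return 1
        fi
    done
}

# Equivalent of running the loop in 2 shells at once.
repro() {
    worker 1 30 &
    worker 2 30 &
    wait
}
```

Calling repro after loading crash8.sql into the test database starts both workers; a line such as "worker 1: hang" would indicate that a query exceeded the deadline. Note that the deadline only stops the mysql client; according to the report, the mysqld processes themselves still had to be killed with -9.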
[19 Feb 2007 13:28]
Matteo Brusa
set of data to replicate the problem
Attachment: crash8.sql (application/octet-stream, text), 3.40 KiB.
[19 Feb 2007 23:05]
Hartmut Holzgraefe
I ran the test case with all nodes on the same machine for about an hour without problems; I will now try in a distributed setup ...
[20 Feb 2007 0:19]
Hartmut Holzgraefe
I haven't been able to reproduce this in a 4-machine setup either. I was running:
- the management host on machine 1
- one data node each on machine 2 and 3
- mysqld and two shells running the sample loop on machine 3

All machines have 2 dual-core CPUs, so even with mysqld and 2 mysql clients all running on one machine there should still be full parallelism of these processes.

Could you explain your cluster setup in more detail and provide your config.ini so that we might try to reproduce it more closely?
[20 Feb 2007 9:08]
Matteo Brusa
Hardware:
- Node 1: Dual processor Intel(R) Xeon(TM) CPU 2.80GHz family 15 model 4 stepping 3 with hyperthreading, 1Gb memory
- Node 2: Single processor Intel(R) Xeon(R) CPU 5110 @ 1.60GHz family 6 model 15 stepping 6 with hyperthreading, 2Gb memory
- Node 3: Dual processor Intel(R) Xeon(TM) MP CPU 2.00GHz family 15 model 2 stepping 5 with hyperthreading, 2Gb memory

All nodes are running Debian testing (etch).
On node 2, mysql is installed from the Debian package mysql-server-5.0 (5.0.32-3).
On node 3, mysql is installed from sources (5.0.33), compiled as:

CFLAGS="-O3" CXX=gcc CXXFLAGS="-O3 -felide-constructors \
-fno-exceptions -fno-rtti" ./configure \
--prefix=/usr/local/mysql --enable-assembler \
--with-mysqld-ldflags=-all-static --with-ndbcluster

As you can see from config.ini, the two machines are also connected with a crossed cable. Let me know if you need any more info or clarification.
[20 Feb 2007 9:08]
Matteo Brusa
config.ini
Attachment: config.ini (application/octet-stream, text), 660 bytes.
[21 Feb 2007 12:19]
Matteo Brusa
To avoid any possible problem, I installed on node 2 the same mysql as on node 3, 5.0.33 from sources. The problem persists. If there's any debug output I could provide, or any compilation flag I could test, please let me know.
[15 May 2007 5:18]
Tomas Ulin
Matteo, we're guessing this might be blob related. Can you try changing the Blob/Text columns to Varbinary/Varchar and see if you still see the problem? BR, Tomas
[18 May 2007 8:06]
Matteo Brusa
Hi, I changed the queries some time ago to avoid all those joins. The system is now in production, so I cannot test your suggestion. Thanks anyway.