Bug #51086 last deadlock detected not showing a valid reason for the deadlock
Submitted: 11 Feb 2010 11:11 Modified: 11 Feb 2010 16:49
Reporter: walid bakkar Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Locking Severity:S2 (Serious)
Version: 5.0.51a OS:Linux
Assigned to: CPU Architecture:Any

[11 Feb 2010 11:11] walid bakkar
Description:
(don't know if this happens on other versions/OS)
consider the following table model:
Table1: docid, rootkey, language
Table2: docid, extra fields

transaction 1 is doing:
DELETE FROM FIELD   WHERE docid IN ( SELECT docid FROM DOCUMENT WHERE ( rootkey = '326E8341A9F8DFA7769590A7D35276E4' AND language = 'en' ) )

transaction 2 is doing:
select (1) from DOCUMENT  where rootkey = 'C01D277153FC09FD55E3AAC423463145' and language = 'en' for update

the keys are different, the docids are different, affected rows are not common.

result is, latest deadlock shows transaction 1 not holding any locks, but waiting to acquire a lock

transaction 2 is holding a lock and waiting to acquire another one.

the fact that transaction 1 is holding no locks should make it clear that this isn't a deadlock and transaction 2 should proceed, before transaction 1 can continue.

and given that i'm accessing different rows from each transaction, why is transaction 1 trying to acquire a lock that is not related to it?
(the lock held by transaction 1 is clearly transaction 2 related, notice the key: C01D277153FC09FD55E3AAC4234631)

here is a full listing of the deadlock info:
LATEST DETECTED DEADLOCK
------------------------
100211 11:31:39
*** (1) TRANSACTION:
TRANSACTION 0 8727707, ACTIVE 0 sec, process no 6055, OS thread id 1185995088 starting index read
mysql tables in use 2, locked 2
LOCK WAIT 1388 lock struct(s), heap size 129008, undo log entries 9
MySQL thread id 201151, query id 84322140 test-app.elementn.berytech.org 10.0.0.67 autoforms preparing
DELETE FROM FIELD   WHERE docid IN ( SELECT docid FROM DOCUMENT WHERE ( rootkey = '326E8341A9F8DFA7769590A7D35276E4' AND language = 'en' ) order by docid asc )
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 7425775 n bits 168 index `PRIMARY` of table `autoforms/document` trx id 0 8727707 lock_mode X locks rec but not gap waiting
Record lock, heap no 99 PHYSICAL RECORD: n_fields 19; compact format; info bits 0
 0: len 4; hex 00002540; asc   %@;; 1: len 6; hex 000000852c8d; asc     , ;; 2: len 7; hex 00000040061030; asc    @  0;; 3: len 30; hex 433031443237373135334643303946443535453341414334323334363331; asc C01D277153FC09FD55E3AAC4234631;...(truncated); 4: len 4; hex 80000001; asc     ;; 5: len 4; hex 80000001; asc     ;; 6: len 2; hex 6172; asc ar;; 7: len 4; hex 80000001; asc     ;; 8: len 8; hex e30c67d6eaa51884; asc   g     ;; 9: len 5; hex 61646d696e; asc admin;; 10: len 8; hex 80000126bc5e8df3; asc    & ^  ;; 11: len 15; hex 70656e64696e67417070726f76616c; asc pendingApproval;; 12: len 5; hex 61646d696e; asc admin;; 13: len 8; hex 80000126bc5e8df3; asc    & ^  ;; 14: len 4; hex 00000001; asc     ;; 15: len 4; hex 00000003; asc     ;; 16: len 8; hex 8000000000000000; asc         ;; 17: len 8; hex 8000000000000000; asc         ;; 18: len 0; hex ; asc ;;

*** (2) TRANSACTION:
TRANSACTION 0 8727708, ACTIVE 0 sec, process no 6055, OS thread id 1185728848 starting index read, thread declared inside InnoDB 500
mysql tables in use 1, locked 1
6 lock struct(s), heap size 1216, undo log entries 1
MySQL thread id 201147, query id 84322165 test-app.elementn.berytech.org 10.0.0.67 autoforms Sending data
select (1) from DOCUMENT  where rootkey = 'C01D277153FC09FD55E3AAC423463145' and language = 'en' for update
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 7425775 n bits 168 index `PRIMARY` of table `autoforms/document` trx id 0 8727708 lock_mode X locks rec but not gap
Record lock, heap no 99 PHYSICAL RECORD: n_fields 19; compact format; info bits 0
 0: len 4; hex 00002540; asc   %@;; 1: len 6; hex 000000852c8d; asc     , ;; 2: len 7; hex 00000040061030; asc    @  0;; 3: len 30; hex 433031443237373135334643303946443535453341414334323334363331; asc C01D277153FC09FD55E3AAC4234631;...(truncated); 4: len 4; hex 80000001; asc     ;; 5: len 4; hex 80000001; asc     ;; 6: len 2; hex 6172; asc ar;; 7: len 4; hex 80000001; asc     ;; 8: len 8; hex e30c67d6eaa51884; asc   g     ;; 9: len 5; hex 61646d696e; asc admin;; 10: len 8; hex 80000126bc5e8df3; asc    & ^  ;; 11: len 15; hex 70656e64696e67417070726f76616c; asc pendingApproval;; 12: len 5; hex 61646d696e; asc admin;; 13: len 8; hex 80000126bc5e8df3; asc    & ^  ;; 14: len 4; hex 00000001; asc     ;; 15: len 4; hex 00000003; asc     ;; 16: len 8; hex 8000000000000000; asc         ;; 17: len 8; hex 8000000000000000; asc         ;; 18: len 0; hex ; asc ;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 7425775 n bits 168 index `PRIMARY` of table `autoforms/document` trx id 0 8727708 lock_mode X locks rec but not gap waiting
Record lock, heap no 98 PHYSICAL RECORD: n_fields 19; compact format; info bits 0
 0: len 4; hex 0000253f; asc   %?;; 1: len 6; hex 000000852c8d; asc     , ;; 2: len 7; hex 00000040061052; asc    @  R;; 3: len 30; hex 433031443237373135334643303946443535453341414334323334363331; asc C01D277153FC09FD55E3AAC4234631;...(truncated); 4: len 4; hex 80000001; asc     ;; 5: len 4; hex 80000001; asc     ;; 6: len 2; hex 656e; asc en;; 7: len 4; hex 80000001; asc     ;; 8: len 8; hex e30c67d6eaa51884; asc   g     ;; 9: len 5; hex 61646d696e; asc admin;; 10: len 8; hex 80000126bc5e8a8d; asc    & ^  ;; 11: len 15; hex 70656e64696e67417070726f76616c; asc pendingApproval;; 12: len 5; hex 61646d696e; asc admin;; 13: len 8; hex 80000126bc5e8a8d; asc    & ^  ;; 14: len 4; hex 00000001; asc     ;; 15: len 4; hex 00000003; asc     ;; 16: len 8; hex 8000000000000000; asc         ;; 17: len 8; hex 8000000000000000; asc         ;; 18: len 0; hex ; asc ;;

How to repeat:
document | CREATE TABLE `document` (
  `docid` int(10) unsigned NOT NULL auto_increment,
  `rootkey` varchar(32) NOT NULL,
  `latest` int(1) default NULL,
  `published` int(1) default NULL,
  `language` char(2) NOT NULL,
  `version_number` int(2) NOT NULL,
  `form_id` bigint(20) NOT NULL,
  `creator` varchar(128) NOT NULL,
  `creation_date` bigint(20) default NULL,
  `workflow_state` varchar(128) default NULL,
  `last_modified_by` varchar(128) default NULL,
  `last_modified_date` bigint(20) default NULL,
  `scn` int(10) unsigned default '1',
  `sharedSCN` int(10) unsigned default '1',
  `timer_date` bigint(20) default NULL,
  `timer_processed` bigint(20) default NULL,
  `timer_step` varchar(128) default NULL,
  PRIMARY KEY  (`docid`),
  UNIQUE KEY `rootkey` (`rootkey`,`language`,`version_number`),
  KEY `language_index` (`language`),
  KEY `latestkey_index` (`latest`),
  KEY `publishedkey_index` (`published`),
  KEY `form_index` (`form_id`),
  KEY `creator_index` (`creator`),
  KEY `creation_date_index` (`creation_date`),
  KEY `dockey_versionnumber_unique` (`rootkey`,`version_number`,`language`)
) ENGINE=InnoDB AUTO_INCREMENT=9559 DEFAULT CHARSET=utf8 |

FIELD | CREATE TABLE `field` (
  `docid` int(10) unsigned NOT NULL,
  `field_name` varchar(128) NOT NULL,
  `field_position` int(10) unsigned NOT NULL,
  `field_type` int(1) unsigned NOT NULL,
  `field_string` varchar(1024) default NULL,
  `field_numeric` double default NULL,
  `field_date` bigint(20) default NULL,
  `field_text` text,
  KEY `docid` (`docid`,`field_name`,`field_position`),
  KEY `docid_index` (`docid`),
  KEY `field_name_index` (`field_name`),
  KEY `field_type_index` (`field_type`),
  KEY `field_string_index` (`field_string`(255)),
  KEY `field_numeric_index` (`field_numeric`),
  KEY `field_date_index` (`field_date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

in my configuration, i have:
innodb_locks_unsafe_for_binlog=1
[11 Feb 2010 12:49] Valeriy Kravchuk
Thank you for the problem report. Do you have any triggers defined on the FIELD table?
[11 Feb 2010 13:03] walid bakkar
no, no triggers defined on FIELD and no triggers defined on DOCUMENT
[11 Feb 2010 13:21] Valeriy Kravchuk
Please, send the results of:

explain SELECT * FROM FIELD   WHERE docid IN ( SELECT docid FROM DOCUMENT WHERE ( rootkey =
'326E8341A9F8DFA7769590A7D35276E4' AND language = 'en' ) order by docid asc );

Note also this (http://dev.mysql.com/doc/refman/5.0/en/innodb-locks-set.html):

"If a secondary index is used in a search and index record locks to be set are exclusive, InnoDB also retrieves the corresponding clustered index records and sets locks on them."

Looks like this is the reason why X locks are set on the primary key of the DOCUMENT table that is used only in subquery of your DELETE in the first transaction.
[11 Feb 2010 13:27] walid bakkar
explain SELECT * FROM FIELD   WHERE docid IN ( SELECT docid FROM DOCUMENT WHERE ( rootkey = '326E8341A9F8DFA7769590A7D35276E4' AND language = 'en' ) order by docid asc );
+----+--------------------+----------+-----------------+------------------------------------------------------------+---------+---------+------+--------+-------------+
| id | select_type        | table    | type            | possible_keys                                              | key     | key_len | ref  | rows   | Extra       |
+----+--------------------+----------+-----------------+------------------------------------------------------------+---------+---------+------+--------+-------------+
|  1 | PRIMARY            | FIELD    | ALL             | NULL                                                       | NULL    | NULL    | NULL | 110035 | Using where | 
|  2 | DEPENDENT SUBQUERY | DOCUMENT | unique_subquery | PRIMARY,rootkey,language_index,dockey_versionnumber_unique | PRIMARY | 4       | func |      1 | Using where | 
+----+--------------------+----------+-----------------+------------------------------------------------------------+---------+---------+------+--------+-------------+
[11 Feb 2010 13:29] walid bakkar
(while i read the rest of your comment :) )
[11 Feb 2010 13:40] walid bakkar
plan already looks like it isn't doing the best query for this deletion, as it is starting by doing a:
+----+-------------+----------+------+----------------------------------------------------+---------+---------+-------------+------+------------------------------------------+
| id | select_type | table    | type | possible_keys                                      | key     | key_len | ref         | rows | Extra                                    |
+----+-------------+----------+------+----------------------------------------------------+---------+---------+-------------+------+------------------------------------------+
|  1 | SIMPLE      | DOCUMENT | ref  | rootkey,language_index,dockey_versionnumber_unique | rootkey | 104     | const,const |    1 | Using where; Using index; Using filesort | 
+----+-------------+----------+------+----------------------------------------------------+---------+---------+-------------+------+------------------------------------------+

on the inner query, and them applying the FIELD IN on the small result it will get back.

as for: 
"If a secondary index is used in a search and index record locks to be
set are exclusive, InnoDB also retrieves the corresponding clustered
index records and sets locks on them."

i'm not sure this is the case, as none of the records in the FIELD table should match both rows on the DOCUMENT table 
the docid in the document is unique, the field rows are identified by the docid.
so it sounds more like a transaction is setting locks on rows it might have encountered but not on the "corresponding" records only.
[11 Feb 2010 13:52] Valeriy Kravchuk
So, actually for DELETE server reads entire FIELD table and then search in the other table by PRIMARY key, docid, just to find (in all cases but one) that it does NOT correspond. Surely it sets too many X locks and can deadlock easily with that other SELECT ... FOR UPDATE, that access rows by unique index, but has to set X locks on PRIMARY key also.

Note this (from the same manual page I had quoted):

"A locking read, an UPDATE, or a DELETE generally set record locks on every index record that is scanned in the processing of the SQL statement. It does not matter whether there are WHERE conditions in the statement that would exclude the row."

To summarize: the reason for this deadlock is clear.
[11 Feb 2010 16:49] walid bakkar
thanks for the quick feedback on this.
so the problem now is that the deadlock is explainable, only because the wrong plan is picked for the execution of this query.
as the plan shows, the field table is going to be scanned, and the subquery will be executed for every row, instead of executing the subquery first, and then deleting, using the docid index.