Bug #42069 Foreign keys: hang if two connections
Submitted: 12 Jan 2009 23:04 Modified: 24 Feb 2009 0:40
Reporter: Peter Gulutzan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: General Severity:S3 (Non-critical)
Version:6.1.0-alpha-debug OS:Linux (SUSE 10.0 / 32-bit)
Assigned to: Dmitry Lenev CPU Architecture:Any

[12 Jan 2009 23:04] Peter Gulutzan
Description:
I'm using mysql-6.1-fk.
I start the server with mysqld --foreign-key-all-engines=1.

I start two client connections.
I create a primary-key table and foreign-key table.
I do some inserts and updates and commits.
I soon see that both connections hang.

Always, with SHOW PROCESSLIST, I see that one
connection state is 'Updating' + UPDATE, the
other is 'Table lock' + INSERT.

How to repeat:
Start up three connections. Call them t1 and t2 and t3.

set @@autocommit=0;
set global falcon_lock_wait_timeout=10000;
set global table_lock_wait_timeout=10000;
drop table t2,t1;
drop procedure p;
create table t1 (s1 int primary key) engine=falcon;
create table t2 (s1 int primary key references t1 (s1) on update cascade) engine=falcon;
delimiter //
create procedure p ()
begin
  declare v int default 0;
  while v < 100000 do
    select v;
    set @v2 = rand() * 100000;
    begin
      declare continue handler for sqlexception begin end;
      insert into t1 values (@v2);
      insert into t2 values (@v2);
      end;
    begin
      declare continue handler for sqlexception begin end;
      update t2 set s1 = s1 - 1;
      end;
    if v mod 11 then
      commit;
      end if;
    set v = v + 1;
    end while;
  end//
delimiter ;
call p();

On t2, say:
set @@autocommit=1;
call p();

Wait till t1 hangs and t2 hangs.
Usually this happens in less than two minutes.

On t3, say:
show processlist;

A typical example of output from t3 is:
mysql> show processlist;
+----+------+-----------+------+---------+------+------------+-----------------------------+
| Id | User | Host      | db   | Command | Time | State      | Info                        |
+----+------+-----------+------+---------+------+------------+-----------------------------+
|  1 | root | localhost | test | Query   |  553 | Updating   | update t2 set s1 = s1 - 1   |
|  2 | root | localhost | test | Query   |  553 | Table lock | insert into t1 values (@v2) |
|  3 | root | localhost | NULL | Query   |    0 | NULL       | show processlist            |
+----+------+-----------+------+---------+------+------------+-----------------------------+
3 rows in set (0.00 sec)

Consistently I see that one connection says
State = 'Updating', Info = 'update t2 ...';
while the other connection says
State = 'Table lock', Info = 'insert into t1 ...';
[12 Jan 2009 23:37] MySQL Verification Team
Thank you for the bug report. Verified as descrbed:

t3>show processlist\G
*************************** 1. row ***************************
     Id: 5
   User: root
   Host: localhost:1161
     db: d1
Command: Query
   Time: 18
  State: Table lock
   Info: insert into t1 values (@v2)
*************************** 2. row ***************************
     Id: 7
   User: root
   Host: localhost:1163
     db: d1
Command: Query
   Time: 18
  State: Updating
   Info: update t2 set s1 = s1 - 1
*************************** 3. row ***************************
     Id: 8
   User: root
   Host: localhost:1164
     db: d1
Command: Query
   Time: 0
  State: NULL
   Info: show processlist
3 rows in set (0.00 sec)

t3>
[14 Jan 2009 14:32] Dmitry Lenev
Hello Peter!

After analyzing this bug-report I think that this problem is not specific to the new foreign keys. It is perfectly repeatable in --foreign-key-all-engines=0 mode. Here is the script for mysqltest which demonstrates the same problem (can be confirmed by inspecting stacktraces in both cases):

--source include/have_falcon.inc
--source include/have_log_bin.inc

set storage_engine= Falcon;
let $engine_type= Falcon;

set @@autocommit=0;
set global falcon_lock_wait_timeout=10000;
set global table_lock_wait_timeout=10000;
create table t1 (s1 int primary key);
create table t2 (s1 int primary key);
insert into t1 values (1);
insert into t2 values (1);
commit;
begin;
update t1 set s1= s1 + 1;

connect (addconroot1, localhost, root,,);
--send insert into t1 values (1) on duplicate key update s1= 10 + (select count(*) from t2);

connection default;
insert into t2 values (2);
commit;

connection addconroot1;
--reap

-------

IMO problem which we observe here stems from the same fact as bug #41688 i.e.
the fact that Falcon engine doesn't downgrade TL_READ_NO_INSERT locks even
though it downgrades TL_WRITE locks...
[19 Jan 2009 8:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/63523

2694 Dmitry Lenev	2009-01-19
      Tentative fix for:
       Bug #41688 "Foreign keys: lock warnings".
       Bug #42069 "Foreign keys: hang if two connections".
       Bug #42147 "Concurrent DML and LOCK TABLE ... READ for
                   InnoDB table cause warnings in errlog".
      
       For Falcon table concurrent execution of statement which in
       general case should acquire TL_READ_NO_INSERT lock on the table
       (e.g. multi-update, DML with subqueries, or statements involving
       new foreign key checks) and statement modifying this table might
       have led to warnings in error log (bug #41688) and even to
       deadlocks (when several tables and transactions were involved - 
       bug #42069).
      
       Also one might have got these warnings when tried to execute 
       LOCK TABLES ... READ concurrently with any DML statement changing 
       table mentioned in it for any engine which downgrades write
       locks, e.g. InnoDB (bug #42147).
      
       All these problems were caused either by one of or combination
       of the following two factors:
       - First, Falcon didn't downgrade TL_READ_NO_INSERT locks even 
         although it was downgrading TL_WRITE to TL_WRITE_ALLOW_WRITE 
         for one of tables in the same statement. This created 
         possibility of deadlock between Falcon row-level and server
         table-level locks (as in this case it was possible for some
         statement to be waiting on some row-level lock while holding
         table-level lock which prevented owner of the former from 
         proceeding). This also made statements using Falcon tables
         especially susceptible to the second factor.
       - Second, our table-level locking subsystem were erroneously
         allowing to acquire TL_READ_NO_INSERT lock on the table
         even if there was another thread having TL_WRITE_ALLOW_WRITE
         on it. So in debug build this led to warning popping up
         in error log. It probably didn't cause more severe consequences 
         since engines who downgrade write locks are likely to implement 
         its own additional support for table locks in order to ensure
         that they also take into account all row-locks acquired by
         transactions (e.g. see InnoDB).
      
       This fix deals with the first issue by changing Falcon in such way 
       that it downgrades locks of type TL_READ_NO_INSERT to TL_READ
       (unless LOCK TABLES is being executed). This is safe since Falcon
       as any row-locker/MVCC engine has its own mechanism for handling
       concurrent changes. It should also increase concurrency in certain
       scenarios.
      
       The second problem is handled by fixing thr_lock.c code to disallow
       granting TL_READ_NO_INSERT if there is active TL_WRITE_ALLOW_WRITE
       lock.
      
       Questions for reviewers are marked by QQ.
      
       QQ: I think it makes sense to push this fix into 6.0.
           But what about earlier versions of server ?
[5 Feb 2009 8:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65297

2719 Dmitry Lenev	2009-02-05
      Fix for bug #41688 "Foreign keys: lock warnings" and bug #42069
      "Foreign keys: hang if two connections".
      
      For Falcon table concurrent execution of statement which in       
      general case should acquire TL_READ_NO_INSERT lock on the table
      (e.g. multi-update, DML with subqueries, or statements involving    
      new foreign key checks) and statement modifying this table might       
      have led to warnings in error log (bug #41688) and even to  
      deadlocks (when several tables and transactions were involved -
      bug #42069).
      
      Both these problems were caused either by combination of the
      following two factors:
      - First, Falcon didn't downgrade TL_READ_NO_INSERT locks even
        although it was downgrading TL_WRITE to TL_WRITE_ALLOW_WRITE
        for one of tables in the same statement. This created
        possibility of deadlock between Falcon row-level and server
        table-level locks (as in this case it was possible for some
        statement to be waiting on some row-level lock while holding
        table-level lock which prevented owner of the former from
        proceeding). This also made statements using Falcon tables
        especially susceptible to the second factor.
      - Second, our table-level locking subsystem were erroneously
        allowing to acquire TL_READ_NO_INSERT lock on the table
        even if there was another thread having TL_WRITE_ALLOW_WRITE
        on it (while assuming that TL_READ_NO_INSERT is incompatible
        with TL_WRITE_ALLOW_WRITE). So in debug build this led to
        warning popping up in error log. It probably didn't cause
        more severe consequences since engines who downgrade write
        locks are likely to implement its own additional support for
        table locks in order to ensure that they also take into account
        all row-locks acquired by transactions (e.g. see InnoDB).
      
      This fix deals with the first issue by changing Falcon in such way
      that it downgrades locks of type TL_READ_NO_INSERT to TL_READ
      (unless LOCK TABLES is being executed). This is safe since Falcon
      as any row-locker/MVCC engine has its own mechanism for handling
      concurrent changes. It should also increase concurrency in certain
      scenarios.
      
      Fixing the second factor will require much more intrusive changes
      and will be done as part of work on bug 42147 "Concurrent DML and
      LOCK TABLE ... READ for InnoDB table cause warnings in errlog".
[5 Feb 2009 11:45] Dmitry Lenev
Fix for this bug was queued into mysql-6.0-runtime and will be merged eventually into the main and mysql-6.1-fk trees.
[13 Feb 2009 7:25] Bugs System
Pushed into 6.0.10-alpha (revid:alik@sun.com-20090211182317-uagkyj01fk30p1f8) (version source revid:alik@sun.com-20090211182317-uagkyj01fk30p1f8) (merge vers: 6.0.10-alpha) (pib:6)
[24 Feb 2009 0:40] Paul DuBois
Noted in 6.0.10 changelog.

For Falcon tables, concurrent execution of a statement which in
the general case should acquire a TL_READ_NO_INSERT lock on the table
(for example multiple-table update, DML with subqueries, or
statements involving new foreign key checks) and a statement that
modifies the table might lead to warnings in the error log or even to
deadlocks.