Bug #49206 Deadlock between BACKUP, the locking thread and DDL DML
Submitted: 30 Nov 2009 12:00 Modified: 9 Dec 2009 10:37
Reporter: Philip Stoev Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Backup Severity:S2 (Serious)
Version:6.0-codebase-bugfixing OS:Any
Assigned to: Philip Stoev CPU Architecture:Any

[30 Nov 2009 12:00] Philip Stoev
Description:
When executing a workload involving concurrent BACKUP/RESTORE and DDL statements, the server deadlocked with the following processlist:

*************************** 10. row ***************************
     Id: 29
   User: root
   Host: localhost:38485
     db: test
Command: Killed
   Time: 910
  State: Locking thread: waiting until locking thread is done
   Info: BACKUP DATABASE testdb_N TO '/tmp/gentest5420-1259581306.08185-0.47572040623573.tmp'
*************************** 11. row ***************************
     Id: 71
   User: system user
   Host:
     db: NULL
Command: Daemon
   Time: 910
  State: MyISAM backup: holding table locks
   Info: MyISAM driver locking thread

the rest of the threads are doing CREATE, DML and LOCK TABLES. KILL-ing BACKUP did not cause the deadlock to be resolved. The backtraces are:

#0  0x000000315b00b309 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000114e54b in safe_cond_wait (cond=0x7fb45c143a30, mp=0x7fb45c143990, file=0x158dd8d "be_thread.cc", line=355) at thr_mutex.c:423
#2  0x000000000132f946 in Locking_thread_st::wait_until_locking_thread_dies (this=0x7fb45c1438c0) at be_thread.cc:355
#3  0x000000000132be59 in ~Backup (this=0x7fb45c1437f0) at be_default.cc:155
#4  0x000000000132d2d1 in default_backup::Backup::free (this=0x7fb45c1437f0) at ../../sql/backup/be_default.h:79
#5  0x00000000013234c2 in ~Backup_pump (this=0x7fb45c121c00) at data_backup.cc:1173
#6  0x000000000132822b in ~Pump (this=0x7fb45c121c00) at data_backup.cc:301
#7  0x0000000001323c23 in backup::Scheduler::remove_pump (this=0x7fb4600f6ff0, p=@0x7fb4600f6f80) at data_backup.cc:1052
#8  0x0000000001324df3 in backup::Scheduler::step (this=0x7fb4600f6ff0) at data_backup.cc:938
#9  0x0000000001326f2c in backup::write_table_data (thd=0x39f7df8, info=@0x7fb45c0f52b0, s=@0x7fb45c0d5be0) at data_backup.cc:713
#10 0x000000000130f97e in Backup_restore_ctx::do_backup (this=0x7fb4600f71e0) at kernel.cc:1365
#11 0x0000000001316544 in execute_backup_command (thd=0x39f7df8, lex=0x39f96c8, backupdir=0x7fb4600f8160, overwrite=false, skip_gap_event=false)
    at kernel.cc:261
#12 0x0000000000803c32 in mysql_execute_command (thd=0x39f7df8) at sql_parse.cc:2476
#13 0x00000000008159a9 in mysql_parse (thd=0x39f7df8,
    inBuf=0x38f5900 "BACKUP DATABASE testdb_N TO '/tmp/gentest5420-1259581306.08185-0.47572040623573.tmp'", length=84, found_semicolon=0x7fb4600f8ef0)
    at sql_parse.cc:5975
#14 0x00000000008173f8 in dispatch_command (command=COM_QUERY, thd=0x39f7df8,
    packet=0x3a465d9 "BACKUP DATABASE testdb_N TO '/tmp/gentest5420-1259581306.08185-0.47572040623573.tmp' ", packet_length=85) at sql_parse.cc:1076
#15 0x000000000081a358 in do_command (thd=0x39f7df8) at sql_parse.cc:758
#16 0x00000000007f0ef6 in handle_one_connection (arg=0x39f7df8) at sql_connect.cc:1164
#17 0x000000315b0073da in start_thread () from /lib64/libpthread.so.0
#18 0x000000315a4e627d in clone () from /lib64/libc.so.6

#0  0x000000315b00b309 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000114e54b in safe_cond_wait (cond=0x7fb45c132bb8, mp=0x1c4e3e0, file=0x14fec00 "myisam_backup_engine.cc", line=871) at thr_mutex.c:423
#2  0x0000000000f289d4 in myisam_backup::Backup::lock_tables_TL_READ_NO_INSERT (this=0x7fb45c132970) at myisam_backup_engine.cc:871
#3  0x0000000000f28d0b in myisam_backup_separate_thread_for_locking (arg=0x7fb45c132970) at myisam_backup_engine.cc:901
#4  0x000000315b0073da in start_thread () from /lib64/libpthread.so.0
#5  0x000000315a4e627d in clone () from /lib64/libc.so.6

#0  0x000000315b00b309 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000000114e54b in safe_cond_wait (cond=0x1a488c0, mp=0x1a47de0, file=0x13d9340 "lock.cc", line=1350) at thr_mutex.c:423
#2  0x00000000007c122c in wait_if_global_read_lock (thd=0x38ae528, abort_on_refresh=false, is_not_commit=true) at lock.cc:1350
#3  0x000000000080b26e in mysql_execute_command (thd=0x38ae528) at sql_parse.cc:3643
#4  0x00000000008159a9 in mysql_parse (thd=0x38ae528, inBuf=0x38d5f60 "LOCK TABLES testdb_S . t1_merge2_S  AS x  WRITE", length=47,
    found_semicolon=0x7fb4601fdef0) at sql_parse.cc:5975
#5  0x00000000008173f8 in dispatch_command (command=COM_QUERY, thd=0x38ae528, packet=0x38c6149 "LOCK TABLES testdb_S . t1_merge2_S  AS x  WRITE",
    packet_length=47) at sql_parse.cc:1076
#6  0x000000000081a358 in do_command (thd=0x38ae528) at sql_parse.cc:758
#7  0x00000000007f0ef6 in handle_one_connection (arg=0x38ae528) at sql_connect.cc:1164
#8  0x000000315b0073da in start_thread () from /lib64/libpthread.so.0
#9  0x000000315a4e627d in clone () from /lib64/libc.so.6

How to repeat:
A reproducible test case will be uploaded as soon as it is available. In the meantime, a core file and a binary will be uploaded for this bug.

The core and the binary are from:

revision-id: holyfoot@mysql.com-20091125123004-ylmxn4e1lckv1o2b
date: 2009-11-25 16:30:04 +0400
build-date: 2009-11-30 14:00:18 +0200
revno: 3737
branch-nick: 6.0-codebase-bugfixing
[30 Nov 2009 12:19] Philip Stoev
Core and binary : http://mysql-systemqa.s3.amazonaws.com/var-bug49206.zip
[30 Nov 2009 12:39] Rafal Somla
This might be fixed by BUG#48930 - currently queued for push to 6.0-codebase.
[9 Dec 2009 10:38] Philip Stoev
Not repeatable with 6.0-backup . Sorry for using 6.0-codebase-bugfixing for testing in first place, it was not intentional.