Bug #44423 Default backup driver crashes server if error encountered.
Submitted: 23 Apr 2009 9:57
Modified: 29 Jun 2009 14:39
Reporter: Rafal Somla
Status: Closed
Category: MySQL Server: Backup
Severity: S3 (Non-critical)
Version: 6.0 source
OS: Any
Assigned to: Rafal Somla
CPU Architecture: Any

[23 Apr 2009 9:57] Rafal Somla
Description:
A BACKUP statement that uses the default backup driver can crash the server if the driver reports an error from its get_data() method. In that case the driver is shut down, and when the table locking thread tries to close the locked tables, an assertion is violated:

#6  0x401db590 in __assert_fail () from /lib/libc.so.6
#7  0x08354d70 in close_thread_table (thd=0x99445a8, table_ptr=0x99445f4) at sql_base.cc:1506
#8  0x083553b4 in close_open_tables (thd=0x99445a8) at sql_base.cc:1228
#9  0x08355791 in close_thread_tables (thd=0x99445a8, is_back_off=false) at sql_base.cc:1476
#10 0x08908cb2 in backup_thread_for_locking (arg=0x9939a90) at be_thread.cc:211
#11 0x40036f3b in start_thread () from /lib/libpthread.so.0

The assertion is:

(gdb) f 7
#7  0x08354d70 in close_thread_table (thd=0x99445a8, table_ptr=0x99445f4) at sql_base.cc:1506
1506      DBUG_ASSERT(!table->file || table->file->inited == handler::NONE);

(gdb) p table->file->inited
$1 = handler::RND
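
To make the failed check concrete, here is a minimal sketch of the invariant behind that assertion. ToyHandler and safe_to_close() are hypothetical stand-ins invented for illustration, not the server's handler class; only the NONE and RND states and the asserted condition are taken from the gdb output above.

```cpp
#include <cassert>

// Hypothetical stand-in for the server's handler class.
// Only the table-scan state is modelled here.
struct ToyHandler {
  enum State { NONE, RND };
  State inited = NONE;

  void rnd_init() { inited = RND; }   // a table scan has been started
  void rnd_end()  { inited = NONE; }  // the table scan has been finished
};

// Same condition as the DBUG_ASSERT at sql_base.cc:1506:
//   !table->file || table->file->inited == handler::NONE
bool safe_to_close(const ToyHandler *file) {
  return !file || file->inited == ToyHandler::NONE;
}
```

In this model, a handler on which rnd_init() was called but rnd_end() was not makes safe_to_close() return false, which corresponds to the state (handler::RND) seen in the crash.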

The locking thread is killed by the destructor of the default backup driver, which is called here:

#0  0x4003a8f0 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x0885431a in safe_cond_wait (cond=0x9939b80, mp=0x9939b20, file=0x8a7c80d "be_thread.cc", line=356) at thr_mutex.c:424
#2  0x08907e0d in Locking_thread_st::wait_until_locking_thread_dies (this=0x9939a90) at be_thread.cc:356
#3  0x089064b0 in ~Backup (this=0x9939a20) at be_default.cc:151
#4  0x08906b32 in default_backup::Backup::free (this=0x9939a20) at be_default.h:79
#5  0x0890256a in ~Backup_pump (this=0x993bce8) at data_backup.cc:1161
#6  0x08904609 in backup::Scheduler::Pump::~Pump () at data_backup.cc:301
#7  0x08902953 in backup::Scheduler::remove_pump (this=0x4aaf9de0, p=@0x4aaf9d90) at data_backup.cc:1040
#8  0x0890318c in backup::Scheduler::step (this=0x4aaf9de0) at data_backup.cc:926
#9  0x08903ecb in backup::write_table_data (thd=0x96d7890, info=@0x9932e60, s=@0x97e3238) at data_backup.cc:702
#10 0x088fa069 in Backup_restore_ctx::do_backup (this=0x4aaf9f18) at kernel.cc:1199
#11 0x088fc934 in execute_backup_command (thd=0x96d7890, lex=0x96d87d4, backupdir=0x4aafa7b4, overwrite=false, skip_gap_event=false) at kernel.cc:208
#12 0x0830358e in mysql_execute_command (thd=0x96d7890) at sql_parse.cc:2465
#13 0x0830b4d7 in mysql_parse (thd=0x96d7890, inBuf=0x97bf630 "BACKUP DATABASE db1 TO 'db1.bkp'", length=32, found_semicolon=0x4aafae90)
    at sql_parse.cc:5909
#14 0x0830bf0b in dispatch_command (command=COM_QUERY, thd=0x96d7890, packet=0x97dc231 "BACKUP DATABASE db1 TO 'db1.bkp'", packet_length=32)
    at sql_parse.cc:1049

How to repeat:
Compile server with the following modification (in debug mode):

=== modified file 'sql/backup/be_default.cc'
--- sql/backup/be_default.cc    2009-03-25 13:21:35 +0000
+++ sql/backup/be_default.cc    2009-04-21 05:46:33 +0000
@@ -410,6 +410,7 @@ result_t Backup::get_data(Buffer &buf)
     while (last_read_res == HA_ERR_RECORD_DELETED)
       last_read_res= hdl->rnd_next(cur_table->record[0]);
     DBUG_EXECUTE_IF("SLEEP_DRIVER", sleep(4););
+    DBUG_EXECUTE_IF("default_rnd_next_error", last_read_res=-1;);
     /*
       If we are end of file, stop the read process and signal the
       backup algorithm that we're done. Turn get_next_table mode on.

Then run the following test case against the modified server:

CREATE DATABASE db1;
CREATE TABLE db1.t1(a int) ENGINE=Archive;
INSERT INTO db1.t1 VALUES (1);
SET SESSION debug='+d,default_rnd_next_error';
BACKUP DATABASE db1 TO 'db1.bkp';

The BACKUP statement should crash the server.
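
For readers unfamiliar with this kind of error injection, the sketch below shows the general idea in isolation: a code fragment that runs only when a named keyword has been enabled at run time. TOY_EXECUTE_IF, g_enabled_keywords and toy_read_row() are hypothetical names invented for this illustration. It is not the server's DBUG_EXECUTE_IF implementation, only a simplified model of how the injected line above behaves once the keyword is switched on.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical registry of enabled injection keywords
// (a stand-in for the session's debug settings).
static std::set<std::string> g_enabled_keywords;

// Run `code` only if `keyword` is currently enabled.
#define TOY_EXECUTE_IF(keyword, code) \
  do { if (g_enabled_keywords.count(keyword)) { code } } while (0)

// Simplified model of one row read: returns 0 on success,
// or -1 when the injected error is active.
int toy_read_row() {
  int last_read_res = 0;  // pretend the real read succeeded
  TOY_EXECUTE_IF("default_rnd_next_error", last_read_res = -1;);
  return last_read_res;
}
```

Inserting "default_rnd_next_error" into g_enabled_keywords plays the role of the SET SESSION debug statement in the test case: until then toy_read_row() returns 0, afterwards it returns -1.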
[29 Apr 2009 17:55] Rafal Somla
Here is a patch that seems to fix the problem. After applying it, the server does not crash; instead, the backup driver signals an error, which the BACKUP statement reports. The key point is that when an error is detected while reading table rows, the driver must end the table read (by calling end_tbl_read()) before returning the error.

=== modified file 'sql/backup/be_default.cc'
--- sql/backup/be_default.cc    2009-03-25 13:21:35 +0000
+++ sql/backup/be_default.cc    2009-04-28 05:08:25 +0000
@@ -429,7 +430,10 @@ result_t Backup::get_data(Buffer &buf)
         locking_thd->kill_locking_thread();
     }
     else if (last_read_res != 0)
+    {
+      end_tbl_read();
       DBUG_RETURN(ERROR);
+    }
     else
     {
       /*
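
As a side note, the same "end the scan on every exit path" rule can be expressed with a scope guard. The sketch below is only an illustration of that general pattern under simplifying assumptions: ToyHandler, ScanGuard and read_all_rows() are hypothetical names, and the whole scan happens inside one function. The real driver keeps its scan open across several get_data() calls, so this is not a drop-in alternative to the patch above.

```cpp
#include <cassert>

// Hypothetical stand-in for a storage engine handler.
struct ToyHandler {
  enum State { NONE, RND };
  State inited = NONE;
  int rows_left = 0;       // rows still available to the scan
  bool fail_next = false;  // when true, rnd_next() reports an error

  void rnd_init() { inited = RND; }
  void rnd_end()  { inited = NONE; }

  // Returns 0 for a row, 1 at end of data, -1 on error.
  int rnd_next() {
    if (fail_next) return -1;
    if (rows_left == 0) return 1;
    --rows_left;
    return 0;
  }
};

// Starts a scan on construction and always ends it on destruction,
// so early returns cannot leave the handler in the RND state.
class ScanGuard {
 public:
  explicit ScanGuard(ToyHandler &h) : h_(h) { h_.rnd_init(); }
  ~ScanGuard() { h_.rnd_end(); }
  ScanGuard(const ScanGuard &) = delete;
  ScanGuard &operator=(const ScanGuard &) = delete;

 private:
  ToyHandler &h_;
};

enum ReadResult { READ_OK, READ_ERROR };

ReadResult read_all_rows(ToyHandler &h) {
  ScanGuard guard(h);
  for (;;) {
    int res = h.rnd_next();
    if (res == 1) return READ_OK;     // end of data
    if (res != 0) return READ_ERROR;  // error: guard still ends the scan
  }
}
```

Whether read_all_rows() returns READ_OK or READ_ERROR, the handler is back in the NONE state afterwards, which is the condition the assertion in close_thread_table() checks.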
[8 May 2009 8:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/73641

2808 Rafal Somla	2009-05-06
      Bug #44423  - Default backup driver crashes server if error encountered.
      
      The problem happened if the rnd_next() method called by the default backup
      driver signalled an error. In that case the driver did not call ha_rnd_end(),
      which should match the ha_rnd_init() called at the beginning of a table scan.
      
      Note: To see the crash, comment out the call to end_tbl_read(), compile 
      and run backup_default_debug test.
     @ mysql-test/suite/backup/t/backup_default_debug.test
        Add test scenario for this bug.
     @ sql/backup/be_default.cc
        - Add a call to end_tbl_read() in case rnd_next() signals an error.
        This calls ha_rnd_end() to match the initial ha_rnd_init().
        
        - Add error injection point to simulate error from the rnd_next() call.
[8 May 2009 15:34] Ingo Strüwing
Approved, provided you follow my request sent by email.
[21 May 2009 17:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/74727

2705 Rafal Somla	2009-05-18
      Bug #44423  - Default backup driver crashes server if error encountered.
      
      The problem happened if the rnd_next() method called by the default backup
      driver signalled an error. In that case the driver did not call ha_rnd_end(),
      which should match the ha_rnd_init() called at the beginning of a table scan.
      
      Note: To see the crash, comment out the call to end_tbl_read(), compile 
      and run backup_default_debug test.
     @ mysql-test/suite/backup/t/backup_default_debug.test
        Add test scenario for this bug.
     @ sql/backup/be_default.cc
        - Add a call to end_tbl_read() in case rnd_next() signals an error.
        This calls ha_rnd_end() to match the initial ha_rnd_init().
        
        - Add error injection point to simulate error from the rnd_next() call.
[25 May 2009 5:38] Rafal Somla
Pushed to mysql-6.0-backup tree.
revid:rafal.somla@sun.com-20090518154118-4amu0b2va1nsxbwc
[3 Jun 2009 7:09] Jørgen Løland
Merged to azalea June 2
[29 Jun 2009 14:39] Paul DuBois
No changelog entry needed. Not in any released version.