Bug #34858 Server crashes by backup tests
Submitted: 26 Feb 2008 20:10 Modified: 18 Jul 2008 16:46
Reporter: Alexander Nozdrin Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Backup Severity:S1 (Critical)
Version:6.0-runtime OS:Any
Assigned to: Chuck Bell CPU Architecture:Any

[26 Feb 2008 20:10] Alexander Nozdrin
Description:
After merge the main 6.0 into 6.0-runtime the server started to crash
by backup tests.

Not sure about all, but at least some tests crashes the server
by the following assert in handler.c:

THD *handler::ha_thd(void) const
{
  DBUG_ASSERT(!table || !table->in_use || table->in_use == current_thd);
  return (table && table->in_use) ? table->in_use : current_thd;
}

I'm going to disable tests until the server will be fixed.

How to repeat:
I've slightly modified the function:

------------------------------------------------------
THD *handler::ha_thd(void) const
{
  THD *thd= current_thd;
  DBUG_ASSERT(!table || !table->in_use || table->in_use == thd);
  return (table && table->in_use) ? table->in_use : thd;
}
------------------------------------------------------

$ mtr --gdb backup_errors

gdb log:
------------------------------------------------------
(gdb) c
Continuing.
[New Thread 0x408c3950 (LWP 13447)]
[Thread 0x408c3950 (LWP 13447) exited]
[New Thread 0x408c3950 (LWP 13448)]
[Thread 0x408c3950 (LWP 13448) exited]
[New Thread 0x408c3950 (LWP 13449)]
mysqld: handler.cc:1965: THD* handler::ha_thd() const: Assertion `!table || !table->in_use || table->in_use == thd' failed.

Program received signal SIGABRT, Aborted.
0x00002ae22b0dfb45 in raise () from /lib64/libc.so.6

(gdb) bt    
#0  0x00002ae22b0dfb45 in raise () from /lib64/libc.so.6
#1  0x00002ae22b0e10e0 in abort () from /lib64/libc.so.6
#2  0x00002ae22b0d907f in __assert_fail () from /lib64/libc.so.6
#3  0x0000000000854476 in handler::ha_thd (this=0x1691708) at handler.cc:1965
#4  0x000000000085ce87 in handler::mark_trx_read_write (this=0x1691708)
    at handler.cc:2968
#5  0x0000000000854644 in handler::ha_write_row (this=0x1691708, 
    buf=0x1691b08 "Ç") at handler.cc:5359
#6  0x0000000000d3d3e6 in report_ob_progress (backup_id=0, 
    object=0xee6c6d "backup kernel", start=0, stop=0, size=0, progress=0, 
    error_num=0, notes=0x167a208 "error") at backup_progress.cc:905
#7  0x0000000000d3e706 in report_ob_state (backup_id=0, state=BUP_ERRORS)
    at backup_progress.cc:796
#8  0x0000000000d30d68 in execute_backup_command (thd=0x161b5c8, lex=0x161d038)
    at kernel.cc:241
#9  0x000000000072beb7 in mysql_execute_command (thd=0x161b5c8)
    at sql_parse.cc:2124
#10 0x0000000000733e55 in mysql_parse (thd=0x161b5c8, 
    inBuf=0x16848e8 "RESTORE FROM 'test.bak'", length=23, 
    found_semicolon=0x40881f10) at sql_parse.cc:5609
#11 0x0000000000734a06 in dispatch_command (command=COM_QUERY, thd=0x161b5c8, 
    packet=0x1674e29 "RESTORE FROM 'test.bak'", packet_length=23)
    at sql_parse.cc:1030
#12 0x0000000000735d8a in do_command (thd=0x161b5c8) at sql_parse.cc:722
#13 0x0000000000722e6a in handle_one_connection (arg=0x161b5c8)
    at sql_connect.cc:1122
#14 0x00002ae229cbc020 in start_thread () from /lib64/libpthread.so.0
#15 0x00002ae22b173f8d in clone () from /lib64/libc.so.6
#16 0x0000000000000000 in ?? ()

(gdb) f 3
#3  0x0000000000854476 in handler::ha_thd (this=0x1691708) at handler.cc:1965
1965      DBUG_ASSERT(!table || !table->in_use || table->in_use == thd);

(gdb) p table
$1 = (st_table *) 0x168fa48

(gdb) p table->in_use
$2 = (THD *) 0x1692f48

(gdb) p thd
$3 = (THD *) 0x161b5c8

(gdb) p table->in_use
$4 = (THD *) 0x1692f48

------------------------------------------------------
[12 Mar 2008 17:55] Chuck Bell
Problems identified. Patch forthcoming...
[12 Mar 2008 18:41] Chuck Bell
Patch ready for review.

http://lists.mysql.com/commits/43873
[14 Mar 2008 18:10] Chuck Bell
Original patch rejected. Back to the drawing board. :) More details to follow.
[14 Mar 2008 19:53] Chuck Bell
Problem
-------
The backup progress tables can no longer operate as they do now using the locking thread to open the tables and updating data in another. Changes to the handler code prohibit such operations.

It was decided to change the implementation of the backup progress tables to use the logging functionality similar to how the query log does now.

Additionally, work is underway to enhance the backup progress tables functionality (WL#4303).

Proposed Solution
-----------------
Change the implementation of the backup progress tables to use the logging mechanism but do not change any of it's functionality. 

This includes leaving the names, data types, values, etc. the same for both the online_backup and online_backup_progress tables as well as the basic interface as used by the backup kernel code intact (see backup_progress.h/.cc). Thus, there will be no changes to the kernel code.

Since the implementation will be changed to use the logging mechanism, it is best (for the reasons above) to hard code the settings to write to tables for now. If it is decided to allow customization of the backup progress logs, that can be done in WL#4303. 

The intention is to change the implementation by removing the use of the be_thread mechanism and use the existing logging functions instead.
[17 Mar 2008 19:33] Chuck Bell
There is one small change needed in the interface between the kernel and the report_ob_* methods: I have to add the THD object.
[17 Mar 2008 21:42] Chuck Bell
Ok, here's another unavoidable feature change...there are no auto_increment fields permitted. I will have to invent a new mechanism. More details to come.
[17 Mar 2008 21:56] Chuck Bell
Solution found. Leave tables in MyISAM for now. Made notes in code.
[18 Mar 2008 18:13] Chuck Bell
A patch is ready for review. See:

http://lists.mysql.com/commits/44196

This patch simply changes the implementation of the backup progress tables to backup logs using the logging functionality (same as general log) with some exceptions (see below).

Note that there are some known limitations for this code.

* The logs are written to tables only. 
* The tables are unchanged from previous code/design.
* The backup logs are not affected by --log, --log-option.
* There is a FLUSH BACKUP LOGS, but not a PURGE BACKUP LOGS.

The new logs behave the same was as the progress tables did except that now that they are logs, you cannot delete from them.
[18 Mar 2008 18:13] Chuck Bell
A patch is ready for review. See:

http://lists.mysql.com/commits/44196

This patch simply changes the implementation of the backup progress tables to backup logs using the logging functionality (same as general log) with some exceptions (see below).

Note that there are some known limitations for this code.

* The logs are written to tables only. 
* The tables are unchanged from previous code/design.
* The backup logs are not affected by --log, --log-option.
* There is a FLUSH BACKUP LOGS, but not a PURGE BACKUP LOGS.

The new logs behave the same was as the progress tables did except that now that they are logs, you cannot delete from them.
[19 Mar 2008 17:03] Chuck Bell
The patch below represents a proposed solution to fix the backup progress table problems. This solution uses the performance table open/close (the same as what is used in logs) to open and close the backup progress tables.

http://lists.mysql.com/commits/44247

There is just one small limitation after this patch: users cannot delete from the progress tables. We will have to add a PURGE command for this (which is already requested under BUG#33364).
[19 Mar 2008 17:05] Chuck Bell
If the above solution is not acceptable, I have prepared a separate patch that simply removes the backup progress reporting.

http://lists.mysql.com/commits/44232
[20 Mar 2008 10:37] Rafal Somla
Good to push.
[20 Mar 2008 14:55] Chuck Bell
Patch queued. Minor adjustments from review only.

http://lists.mysql.com/commits/44290
[15 Apr 2008 17:58] Bugs System
Pushed into 6.0.5-alpha
[18 Jul 2008 16:46] Paul DuBois
Noted in 6.0.5 changelog.

Logging to the progress tables used by BACKUP DATABASE and RESTORE
caused a server crash.