Bug #36664 Server crash in backup_no_be test on Solaris.
Submitted: 12 May 2008 9:15
Modified: 25 Nov 2008 13:22
Reporter: Rafal Somla
Status: Can't repeat
Category: MySQL Server: Backup
Severity: S3 (Non-critical)
Version: 6.0-backup
OS: Solaris
Assigned to: Øystein Grøvlen
CPU Architecture: Any

[12 May 2008 9:15] Rafal Somla
Description:
When running the backup_no_be test on the sol10-sparc-a machine, the server crashes with signal 11. This is what the server's error log shows (for some reason the log also contains debug trace output):

<...>
find_files: info: found: 4 files
my_free: my: ptr: 0xd81db8
free_root: enter: root: 0xcca088  flags: 0
my_free: my: ptr: 0xdc5d98
my_free: my: ptr: 0xcca068
ha_find_files: enter: db: '080512 11:35:46 - mysqld got signal 11 ;

Debugger shows the following call stack at crash time:

  [5] strlen(0x0, 0xfffffaf0, 0x0, 0xfeccda30, 0x0, 0x2b), at 0xfedaea90 
  [6] _ndoprnt(0x736da0, 0xfeccdb54, 0xfeccd371, 0x0, 0x0, 0x0), at 0xfee135e4 
  [7] vfprintf(0x9b9c58, 0x736d99, 0xfeccdb50, 0x7f0400, 0x7f0400, 0x7), at 0xfee15a3c 
  [8] _db_doprnt_(format = ???, ...) (optimized), at 0x647e04 (line ~1161) in "dbug.c"
=>[9] ha_find_files(thd = ???, db = ???, path = ???, wild = ???, dir = ???, files = ???) (optimized), at 0x3b0be0 (line ~3760) in "handler.cc"
  [10] find_files(thd = ???, files = ???, db = ???, path = ???, wild = ???, dir = ???) (optimized), at 0x3d9190 (line ~569) in "sql_show.cc"
  [11] make_db_list(thd = ???, files = ???, lookup_field_vals = ???, with_i_schema = ???) (optimized), at 0x3ddbc4 (line ~2723) in "sql_show.cc"
  [12] get_all_tables(thd = ???, tables = ???, cond = ???) (optimized), at 0x3de92c (line ~3302) in "sql_show.cc"
  [13] __unnamed_kEAZKItcJIkVM::open_schema_table(thd = ???, st = ???, db_list = ???) (optimized), at 0x445c58 (line ~321) in "si_objects.cc"
  [14] obs::InformationSchemaIterator::prepare_is_table(thd = ???, is_table = ???, ha = ???, orig_columns = ???, is_table_idx = ???, db_list = CLASS) (optimized), at 0x445ffc (line ~1045) in "si_objects.cc"
  [15] obs::create_is_iterator<obs::DbTablesIterator>(thd = ???, is_table_idx = ???, db_name = ???) (optimized), at 0x449e8c (line ~1452) in "si_objects.cc"
  [16] obs::get_db_tables(thd = ???, db_name = ???) (optimized), at 0x446cb0 (line ~1486) in "si_objects.cc"
  [17] Backup_info::add_db_items(this = ???, db = CLASS) (optimized), at 0x6d7d34 (line ~584) in "backup_info.cc"
  [18] Backup_info::add_dbs(this = ???, dbs = CLASS) (optimized), at 0x6d796c (line ~455) in "backup_info.cc"
  [19] execute_backup_command(thd = ???, lex = ???) (optimized), at 0x6c6c1c (line ~158) in "kernel.cc"
  [20] mysql_execute_command(thd = ???) (optimized), at 0x2cf000 (line ~2153) in "sql_parse.cc"

Thus the crash happens because strlen() is called on a NULL string.

Further investigation of the call stack shows that strlen() is called from DBUG_PRINT() inside ha_find_files(), which is called with a NULL database argument. The NULL database name is passed to find_files() inside make_db_list() in this line:

 2723     return (find_files(thd, files, NullS,
 2724                        mysql_data_home, NullS, 1) != FIND_FILES_OK);

It looks like find_files() is supposed to accept NULL as the database name.

How to repeat:
Run the backup_no_be test on sol10-sparc-a.

Suggested fix:
Either don't call ha_find_files() from find_files() when db is NULL, or modify the DBUG_PRINT() call in ha_find_files() to handle a NULL db parameter (db ? db : "NULL").
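
A minimal sketch of the second option, outside the server code. Passing a NULL pointer for a "%s" conversion is undefined behavior in C; some C libraries print "(null)", while others dereference the pointer, which matches the strlen(0x0) frame in the stack above. The helper names safe_str() and format_db_trace() are hypothetical and used only for illustration; they are not part of handler.cc or dbug.c.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: return a printable placeholder when s is NULL,
   mirroring the (db ? db : "NULL") expression from the suggested fix. */
static const char *safe_str(const char *s)
{
  return s ? s : "NULL";
}

/* Hypothetical stand-in for the debug print of the db argument.
   The string passed for "%s" is never NULL, so the C library never
   calls strlen() on a NULL pointer. */
static int format_db_trace(char *buf, size_t size, const char *db)
{
  return snprintf(buf, size, "db: '%s'", safe_str(db));
}
```

Calling format_db_trace(buf, sizeof buf, NULL) fills buf with db: 'NULL', and a real name such as "test" is printed unchanged. The first option (skipping the ha_find_files() call when db is NULL) would avoid the print altogether, but whether that is correct depends on what ha_find_files() is expected to do for the top-level data directory.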
[29 Oct 2008 16:02] Øystein Grøvlen
I have not been able to reproduce this on Solaris.  The backup_no_be test is currently disabled due to valgrind issues (BUG#38023).  However, I cannot find any record of this test failing in PushBuild before it was disabled.  Hence, I plan to enable it again to check whether this bug can be reproduced in PushBuild.
[30 Oct 2008 10:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/57433

2722 oystein.grovlen@sun.com	2008-10-30
      BUG#36664 Server crash in backup_no_be test on Solaris.
      
      Enabling this test to see if this is still a problem.  
      (Not able to reproduce it myself).
      
      Also, cleaning up some debug settings in test to avoid generating large volumes of load.
      
      Note that this test was disabled due to valgrind issues, but I could not
      find that these issues had ever occurred in Pushbuild.
[3 Nov 2008 15:46] Rafal Somla
I think it is OK to push this patch to see if the test still fails.
Note that the issue is not solved yet; we need to collect more data.
[7 Nov 2008 13:57] Øystein Grøvlen
The test has been enabled.  Awaiting PushBuild results to see whether the crash can be reproduced.
[14 Nov 2008 14:49] Bugs System
Pushed into 6.0.9-alpha  (revid:oystein.grovlen@sun.com-20081030104148-2ibzbkqwu8um2dtz) (version source revid:jorgen.loland@sun.com-20081114134411-xypyf8wyjc2nm3ly) (pib:5)
[25 Nov 2008 13:22] Øystein Grøvlen
The test has never failed in PB2 on Solaris since it was re-enabled.