Bug #40944 Backup: crash after myisampack
Submitted: 21 Nov 2008 22:31 Modified: 17 Nov 17:26
Reporter: Peter Gulutzan
Status: Patch queued
Category:Server: Backup Severity:S3 (Non-critical)
Version:6.0.9-alpha-debug OS:Linux (SUSE 10.0 / 32-bit)
Assigned to: Ingo Strüwing Target Version:
Tags: mdl
Triage: Triaged: D1 (Critical)

[21 Nov 2008 22:31] Peter Gulutzan
Description:
I create a MyISAM table with a few rows.
I run 'myisampack' and 'myisamchk' utilities.
I backup.
I restore.
I select from the table.
Crash.

How to repeat:
/* On the client, in order to get a table
   that myisampack will consider is worth
   compressing, create and insert 128 rows. */
use test
create table t1 (s1 varchar(5), s2 int);
create index i on t1 (s1,s2);
insert into t1 values ('a',1);
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
quit

/* Compress the table with conventional instructions.
   Your path to the utility programs may vary. */

/usr/local/mysql/bin/myisampack /usr/local/mysql/var/test/t1.MYI 
/usr/local/mysql/bin/myisamchk -rq /usr/local/mysql/var/test/t1.MYI

/* Restart the client, backup, restore,
   and then do a SELECT with a WHERE. */

use test
backup database test to '2';
restore from '2';
select count(*) from t1 where s1 < 5;
[22 Nov 2008 0:08] Miguel Solorzano
Thank you for the bug report.

6.0/libexec/mysqld(my_print_stacktrace+0x32)[0xdc23ae]
6.0/libexec/mysqld(handle_segfault+0x2a6)[0x73b715]
/lib/libpthread.so.0[0x7fbebcef00f0]
/lib/libc.so.6(gsignal+0x35)[0x7fbebbb8efd5]
/lib/libc.so.6(abort+0x183)[0x7fbebbb90b43]
/lib/libc.so.6(__assert_fail+0xe9)[0x7fbebbb87d49]
6.0/libexec/mysqld[0xd979b1]
6.0/libexec/mysqld[0xd98104]
6.0/libexec/mysqld(key_cache_read+0x450)[0xd9a556]
6.0/libexec/mysqld(_mi_fetch_keypage+0xb5)[0x9d09c9]
6.0/libexec/mysqld(_mi_search_first+0x92)[0x9cf080]
6.0/libexec/mysqld(mi_rnext+0x1d4)[0x9ca280]
6.0/libexec/mysqld(mi_rfirst+0x71)[0x9ea389]
6.0/libexec/mysqld(_ZN9ha_myisam11index_firstEPh+0x66)[0x9b5ba4]
6.0/libexec/mysqld[0x7bddf1]
6.0/libexec/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x158)[0x7c1498]
6.0/libexec/mysqld[0x7d0506]
6.0/libexec/mysqld(_ZN4JOIN4execEv+0x23a4)[0x7e55d6]
6.0/libexec/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x32f)[0x7e0178]
6.0/libexec/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x1de)[0x7e58f4]
6.0/libexec/mysqld[0x74cc3c]
6.0/libexec/mysqld(_Z21mysql_execute_commandP3THD+0x816)[0x74e257]
6.0/libexec/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0x273)[0x7562b8]
6.0/libexec/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xa69)[0x756eb5]
6.0/libexec/mysqld(_Z10do_commandP3THD+0x224)[0x758427]
6.0/libexec/mysqld(handle_one_connection+0x11c)[0x745503]
/lib/libpthread.so.0[0x7fbebcee83ea]
/lib/libc.so.6(clone+0x6d)[0x7fbebbc42c6d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x29bdd80 = select count(*) from t1 where s1 < 5
thd->thread_id=2
thd->killed=NOT_KILLED
[26 Nov 2008 14:49] Jorgen Loland
I tried checksum:

<initial script, restart server>
mysql> backup database test to '1.bup';
+-----------+
| backup_id |
+-----------+
| 270       | 
+-----------+
1 row in set (0.13 sec)

mysql> checksum table t1;
+---------+------------+
| Table   | Checksum   |
+---------+------------+
| test.t1 | 3189812352 | 
+---------+------------+
1 row in set (0.00 sec)

mysql> restore from '1.bup' overwrite;
+-----------+
| backup_id |
+-----------+
| 271       | 
+-----------+
1 row in set (0.02 sec)

mysql> checksum table t1;
+---------+----------+
| Table   | Checksum |
+---------+----------+
| test.t1 |        0 | 
+---------+----------+
1 row in set (0.00 sec)
[27 Nov 2008 10:22] Jorgen Loland
Ingo suggested to try 'flush tables' after restore. That seems to do the trick:

mysql> restore from '2' overwrite;
mysql> flush tables;
mysql> select count(*) from t1;
+----------+
| count(*) |
+----------+
|      128 | 
+----------+
1 row in set (0.00 sec)
[3 Dec 2008 17:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/60508

2737 Ingo Struewing	2008-12-03
      Bug#40944 - Backup: crash after myisampack
      
      After restore of a compressed MyISAM table, the server crashed
      when the table was accessed.
      
      The problem was twofold.
      
      1. The crash happened in the MyISAM keycache. It did not handle
      an error in reading a block correctly. This is now fixed in
      key_cache_read(), key_cache_insert(), and key_cache_write().
      Their BLOCK_ERROR handling does now follow a common pattern.
      Erroneous blocks are no longer put to the LRU ring, but freed.
      
      2. After fixing the above, accesses to a restored compressed table
      failed with error messages about a corrupted table. Restore of a
      table works by dropping the old table if exists, creating a new
      one, and loading the data from the backup image. The MyISAM
      native driver writes directly to the table files, what was there
      at backup time. This works even for compressed tables, as long as
      the table is freshly opened after loading the data. But RESTORE
      needs to keep the table write locked until it is complete. This
      includes keeping the table open. When restore is done, it unlocks
      the tables and requests close. But our table cache does not
      perform a full close. It marks it unused and keeps it open in the
      cache. Later statements reuse the open table. But when the MyISAM
      table format changes, we must force a full close. Since this is
      difficult to detect, we do now always force close of restored
      MyISAM tables, if restored through the native driver.
[19 Dec 2008 11:18] Oystein Grovlen
Patch approved pending the fix of warnings in new tests.
[30 Dec 2008 13:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/62444

2745 Ingo Struewing	2008-12-30
      Bug#40944 - Backup: crash after myisampack
      
      After restore of a compressed MyISAM table, the server crashed
      when the table was accessed.
      
      The problem was twofold.
      
      1. The crash happened in the MyISAM keycache. It did not handle
      an error in reading a block correctly. This is now fixed in
      key_cache_read(), key_cache_insert(), and key_cache_write().
      Their BLOCK_ERROR handling does now follow a common pattern.
      Erroneous blocks are no longer put to the LRU ring, but freed.
      
      2. After fixing the above, accesses to a restored compressed table
      failed with error messages about a corrupted table. Restore of a
      table works by dropping the old table if exists, creating a new
      one, and loading the data from the backup image. The MyISAM
      native driver writes directly to the table files, what was there
      at backup time. This works even for compressed tables, as long as
      the table is freshly opened after loading the data. But RESTORE
      needs to keep the table write locked until it is complete. This
      includes keeping the table open. When restore is done, it unlocks
      the tables and requests close. But our table cache does not
      perform a full close. It marks it unused and keeps it open in the
      cache. Later statements reuse the open table. But when the MyISAM
      table format changes, we must force a full close. Since this is
      difficult to detect, we do now always force close of restored
      tables.
[18 Jan 11:58] Ingo Strüwing
Queued to 6.0-backup.
[2 Feb 17:07] Bugs System
Pushed into 6.0.10-alpha (revid:sergefp@mysql.com-20090202090240-dlkxhmc1asrar5rl)
(version source revid:sergefp@mysql.com-20090129100938-qvke7a9krg24l8pl) (merge vers:
6.0.10-alpha) (pib:6)
[6 Feb 10:18] Guilhem Bichot
Still exists. I have 6.0-backup at revision
jorgen.loland@sun.com-20090203074809-1l4sh01tklb3cxfm

I'm attaching a patch (which adds some DEBUG_SYNC and a repeatable way to get the
problem). Apply the patch, rebuild and run
./mtr backup_myisam
it fails with
mysqltest: At line 60: query 'reap' failed: 126: Incorrect key file for table
'./mysql_db1/t1.MYI'; try to repair it
[6 Feb 10:19] Guilhem Bichot
patch to repeat bug under deterministic concurrency

Attachment: bug40944.txt (text/plain), 1.69 KiB.

[6 Feb 10:24] Guilhem Bichot
here is an explanation of why the bug is still there:
http://lists2.mysql.com/commits/65442
[10 Feb 11:08] Ingo Strüwing
Waiting for the runtime team to provide an open_and_lock_tables(), which accepts name
locks (exclusive mdl locks) on the tables.
[23 Feb 13:12] Ingo Strüwing
Depends on Bug#42895 (Please implement a table open interface for name-locked tables)
[8 Jun 14:01] Ingo Strüwing
Thank you, Guilhem, for a repeatable test case.

RESTORE drops the tables to be restored, if they exist.

RESTORE creates new tables from the CREATE TABLE statement from
the backup image file.

RESTORE opens and locks the tables to be restored with TL_WRITE.

SELECT opens and tries to lock one of the tables being restored.
It blocks in thr_multi_lock().

RESTORE loads all the table data.

RESTORE tries to close all restored tables with
close_cached_tables(..., table_list, ...).
Here, the table share version for all tables in the list is reset.
The reason for this operation is that native restore drivers could
restore the tables in a different format, e.g. compressed.
Every thread that uses the table after RESTORE must use a freshly
opened table, so that the correct format is loaded in the memory
structures (here the MyISAM share).

RESTORE closes (and unlocks) all tables open by its own thread.

SELECT awakes from thr_multi_lock(). The table share version
is not checked again. It happily uses the still open TABLE with
the old MyISAM share information.

We need a lock, that prevents other statements to even open the
tables in restore. This is the intention for Bug#42895/WL#4844.
[17 Nov 17:26] Ingo Strüwing
The patch for Bug#42895/Wl#4844 (Implement a locking scheme for RESTORE) solves this bug
too. Queued to mysql-6.0-backup.