Bug #25523 Corruption of MyISAM tables
Submitted: 10 Jan 2007 14:45 Modified: 29 Jan 2007 12:58
Reporter: Herald van der Breggen Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S1 (Critical)
Version:5.0.26 OS:Linux (Linux CentOS 4.x)
Assigned to: CPU Architecture:Any
Tags: bfsm_2007_01_18, corruption, myisam

[10 Jan 2007 14:45] Herald van der Breggen
Description:
We have a master database with tens of slaves. Since the upgrade from 4.0 to 5.0.26 we see at random on some slave, some table from time tot time corruption. It seems to come out of the blue. There is no message in the mysql log file, except that a table is corrupt. It happens on 64 bit machines as well as on 32 bit machines.

Looking more in detail, we can see some characteristics.

So far we have seen corruption *only* on slaves.

Another strange thing that might be related is that a table on some slaves is loosing records which we have created with a nightly bulk "replace into". It is a table with language_code as part of the primary key. Only te records that are created with the bulk replace into are suffering from this (though not on al slaves), the other records (recods that are created with a normal insert or updated with a normal update) have no problem. This table is one of the tables we have seen corruption on.

The other tables the suffer from frequent corruption have also records that are created with "replace into", though these are not bulk queries but one query per  row.

We have not seen corruption on slaves that have no read queries (replication-only slaves).

It looks like the chance for corruption is is increasing when there are long running select queries.

How to repeat:
We did not succeed in creating a recipe for creating a corrupted table, though it happens on a daily basis (yesterday thee times and today two times sofar).

Suggested fix:
I wish I knew...
[10 Jan 2007 14:53] MySQL Verification Team
Herald, please post SHOW CREATE TABLE <tablename> and SHOW TABLE STATUS LIKE 'tablename' from each table that gets corrupted.
Also, post a sample REPLACE INTO query,so we can try make a testcase.
Thanks,
[10 Jan 2007 17:06] Herald van der Breggen
For what it's worth, B_Hotel just got corrupted again. The output of a repair says:

mysql> repair table B_Hotel;
+------------+--------+------------+-------------------------------------------------------+
| Table      | Op     | Msg_type   | Msg_text                                              |
+------------+--------+------------+-------------------------------------------------------+
| bp.B_Hotel | repair | infoirotel | Key 17 - Found wrong stored record at 23230548        |
| bp.B_Hotel | repair | infoirotel | Key 17 - Found wrong stored record at 23230548        | 
| bp.B_Hotel | repair | info       | Key 20 - Found wrong stored record at 23230548        | 
| bp.B_Hotel | repair | info       | Key 21 - Found wrong stored record at 23230548        | 
| bp.B_Hotel | repair | info       | Key 1 - Found wrong stored record at 23230548         | 
| bp.B_Hotel | repair | info       | Found block that points outside data file at 23230752 | 
| bp.B_Hotel | repair | warning    | Number of rows changed from 39662 to 39812            | 
| bp.B_Hotel | repair | status     | OK                                                    | 
+------------+--------+------------+-------------------------------------------------------+
8 rows in set (28.65 sec)

I have seen also cases that the number of rows were reduced and the table got corrupted very soon again. The normal procedure we have followed since then, is copy the the myisam files from another slave (after stopping replication and after flushing the tables of course). That happened to work much better than the repair (and much faster for large tables)

Yesterday evening I reduced myisam_repair_threads from 2 to 1. Maybe that improved the quality of the repair table.
[10 Jan 2007 18:22] Herald van der Breggen
The same table (B_Hotel) crashed again within two hours after the repair. This is consistent with earlier repair attemps. Getting fresh MyISAM files from another slave usually helps for a longer  period.
[10 Jan 2007 21:17] Herald van der Breggen
After a second repair the table got corrupted again. I did two check table statements one after another with small interval (one or two seconds):

mysql> check table B_Hotel;
+------------+-------+----------+---------------------------------------------------------+
| Table      | Op    | Msg_type | Msg_text                                                |
+------------+-------+----------+---------------------------------------------------------+
| bp.B_Hotel | check | warning  | Size of datafile is: 23191980       Should be: 23191852 | 
| bp.B_Hotel | check | status   | OK                                                      | 
+------------+-------+----------+---------------------------------------------------------+
2 rows in set (1.42 sec)

mysql> check table B_Hotel;
+------------+-------+----------+---------------------------------------------------------+
| Table      | Op    | Msg_type | Msg_text                                                |
+------------+-------+----------+---------------------------------------------------------+
| bp.B_Hotel | check | warning  | Size of datafile is: 23191980       Should be: 23191860 | 
| bp.B_Hotel | check | error    | Wrong bytesec: 65-0-0 at linkstart: 23191852            | 
| bp.B_Hotel | check | error    | Corrupt                                                 | 
+------------+-------+----------+---------------------------------------------------------+
3 rows in set (1.19 sec)

Now, i'll really replace the myisam files ;-)
[12 Jan 2007 13:44] MySQL Verification Team
i repeated a corruption:

mysql> check table b_hotel extended;
+--------------+-------+----------+----------------------------+
| Table        | Op    | Msg_type | Msg_text                   |
+--------------+-------+----------+----------------------------+
| test.b_hotel | check | warning  | Table is marked as crashed |
| test.b_hotel | check | error    | Found 30525 keys of 30527  |
| test.b_hotel | check | error    | Corrupt                    |
+--------------+-------+----------+----------------------------+
3 rows in set (1.98 sec)

Will create a testcase and upload it later.
[18 Jan 2007 7:03] MySQL Verification Team
haven't been able to reproduce on 5.0.34/5.0.36 yet.  am waiting to hear from customer if they received corruptions on 5.0.33.  I had gotten 3 corruptions on 5.0.26 after many hours of tests.
[29 Jan 2007 12:58] MySQL Verification Team
customer cannot repeat these corruptions on 5.0.33