Bug #8306 TRUNCATE leads to index corruption
Submitted: 4 Feb 2005 0:57 Modified: 7 Mar 2005 0:54
Reporter: Matthew Lord Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:4.1.9 OS:Solaris (solaris 9 x86)
Assigned to: Ingo Strüwing CPU Architecture:Any

[4 Feb 2005 0:57] Matthew Lord
Description:

mysql> load data local infile "/export/home/mysqldev/mlord/mysql-standard-4.1.9-pc-solaris2.9-i386/data.txt.bak" ignore into table utts_20050131_M03 (uuid, log_event_id);
ERROR 1034 (HY000): Incorrect key file for table 'utts_20050131_M03'; try to repair it

How to repeat:
Test environmet is all set up on sol9x86.

cd /export/home/mysqldev/mlord/mysql-standard-4.1.9-pc-solaris2.9-i386
./bin/mysqld_safe &
./bin/mysql -u root
truncate table utts_20050131_M03;
load data local infile "/export/home/mysqldev/mlord/mysql-standard-4.1.9-pc-solaris2.9-i386/data.txt.bak" ignore into table utts_20050131_M03 (uuid, log_event_id);

Suggested fix:
Find out where corruption is coming from.
[5 Feb 2005 13:55] Aleksey Kishkin
mysql> truncate table utts_20050131_M03;
Query OK, 0 rows affected (0.00 sec)

mysql> load data local infile "/export/home/mysqldev/mlord/mysql-standard-4.1.9-pc-solaris2.9-i386/data.txt.bak" ignore into table utts_20050131_M03 (uuid, log_event_id);
Query OK, 0 rows affected, 1 warning (3.20 sec)
Records: 1000000  Deleted: 0  Skipped: 1000000  Warnings: 1

mysql> show warnings;
+---------+------+---------------------------------------------------+
| Level   | Code | Message                                           |
+---------+------+---------------------------------------------------+
| Warning | 1265 | Data truncated for column 'log_event_id' at row 1 |
+---------+------+---------------------------------------------------+
1 row in set (0.00 sec)

mysql> select * from utts_20050131_M03;
ERROR 1030 (HY000): Got error 127 from storage engine

It seems like another message, not original from bug description;

But when I duplicate of that  table (just feed to mysql output of 'show create table utts_20050131_M03' with anouther table name), 
and truncate/load data infile/select * from duplicatetable -- it works without problem. I got 1 record.
[7 Feb 2005 12:05] Ingo Strüwing
Matt, can you please verify that it is the TRUNCATE which disturbes here?

When I REPAIR TABLE utts_20050131_M03; after a TRUNCATE, I get:
+------------------------+--------+----------+------------------------------------+
| Table                  | Op     | Msg_type | Msg_text                           |
+------------------------+--------+----------+------------------------------------+
| utts.utts_20050131_M03 | repair | warning  | Number of rows changed from 1 to 0 |
| utts.utts_20050131_M03 | repair | status   | OK                                 |
+------------------------+--------+----------+------------------------------------+
2 rows in set (0.01 sec)

After this, LOAD TABLE succeeds (more or less. Since you don't load the primary key,
only one row is accepted). After the LOAD, the table seems to be fully usable. A REPAIR
just says OK.

I played with the commands and got the impression that the table gets corrupted only
after a TRUNCATE after a LOAD or after a TRUNCATE after an INSERT. That is, a
TRUNCATE of an non-empty table. This does not happen on Linux/x86. The mysqld
in your environment is not a debug version, so I cannot analyze it further, currently.

Please verify my observations. If you agree, please close this bug and report a new one.
Please supply a debug server on sol9x86 then and suggest me for resolving the new bug.
[7 Feb 2005 21:14] Matthew Lord
Ingo,

No the truncation is not the problem because the customer is not truncating the table.  You can
look at the support issue for more information on their setup.

I created a new test file that shows the problem every time.  It does seem to only happen with
regularity on the newly created table (after doing a truncate).

This is the new SQL command to use:
load data local infile "/export/home/mysqldev/mlord/mysql-standard-4.1.9-pc-solaris2.9-i386/data.txt" ignore into table utts_20050131_M03 (uuid, log_event_id, utt_call_id);

Best Regards,

-matt-
[8 Feb 2005 8:32] Ingo Strüwing
This morning, the database was still running. So I just started "mysql -u root" and used the commands from the history. Everything worked flawlessly. Even _with_ truncate.
I stopped and started the database again, just to be sure that I used the correct binary. And indeed, this made the problems to recur. But, whatever I tried, the problem exists only after a truncate. So I avoided the truncate and used "delete from utts_20050131_M03;" instead. I tried several combinations of shutdown, restart, delete and load. Everything worked very well.
So I set the bug to "can't repeat". If you find a way to reproduce it without using "truncate", you may reopen it. Unfortunately, I have problems to access Eventum, currently. So I cannot study the original issue. I hope I will find a way around this though.
[9 Feb 2005 17:56] Ingo Strüwing
My findings for today are the following. I could reproduce the table corruption with the following small test:

1.) Preparation
        ./bin/mysqld_safe &
        ./bin/mysql -u root
                create database mysql_test1;
                use mysql_test1
                create table t1 (c1 int, index(c1));
                create table t2 (c1 int, index(c1)) engine=merge union=(t1);
                insert into t1 values (1);
                exit

2.) Loop forever
        ./bin/mysqladmin -u root shutdown
        ./bin/mysqld_safe &
        ./bin/mysql -u root mysql_test1
                truncate table t1;
                        # The TRUNCATE crashes the table
                insert into t1 values (1);
                        # The INSERT fails on the crashed table
                repair table t1;
                insert into t1 values (1);
                exit

Sometimes the restart of the server is not necessary and sometimes it is.
The problem is not repeatable on my Linux box.
The problem is not repeatable, if 'mysql' is started with the -A option.

Without the -A option, mysql does the following on startup:
- SHOW DATABASES;
- SHOW TABLES;
- Retrieve the field list for every table in the current database.

Using -A and sending the first two commands does not repeat the bug. So the problem lies in the field list retrieval. But it happens only if a merge table with index(es) is present and a truncate is done on a sub-table of a merge table.
[14 Feb 2005 21:35] Ingo Strüwing
The field list retrieval is not guilty. I could achieve the same result with using the mysql -A option. And I could reproduce this on my Linux PC. Using the same preparation as before I did:

2.) Loop forever
        ./bin/mysqladmin -u root shutdown
        ./bin/mysqld_safe &
        ./bin/mysql -u root -A mysql_test1
                select * from t2;
                        # The SELECT opens the MERGE table and implicitly the MyISAM table.
                truncate table t1;
                        # The TRUNCATE crashes the table
                insert into t1 values (1);
                        # The INSERT fails on the crashed table
                repair table t1;
                insert into t1 values (1);
                exit

That is, I replaced the implicit field list retrieval by a simple access of the MERGE table. The problem is that TRUNCATE looks if the table is open and closes it, if necessary. But here we have the MERGE table open. The implicit open of the sub-table is invisible to TRUNCATE. Hence, it modifies the tables files while it is still open. The manual warns about this:

-- begin quote --
14.2.1. MERGE Table Problems
...
 You can't use DROP TABLE, ALTER TABLE, DELETE FROM without a WHERE clause, REPAIR TABLE, TRUNCATE TABLE, OPTIMIZE TABLE, or ANALYZE TABLE  on any of the tables that are mapped into a MERGE table that is ``open.'' If you do this, the MERGE table may still refer to the original table and you get unexpected results. The easiest way to work around this deficiency is to issue a FLUSH TABLES statement to ensure that no MERGE  tables remain ``open.''
...
-- end quote --

The question is, if a table corruption is included in "unexpected results". There may be a chance to let TRUNCATE fail if the table is open, be it explicit or implicit via a MERGE table. I did not look at the other  above mentioned commands though. I ask for advice, how to proceed.
[16 Feb 2005 15:11] Ingo Strüwing
Lacking advice, I made a patch to stop a TRUNCATE, if it tries to re-create a table, which is already open. This will give a weird error message. But it should match the notion of "unexpected results" better than a table corruption.

The changeset is:
bk commit - 4.1 tree (ingo:1.2175) BUG#8306
Date: 	Wed, 16 Feb 2005 16:05:04 +0100
[24 Feb 2005 10:29] Alexander Barkov
The patch looks fine, but a test case is missing.
I suggest to try to cover the problem in the test suite.
[24 Feb 2005 16:23] Ingo Strüwing
The same patch, but with a test case is now available as:
bk commit - 4.1 tree (ingo:1.2190) BUG#8306
Date: 	Thu, 24 Feb 2005 17:07:50 +0100
[25 Feb 2005 9:40] Ingo Strüwing
A new patch, replacing the old one. 
Improved error message. Additional proof that TRUNCATE works after FLUSH TABLES.
bk commit - 4.1 tree (ingo:1.2190) BUG#8306
Date: 	Fri, 25 Feb 2005 10:10:26 +0100
[2 Mar 2005 10:26] Ingo Strüwing
Fixed in 4.1.11.
[7 Mar 2005 0:54] Paul DuBois
Noted in 4.1.11, 5.0.3 changelogs.