Bug #46592 Partition corruption leads to entire table corruption
Submitted: 6 Aug 2009 22:59 Modified: 7 Aug 2009 18:47
Reporter: bhushan uparkar Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Partitions Severity:S2 (Serious)
Version:5.1.30 OS:Linux (Debian Linux 2.6.24-etchnhalf.1-amd64)
Assigned to: CPU Architecture:Any
Tags: Partition corruption table innodb

[6 Aug 2009 22:59] bhushan uparkar
Description:
We use range partitioned tables with innodb engine in dataware house application, and use mysql  Ver 14.14 Distrib 5.1.30 version. Also we use innodb file per table option , which results in file per partition. Recently while performing archiving of partition I run into corrupt partition , then I tried to dump the partition data but had no success. Then I tried to drop the partition again no success ,  and after few minutes I received the error like table does not exists . Here is the mysql error log,

==============================
090805 20:02:08 [ERROR] Failed to open table admob_dm_rep/fact_cpm_day#P#p20070215 after 10 attemtps.

090805 20:02:08 [ERROR] Cannot find or open table admob_dm_rep/fact_cpm_day#P#p20070215 from
the internal data dictionary of InnoDB though the .frm file for the
table exists. Maybe you have deleted and recreated InnoDB data
files but have forgotten to delete the corresponding .frm files
of InnoDB tables, or you have moved .frm files to another database?
or, the table contains indexes that this version of the engine
doesn't support.
See http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
how you can resolve the problem.

090805 22:08:28  InnoDB: Error: table `admob_dm_rep`.090805 22:08:28 [ERROR] Invalid (old?) table or database name 'fact_cpm_day#P#p20070215'
`#mysql50#fact_cpm_day#P#p20070215` does not exist in the InnoDB internal
InnoDB: data dictionary though MySQL is trying to drop it.
InnoDB: Have you copied the .frm file of the table to the
InnoDB: MySQL database directory from another database?
InnoDB: You can look for further help from
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
090805 22:36:45 [Warning] Aborted connection 112562 to db: 'admob_dm_rep' user: 'dbbackup' host: 'localhost' (Got timeout reading communication packets)
==========================

 Ideally I would like following things,
 1. Ability to tell corrupted partition in a table
 2. If one or few partitions are corrupted then return some kind of error informing partition corruption issue.
 3. Do not mark entire table as corrupted as it leads to lot of downtime as well requires lots of resources.
 4. Ability to drop the corrupt partitions  

How to repeat:
Have a range partition table with innodb engine, move one of the partition file [TAB_NAME#P#pXXXXXXXX.ibd ] from default location to '/tmp/' or some where else. And now try to query the table for data belonging to other partition, you will get following error,

{{{
# Please see enclosed attachment for table definition
# 
$ mv fact_cpc_day_ad_hset#P#p20070318.ibd /var/dump/
#

mysql> select count(universal_date_key ) from fact_cpc_day_ad_hset where universal_date_key >= 20070319 and universal_date_key <  20070320;
 ERROR 1146 (42S02): Table 'admob_dm_rep.fact_cpc_day_ad_hset' doesn't exist

}}}

Suggested fix:
 1. Partition corruption will happen because of bad hardware or similar issues , but one needs ability to identify such corrupted partitions and mark them corrupted in information schema . 
 2. Queries referring to corrupt partitions should inform about the corruption issue as well. But queries which do not depend on these corrupt partition should work regularly.
 3. One should be able to either drop/repair corrupted partitions without rebuilding entire table. As in data warehouse applications you are looking tables with few hundred GB worth of data.
[6 Aug 2009 23:00] bhushan uparkar
Table defination for cr_fact_cpc_day_ad_hset.sql

Attachment: cr_fact_cpc_day_ad_hset.sql (application/octet-stream, text), 63.91 KiB.

[7 Aug 2009 0:26] MySQL Verification Team
Thank you for the bug report. When you query the table aren't you getting an error message in your err-log file like below?

090806 21:22:44 [ERROR] MySQL is trying to open a table handle but the .ibd file for
table d0/fact_cpc_day_ad_hset#p#p20090925 does not exist.
Have you deleted the .ibd file from the database directory under
the MySQL datadir, or have you used DISCARD TABLESPACE?
See http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
how you can resolve the problem.
[7 Aug 2009 0:40] bhushan uparkar
I do get error in error log about missing ibd file,

==================
090806 15:38:29  InnoDB: error: space object of table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070318,
InnoDB: space id 1120 did not exist in memory. Retrying an open.
090806 15:38:29  InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
090806 15:38:29  InnoDB: Error: trying to open a table, but could not
InnoDB: open the tablespace file './admob_dm_rep/fact_cpc_day_ad_hset#P#p20070318.ibd'!
InnoDB: Have you moved InnoDB .ibd files around without using the
InnoDB: commands DISCARD TABLESPACE and IMPORT TABLESPACE?
InnoDB: It is also possible that this is a temporary table #sql...,
InnoDB: and MySQL removed the .ibd file for this.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
InnoDB: for how to resolve the issue.
090806 15:38:29  InnoDB: cannot calculate statistics for table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070318
InnoDB: because the .ibd file is missing.  For help, please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
090806 15:38:29 [ERROR] MySQL is trying to open a table handle but the .ibd file for
table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070318 does not exist.
Have you deleted the .ibd file from the database directory under
the MySQL datadir, or have you used DISCARD TABLESPACE?
See http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
how you can resolve the problem.
=========================

But my point is just because one partition is corrupted , dont mark entire table corrupt. Provide some way to deal with these kind of corrupt partitions  so we can recover only these one.
[7 Aug 2009 7:50] Sveta Smirnova
Thank you for the report.

I can not repeat described behavior with current development sources while bug is repeatable with version 5.1.30. please upgrade to current version 5.1.37
[7 Aug 2009 18:43] bhushan uparkar
Hello,

I hit one more partition corruption issue yesterday i.e. on 8/7/2009 , here also one of the partition got corrupted and end result the whole table is marked as corrupted. In this case the table is partitioned table with around 950 partitions and together the size is about 200GB. Here is the error log from mysql,

=========================================
cdroot@:~# cd /mnt/data1/mysql/
root@:mysql# ls -lt *err
-rw-rw---- 1 mysql mysql 25978 2009-08-07 14:21 dmdb901.err
root@:mysql# ls -lt *err*
-rw-rw---- 1 mysql mysql     25978 2009-08-07 14:21 dmdb901.err
-rw-rw---- 1 mysql mysql 164260510 2009-08-07 05:07 dmdb901.err-old
root@:mysql# cp dmdb901.err-old ~dbbackup/dmdb901.err-old_2009_08_07
root@:mysql# less dmdb901.err-old 
090807  1:58:21 [ERROR] Failed to open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 after 10 attemtps.

090807  1:58:21 [ERROR] Cannot find or open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 from
the internal data dictionary of InnoDB though the .frm file for the
table exists. Maybe you have deleted and recreated InnoDB data
files but have forgotten to delete the corresponding .frm files
of InnoDB tables, or you have moved .frm files to another database?
or, the table contains indexes that this version of the engine
doesn't support.
See http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
how you can resolve the problem.

090807  1:58:24 [ERROR] Failed to open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 after 10 attemtps.

090807  1:58:24 [ERROR] Cannot find or open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 from
the internal data dictionary of InnoDB though the .frm file for the
table exists. Maybe you have deleted and recreated InnoDB data
files but have forgotten to delete the corresponding .frm files
of InnoDB tables, or you have moved .frm files to another database?
or, the table contains indexes that this version of the engine
doesn't support.
See http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
how you can resolve the problem.

090807  1:58:24 [ERROR] Slave SQL: Error 'Table 'admob_dm_rep.fact_cpc_day_ad_hset' doesn't exist' on opening tables, Error_code: 1146
090807  1:58:24 [Warning] Slave: Table 'admob_dm_rep.fact_cpc_day_ad_hset' doesn't exist Error_code: 1146
090807  1:58:24 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.033795' position 7658511
090807  2:05:38 [ERROR] Failed to open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 after 10 attemtps.

090807  2:05:38 [ERROR] Cannot find or open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 from
the internal data dictionary of InnoDB though the .frm file for the
table exists. Maybe you have deleted and recreated InnoDB data
files but have forgotten to delete the corresponding .frm files
of InnoDB tables, or you have moved .frm files to another database?
or, the table contains indexes that this version of the engine
doesn't support.
See http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
how you can resolve the problem.

090807  2:05:54 [ERROR] Failed to open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 after 10 attemtps.

090807  2:05:54 [ERROR] Cannot find or open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 from
the internal data dictionary of InnoDB though the .frm file for the
table exists. Maybe you have deleted and recreated InnoDB data
files but have forgotten to delete the corresponding .frm files
of InnoDB tables, or you have moved .frm files to another database?
or, the table contains indexes that this version of the engine
doesn't support.
See http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
how you can resolve the problem.

090807  3:10:12 [ERROR] Failed to open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 after 10 attemtps.

090807  3:10:12 [ERROR] Cannot find or open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 from
the internal data dictionary of InnoDB though the .frm file for the
table exists. Maybe you have deleted and recreated InnoDB data
files but have forgotten to delete the corresponding .frm files
of InnoDB tables, or you have moved .frm files to another database?
or, the table contains indexes that this version of the engine
doesn't support.
See http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
how you can resolve the problem.

090807  3:10:29 [ERROR] Failed to open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 after 10 attemtps.

090807  3:10:29 [ERROR] Cannot find or open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 from
the internal data dictionary of InnoDB though the .frm file for the
table exists. Maybe you have deleted and recreated InnoDB data
files but have forgotten to delete the corresponding .frm files
of InnoDB tables, or you have moved .frm files to another database?
or, the table contains indexes that this version of the engine
doesn't support.
See http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
how you can resolve the problem.

090807  3:10:34 [ERROR] Failed to open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 after 10 attemtps.

090807  3:10:34 [ERROR] Cannot find or open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 from
the internal data dictionary of InnoDB though the .frm file for the
table exists. Maybe you have deleted and recreated InnoDB data
files but have forgotten to delete the corresponding .frm files
of InnoDB tables, or you have moved .frm files to another database?
or, the table contains indexes that this version of the engine
doesn't support.
See http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
how you can resolve the problem.

090807  3:17:57 [ERROR] Failed to open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 after 10 attemtps.

090807  3:17:57 [ERROR] Cannot find or open table admob_dm_rep/fact_cpc_day_ad_hset#P#p20070120 from
the internal data dictionary of InnoDB though the .frm file for the
table exists. Maybe you have deleted and recreated InnoDB data
files but have forgotten to delete the corresponding .frm files
of InnoDB tables, or you have moved .frm files to another database?
or, the table contains indexes that this version of the engine
doesn't support.
See http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
how you can resolve the problem.

====================================

 This is second time in 2 days in run into similar bug. I dont know how to simulate partition corruption issue, but the method I suggested is useless as well.  
 There are few points worth mentioning,

1. If one of the partition is corrupted , then Innodb marks entire table corrupted. There is little or no support for partitioning in innodb when one of the partition gets corrupted.  
2. In this particular case I was planning to take partition dump using mysqldump, at which point the partition corruption is discovered. And after which the whole table is marked as corrupted. Here is what I see from mysql prompt,

===============================
mysql> alter table fact_cpc_day_ad_hset CHECK PARTITION p20070318;
+-----------------------------------+-------+----------+---------------------------------------------------------+
| Table                             | Op    | Msg_type | Msg_text                                                |
+-----------------------------------+-------+----------+---------------------------------------------------------+
| admob_dm_rep.fact_cpc_day_ad_hset | check | Error    | Table 'admob_dm_rep.fact_cpc_day_ad_hset' doesn't exist | 
| admob_dm_rep.fact_cpc_day_ad_hset | check | status   | Operation failed                                        | 
+-----------------------------------+-------+----------+---------------------------------------------------------+
[7 Aug 2009 18:47] bhushan uparkar
Here are ibd files for the partitioned table , where you see the corrupted partition file as well.

====================
root@bhushandev:/var/lib/mysql/admob_dm_rep# ls -lt fact_cpc_day_ad_hset#P#p200703*ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070322.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070323.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070324.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070325.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070326.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070327.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070328.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070329.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070330.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070331.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070301.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070302.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070303.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070304.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070305.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070306.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070307.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070308.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070309.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070310.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070311.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070312.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070313.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070314.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070315.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070316.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070317.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070318.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070319.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070320.ibd
-rw-rw---- 1 mysql mysql 393216 2009-08-05 21:36 fact_cpc_day_ad_hset#P#p20070321.ibd

==========================================
These listed partitions do not have any data as matter of fact, and still it ends being corrupted.

Let me know if you need additional information on this.