| Bug #10132 | Crashing the server on corrupt InnoDB page is unhelpful | ||
|---|---|---|---|
| Submitted: | 25 Apr 2005 5:00 | Modified: | 25 May 2005 21:25 |
| Reporter: | James Day | ||
| Status: | Verified | ||
| Category: | Server: InnoDB | Severity: | S4 (Feature request) |
| Version: | 5.1, 6.0 | OS: | Linux (Linux) |
| Assigned to: | Heikki Tuuri | Target Version: | TBD |
| Triage: | D5 (Feature request) | ||
[25 Apr 2005 5:00]
James Day
[25 Apr 2005 11:21]
Heikki Tuuri
James, nice meeting you at the UC 2005!

An undo log page or, even worse, an allocation bitmap page can also become corrupt. Then there is no associated table. I think InnoDB currently does not know for sure which index of which table a page belongs to, unless you use the innodb_file_per_table option.

Marking a corrupt table 'crashed', as in MyISAM, is something we could possibly do. There is no built-in error return mechanism in InnoDB's B-tree operations; it is a fatal error if one of them encounters a corrupt page. But we could mark the table as 'crashed' and restart mysqld, so that further queries would be blocked until the DBA marks the table as OK again.

A couple of years ago, typical Linux file corruption was so severe that InnoDB would probably seg fault if it accessed a corrupt page at all. And file corruption often affects many pages, so crashes are inevitable. It may be that corruption is less severe nowadays.

As a first step, we could change the InnoDB behavior and let it use a corrupt page. Hmm... then there is a risk of further corruption. That is bad :(. Note also that sometimes rebooting the computer fixes the file corruption, if it is only present in the file cache of the OS. We should then remove the 'crashed' mark from the table.

As you see, changing the current behavior also has some downsides. I think the MyISAM approach is the best. But we must also let the DBA remove the 'crashed' mark without running CHECK TABLE, because that can take hours.

The next question is where we can store that 'crashed' bit. The .frm file would be an easy place. Another possibility is the InnoDB internal data dictionary. If the bit were set, InnoDB would refuse to open the table at all, unless innodb_force_recovery is used.

The plan:

1. Find out if we can always recognize what table a corrupt page belongs to. If innodb_file_per_table is used, at least then we know it for sure.
2. Put a bit in the InnoDB internal data dictionary, to mark the corruption of a table.
3. Refuse to open a table where the bit is set.
4. Give the DBA some SQL command to erase that bit.

This is something for 5.1 or 5.2.

Regards,
Heikki
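Steps 2 to 4 of the plan can be illustrated with a minimal Python sketch. It is only a model of the proposed behavior, not InnoDB code: the `DataDictionary` class, its method names and the `TableCorruptError` exception are hypothetical, and `force_recovery` merely stands in for the innodb_force_recovery setting.

```python
from dataclasses import dataclass, field


class TableCorruptError(Exception):
    """Raised when a table flagged as corrupt is opened (hypothetical)."""


@dataclass
class DataDictionary:
    """Hypothetical stand-in for the InnoDB internal data dictionary."""
    corrupt: set = field(default_factory=set)
    force_recovery: int = 0  # models innodb_force_recovery; 0 means off

    def mark_corrupt(self, table: str) -> None:
        # Step 2: record the corruption bit for the table.
        self.corrupt.add(table)

    def open_table(self, table: str) -> str:
        # Step 3: refuse to open a flagged table unless recovery is forced.
        if table in self.corrupt and self.force_recovery == 0:
            raise TableCorruptError(f"table {table} is marked corrupt")
        return f"handle:{table}"

    def clear_corrupt(self, table: str) -> None:
        # Step 4: what a DBA-facing SQL command might end up calling.
        self.corrupt.discard(table)
```

Clearing the bit does not check the table in any way, which matches the requirement that the DBA can remove the mark without a long CHECK TABLE run.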
[26 Apr 2005 0:24]
James Day
It was good to meet you and the others there as well - lots of productive discussions! It was interesting to note how many people were recommending InnoDB as the engine of choice. That's what I did as well, for anyone with reliability and availability needs, when asked during the Wikipedia presentation. The handling of corrupt pages is the biggest problem with that, which is why you're reading this report. :)

I like your use of the word "currently" for not knowing which table is affected. :) My first reaction was to tell the B-tree engine the type of page, database and table it is being asked for. Then it can report the problem and may have ways to attempt a partial repair, or to return a crash-proof but "empty" page marked "don't save", sufficient to allow data to be dumped. Sadly, that is sure to be desirable for some users, who can be expected not to have a backup or binlog.

I wonder if it is possible to read enough data to give a good chance of purging the OS cache before trying another read of the page? We would still need to tell the DBA, but it might work tolerably well in a bad situation. Of course, people would inevitably start reporting "InnoDB is slow" bugs because of this, without fixing the real cause. :(

The plan seems good. It would be a very big improvement over the current version when bad things are happening.
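The retry idea could look roughly like the Python sketch below. It makes several assumptions: instead of reading large amounts of unrelated data to push the page out of the cache, it asks the kernel to drop the cached copy with `posix_fadvise(POSIX_FADV_DONTNEED)`, which is only a hint and may have no effect. The checksum layout (a CRC32 in the first four bytes) and the names `page_ok` and `read_page_with_retry` are invented for illustration and are not InnoDB's real page format. `os.pread` is available on Unix-like systems only.

```python
import os
import zlib

PAGE_SIZE = 16384  # InnoDB's default page size


def page_ok(page: bytes) -> bool:
    # Hypothetical layout: the first 4 bytes hold a CRC32 of the rest.
    # This is NOT InnoDB's real checksum format.
    stored = int.from_bytes(page[:4], "big")
    return len(page) == PAGE_SIZE and stored == zlib.crc32(page[4:])


def read_page_with_retry(path: str, page_no: int, retries: int = 1):
    """Return the page bytes, or None if every read fails verification."""
    offset = page_no * PAGE_SIZE
    fd = os.open(path, os.O_RDONLY)
    try:
        for _ in range(retries + 1):
            page = os.pread(fd, PAGE_SIZE, offset)
            if page_ok(page):
                return page
            if hasattr(os, "posix_fadvise"):
                # Hint that the cached copy of this range can be dropped,
                # so the next read has a chance of going to the disk.
                os.posix_fadvise(fd, offset, PAGE_SIZE,
                                 os.POSIX_FADV_DONTNEED)
        return None
    finally:
        os.close(fd)
```

A `None` result is the point at which the DBA would still have to be told, since the page failed verification on every attempt.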
