Bug #63553 | innodb table lost during alter table when rename fails because #sql*.frm exists | ||
---|---|---|---|
Submitted: | 3 Dec 2011 2:36 | Modified: | 30 Jan 2012 17:32 |
Reporter: | Mark Callaghan | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: InnoDB Plugin storage engine | Severity: | S2 (Serious) |
Version: | 5.1.52, 5.1.60 | OS: | Any |
Assigned to: | Assigned Account | CPU Architecture: | Any |
Tags: | alter, innodb, lost, table |
[3 Dec 2011 2:36]
Mark Callaghan
[3 Dec 2011 15:40]
Valeriy Kravchuk
Thank you for the bug report. The first problem (weak filename generation) should probably be handled separately as generic, non-InnoDB bug.
[3 Dec 2011 21:20]
Peter Laursen
Mark blogged about this too and in a comment to his blog I asked if not the server could catch the exception and simply generate a new tempfile name and try again. Mark (who knows server internals well - what I don't) replied that the code generating the table name and the code using it are far away from each other. I have difficulty accepting this as a valid reason for rejecting the idea. This report is about data loss. But I have seen (less serious) statement failures dozens of times (CREATE TEMPORARY TABLE, SELECT FROM VIEW, SELECT ...JOIN .., SELECT .. UNION ..) due to old garbage left behind in /tempdir after a server crash or incorrect server shutdown ('kill').
[10 Jan 2012 20:32]
Michael Golledge
Thinking that if the file generated is known to follow a particular pattern plus current_pid, then at process startup, before any queries have been processed, any file that in the temp dir that fits that pattern with the now current PID can be assumed to have been left over from some previous time when this process was running. So, another potential solution is to, at startup, clean the temp dir of previous leftovers.
[12 Jan 2012 17:49]
Hamilton Oh
Cleaning the temp dir of previous leftovers, as Michael suggests, might also help with this long-standing bug too: http://bugs.mysql.com/bug.php?id=45976
[27 Jan 2012 21:44]
Kevin Lewis
This bug does not fail as it is described. The original log shows that the rename failed after this log entry; 111202 15:34:46 InnoDB: Warning: problems renaming 'd1/xm' to 'd1/#sql2-3e21-41', 25001 iterations This failure to rename a tablespace file is documented in Bug#62100 and is actually being fixed by Inamm Rana. If that bug happens, there is a changes that an InnoDB tablespace file may not be renamed back to the original file name and thus be lost. Here is what happens; Assume that t1 is stored in a a file-per-table tablespace and it is altered; ALTER TABLE t1 engine=innodb; MySQL will first rename 'test/t1' to 'test/#sql2-nnn-nn', then rename the new altered table, 'test/#sql-nnn_nn' back to 'test/t1'. If that succeeds, it then deletes 'test/#sql2-nnn-nn'. So there are two renames. If the first one fails, then nothing is lost. 'test/t1' is still there. If the second one fails, the code in sql_table.cc, mysql_alter_table() will rename 'test/#sql2-nnn-nn' is back to 'test/t1'. But if the second rename fails after 25000 tries AND the recovery rename also fails after 25000 tries, then the 'test/t1' file will indeed be lost, or at least not visible to InnoDB or MySQL. It will look like a leftover temp file. I did this in the debugger in file_rename_tablespace(), forcing the execution into the if (count > 25000) { section for the second rename and the recovery rename. The result was mysql> show tables; Empty set (0.00 sec) and Directory of mysql-test\var\mysqld.1\data\test 01/27/2012 11:04 AM 8,578 #sql2-e64-1.frm 01/27/2012 11:04 AM 98,304 #sql2-e64-1.ibd Then I was able to stop the engine, rename these files back to t1.frm and t1.ibd, and when I restarted the engine t1 was once again visible and usable. In summary, it seems highly unlikely that this scenario would happen once Bug62100 is fixed. It cannot happen just by the existence of another leftover temp file that happens to be the same name as the current temp file name because the ALTER TABLE process starts with the real filename and tries to rename it to that existing temp file. If that fails the original file is still there. The failure described above is a failure to rename a temp altered file back to to original filename, followed by a failure to rename the original file with a temp name back to the original name. That could only happen if the original filename got duplicated right in the middle of this renaming, which is normally done very quickly, unless the Bug#62100 happens.
[27 Jan 2012 21:56]
Mark Callaghan
There are two parts to this bug. 1) the lost table that inspired me to look at this 2) the reproduction case that I provided While the bug that inspired me to look at this might be a duplicate with 62100, my test case shows that if os_file_rename fails when called by fil_rename_tablespace then the ibd file is lost. This outcome is worse then what happens for bug 62100 because in that case the ibd file is still there, it just has the wrong name.
[28 Jan 2012 20:07]
Mark Callaghan
Kevin - I was confused by what you wrote, but I think you mean that the ibd file isn't lost when row_rename_table_for_mysql is changed to not hide the error from dict_table_rename_in_cache. You are right about that.
[30 Jan 2012 14:40]
Kevin Lewis
Mark, I have tried to make a 'missing' file using your testcase described in this bug, but no files are missing or left named incorrectly. The testcase here changes the code so that fil_rename_tablespace() never succeeds. During an ALTER, it is first called to rename the old file to a temp fil of the form 'database/#sql2-nnn-nn'. If this fails, the old file is still there. The error is reported all the way back to mysql_rename_table() and then mysql_alter_table(). Everything is cleaned up. I do not see how it is possible to recreate a 'missing' tablespace by changing the code the way this testcase does. But if the bug in Bug#62100 is involved this could happen; 1) the first rename succeeds, making a file like this; 'database/#sql-nnn_nn.ibd', 2) the second rename fails, which would have renamed the altered table back to the original file name, followed by 3) a failure of the recovery rename, which would rename 'database/#sql2-nnn-nn.ibd' back to the original file name, then you might have a 'missing' file. I also looked into the possibility that a pre-existing 'database/#sql2-nnn-nn.frm' file would cause this. The problem is that the mysql function that renames FRM files, my_rename(), seems to overwrite existing files unless HAVE_FILE_VERSIONS is defined, but I cannot find any other references to that macro in the whole codebase. But even if the frm file was leftover and pre-existed, mysql_rename_table would see that error and would call InnoDB again to rename the 'database/#sql2-nnn-nn.frm' file back to the original file name. So I don't see how a missing file could happen once Bug#62100 is fixed. The logic requires at least 2 renames to fail.
[30 Jan 2012 17:31]
Mark Callaghan
I think we agree that this is a duplicate of 62100
[30 Jan 2012 17:32]
Valeriy Kravchuk
So, let's consider this a duplicate of bug #62100 for now.