| Bug #41609 | Crash recovery does not work for InnoDB temporary tables | ||
|---|---|---|---|
| Submitted: | 19 Dec 2008 0:00 | Modified: | 7 Jul 8:28 |
| Reporter: | Mark Callaghan | ||
| Status: | Verified | ||
| Category: | Server: InnoDB | Severity: | S2 (Serious) |
| Version: | 5.0.67,5.1.30 | OS: | Any |
| Assigned to: | Marko Mäkelä | Target Version: | 5.1+ |
| Tags: | innodb, temporary, table, ibbackup, recovery, create | ||
| Triage: | Triaged: D3 (Medium) / R2 (Low) / E2 (Low) | ||
[19 Dec 2008 0:31]
Gary Pendergast
Crash recovery problem confirmed on: 5.0.74 5.1.30
[19 Dec 2008 1:06]
Mark Callaghan
Actually,the workaround is to use skip_innodb_file_per_table
[19 Dec 2008 17:13]
Michael Izioumtchenko
Assigning to Inaam tentatively. Note that the hotbackup issues is known and I think there's a bug number for that. The workaround for HotBackup should be creating the directory with the long strange looking name, before the restore, e.g. /var/lib/mysql/export/hda3/tmp status 'Need Feedback': could you provide us with the exact steps to reproduce this without involving HotBackup? My impression is that if we just restart the server everything should work since the directories names are all the same.
[19 Dec 2008 18:11]
Mark Callaghan
To reproduce: * configure server with innodb_file_per_table and datadir set * execute sql 'create temporary table tt(i int) engine=innodb; commit' * kill mysqld * look at error log When innodb_file_per_table is set there will be error messages in the error log. When it is not set there will not be error messages. But the more serious problem is to construct a case where the MLOG_FILE_CREATE log entry is replayed during crash recovery. If you can construct that case, then I think the server is dead as it will crash on an assert failure in file_op_log_parse_or_replay() when file_create_new_single_table_tablespace() fails. Alas, I was not able to construct that case using the test above. Creating missing directories does not work as the bogus names have an extra '/' /var/lib/mysql//export/hda3/...
[19 Dec 2008 19:44]
Michael Izioumtchenko
How does an extra // hurt? It's supposed to be treated as a single / by anything that work with files.
[19 Dec 2008 19:56]
Mark Callaghan
It doesn't hurt. That was a mistake by me. Your workaround works.
[22 Dec 2008 11:04]
Marko Mäkelä
The double // or \\ could have a special meaning on Microsoft Windows. On Unix-like systems, multiple successive / should not hurt.
[5 Mar 14:07]
Heikki Tuuri
Pekka, ibbackup should skip .ibd files it does not find. --Heikki
[5 Mar 14:08]
Heikki Tuuri
...and InnoDB itself should skip .ibd files it does not find.
[5 Mar 14:09]
Heikki Tuuri
Workaround: use MyISAM tables as temp tables.
[5 Mar 14:17]
Heikki Tuuri
Marko wants to take care of this.
[28 Apr 13:42]
Marko Mäkelä
It turns out that we can pass the is_temp flag in the page_no parameter to mlog_write_initial_log_record_for_file_op(). Luckily, it looks like the page_no has been passed as 0 ever since MySQL 4.1. Old versions of InnoDB would ignore the page_no parameter, and new ones would assign a bit of page_no to the is_temp flag.
[29 Apr 13:27]
Marko Mäkelä
As far as I can tell, the redo log entries for file operations are never replayed in InnoDB crash recovery. They are replayed in InnoDB Hot Backup, and in ibbackup --apply-log, it makes sense to skip the replaying when the is_temp flag is set. The error messages about lost *.ibd files for temporary tables are a different matter. I haven't investigated the fix yet. One possibility would be to store the is_temp flag in the InnoDB data dictionary, if we can find a good way of doing it. At least for ROW_FORMAT=COMPACT (the default format), it should be doable. Old versions of InnoDB didn't initialize unused fields to 0, and fields that could contain random garbage cannot be used for new flags.
[14 Jul 17:29]
Heikki Tuuri
The reason why this works in most MySQL installations may be that the temp dir is normally /tmp. Then InnoDB replays temp table operations in a 'tmp' subdir that it creates under the MySQL datadir. That is also a workaround: set /tmp as your temp dir for mysqld. The root cause of these temp table recovery problems is that MySQL puts temp tables to a temp dir of the computer, not to a database directory. Backup programs have trouble finding the temp tables from the temp dir of the computer.
[3 Sep 15:42]
Marko Mäkelä
The fix that I described on [28 Apr 13:42] is included in InnoDB Plugin 1.0.4. Part of this bug remains unfixed. Quoting my comment from [29 Apr 13:27]: The error messages about lost *.ibd files for temporary tables are a different matter. I haven't investigated the fix yet. One possibility would be to store the is_temp flag in the InnoDB data dictionary, if we can find a good way of doing it. At least for ROW_FORMAT=COMPACT (the default format), it should be doable. Old versions of InnoDB didn't initialize unused fields to 0, and fields that could contain random garbage cannot be used for new flags.

Description: I don't think that InnoDB crash recovery code doesn't work for temporary tables when innodb_file_per_table and datadir are set. I think that ibbackup has the same problem, but I don't expect you to fix that. We noticed the problem when ibbackup restores failed with error messages like: InnoDB: Doing recovery: scanned up to log sequence number 77 2578521600 081218 8:11:09 InnoDB: Operating system error number 2 in a file operation. InnoDB: The error means the system cannot find the path specified. InnoDB: File name /var/lib/mysql//export/hda3/tmp/#sql14c7_a9fa97_0.ibd InnoDB: File operation call: 'create'. InnoDB: Cannot continue operation. The pathname is odd: /var/lib/mysql//export/hda3/tmp/#sql14c7_a9fa97_0.ibd '/var/lib/mysql/' == datadir from my.cnf '/export/hda3/tmp/#sql14c7_a9fa97_0.ibd' is the pathname for the temp table. The problem appears to be this code in fil_op_log_parse_or_replay that is run to replay file creation operations. The third argument in the call to fil_create_new_single_table_tablespace is FALSE. That is the 'is_temp' arg, so recovery is always done as if the file is for a non-temporary table. Unfortunately, fil_create_new_single_table_tablespace calls fil_make_ibd_name and passes on the value of 'is_temp' } else { ut_a(type == MLOG_FILE_CREATE); if (fil_tablespace_exists_in_mem(space_id)) { /* Do nothing */ } else if (fil_get_space_id_for_table(name) != ULINT_UNDEFINED) { /* Do nothing */ } else { /* Create the database directory for name, if it does not exist yet */ fil_create_directory_for_tablename(name); ut_a(space_id != 0); ut_a(DB_SUCCESS == fil_create_new_single_table_tablespace( &space_id, name, FALSE, FIL_IBD_FILE_INITIAL_SIZE)); And file_make_ibd_name has this code that will prepend the value of datadir to the pathname written to the log entry which explains the pathname I listed above. I think this is also a problem for ibbackup and the storage engine. The default value of fil_path_to_mysql_datadir is '.' and can be changed by datadir. For http://bugs.mysql.com/bug.php?id=20867 the bad pathname starts with './tmp/' if (is_temp) { memcpy(filename, name, namelen); memcpy(filename + namelen, ".ibd", sizeof ".ibd"); } else { memcpy(filename, fil_path_to_mysql_datadir, dirlen); filename[dirlen] = '/'; memcpy(filename + dirlen + 1, name, namelen); memcpy(filename + dirlen + namelen + 1, ".ibd", sizeof ".ibd"); } Some of this is evident in http://bugs.mysql.com/bug.php?id=20867. Look at the pathname for the tmp table. It starts with '.': InnoDB: open the tablespace file './tmp/@0023sql14dc_1607_0.ibd How to repeat: It is easy to reproduce this with ibbackup. While ibbackup is running, create InnoDB temporary tables. It is harder to reproduce without ibbackup. I create an InnoDB temp table and then kill mysqld. During crash recovery an error message is printed with the incorrect pathname for the temp table (starts with '.') datadir and innodb_file_per_table should be set to reproduce this. But I think that someone from InnoDB can save you some time because the problem looks obvious to me from reading the code. Suggested fix: Crash recovery code must determine whether the MLOG_FILE_CREATE log entry is for a temporary or non-temporary table.