| Bug #1649 | Race condition with Temporary tables during Alters | ||
|---|---|---|---|
| Submitted: | 24 Oct 2003 15:21 | Modified: | 1 Dec 2003 13:37 |
| Reporter: | Dathan Pattishall | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server | Severity: | S2 (Serious) |
| Version: | 4.0.15 | OS: | Linux (Linux ef248.friendfinderinc.com) |
| Assigned to: | CPU Architecture: | Any | |
[1 Dec 2003 13:21]
Dean Ellis
I cannot repeat this with the latest sources from the 4.0.17 BitKeeper repository. You might try testing it with a newer release. Thank you.
[1 Dec 2003 13:37]
Guilhem Bichot
Sorry Dean; I discussed this bug on mysql@lists and bugs@lists where it was originally reported, and did not notice that it had been entered here as well.
Below I paste the main part of the discussion. The conclusion can be summed up like this: "it's not a bug, but our doc should clearly state that it's not possible to have synonyms with symlinks" (I have fixed the doc since). Sorry again for making you waste some time :(
From: Guilhem Bichot <guilhem@mysql.com>
To: Dathan Vance Pattishall <dathan@friendfinderinc.com>
Cc: mysql@lists.mysql.com, bugs@lists.mysql.com, dlenev@mysql.com, sinisa@mysql.com
Subject: RE: reproducible error 17
Date: 28 Oct 2003 00:15:04 +0100
> -->Are you using symlinks to place tables or databases at other
> locations
> -->than the datadir?
> Yes, in the db directory under the datadir. I have tables which are one
> name symlinked to the tables which where altered. These fake tables or
> tables which are synonyms to the src tables did not have any actions
> performed on them at this moment in time.
>
> For example
>
> recent_members.frm -> /var/lib/mysql/ffadult_recent/recentmembers.frm
> recent_members.MYI -> /var/lib/mysql/ffadult_recent/recentmembers.MYI
> recent_members.MYD -> /var/lib/mysql/ffadult_recent/recentmembers.MYD
>
> The tables altered where recentmembers(_*)
I have run some tests and what triggers the problems is this symlinking.
If I understood you correctly, your setup is analogous to mine which is:
- the datadir of my MySQL server is /m/data/4/1/
- in this I have a database called test: /m/data/4/1/test/
- I have this in /m/data/4/1/test/ :
lrwxrwxrwx 1 guilhem qq 24 Oct 27 23:25 tbl_.frm ->
/m/data/4/1/test/tbl.frm
-rw-rw---- 1 guilhem qq 8620 Oct 27 23:30 tbl.frm
lrwxrwxrwx 1 guilhem qq 24 Oct 27 23:26 tbl_.MYD ->
/m/data/4/1/test/tbl.MYD
-rw-rw---- 1 guilhem qq 84 Oct 27 23:30 tbl.MYD
lrwxrwxrwx 1 guilhem qq 24 Oct 27 23:26 tbl_.MYI ->
/m/data/4/1/test/tbl.MYI
-rw-rw---- 1 guilhem qq 1024 Oct 27 23:30 tbl.MYI
("tbl_" is a "synonym" for the real "tbl" table).
On my master (no replication) I got:
MASTER> flush tables;
Query OK, 0 rows affected (0.00 sec)
MASTER> desc tbl;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| a | int(11) | YES | | NULL | |
| b | int(11) | YES | | NULL | |
| c | int(11) | YES | | NULL | |
| d | int(11) | YES | | NULL | |
| e | int(11) | YES | | NULL | |
| f | int(11) | YES | | NULL | |
| g | int(11) | YES | | NULL | |
| h | int(11) | YES | | NULL | |
| i | int(11) | YES | | NULL | |
| k | int(11) | YES | | NULL | |
| l | int(11) | YES | | NULL | |
| m | int(11) | YES | | NULL | |
| n | int(11) | YES | | NULL | |
| o | int(11) | YES | | NULL | |
| p | int(11) | YES | | NULL | |
| q | int(11) | YES | | NULL | |
| r | int(11) | YES | | NULL | |
| s | int(11) | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
18 rows in set (0.00 sec)
MASTER> flush tables;
Query OK, 0 rows affected (0.00 sec)
MASTER> alter table tbl drop s;
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
This show that altering the real table is ok.
Now we use the synonym (the symbolic link) instead:
MASTER> desc tbl_;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| a | int(11) | YES | | NULL | |
| b | int(11) | YES | | NULL | |
| c | int(11) | YES | | NULL | |
| d | int(11) | YES | | NULL | |
| e | int(11) | YES | | NULL | |
| f | int(11) | YES | | NULL | |
| g | int(11) | YES | | NULL | |
| h | int(11) | YES | | NULL | |
| i | int(11) | YES | | NULL | |
| k | int(11) | YES | | NULL | |
| l | int(11) | YES | | NULL | |
| m | int(11) | YES | | NULL | |
| n | int(11) | YES | | NULL | |
| o | int(11) | YES | | NULL | |
| p | int(11) | YES | | NULL | |
| q | int(11) | YES | | NULL | |
| r | int(11) | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
17 rows in set (0.00 sec)
MASTER> alter table tbl_ drop r;
ERROR 25: Can't create symlink './test/#sql-1aa7_4.MYI' pointing at
'/m/data/4/1/test/#sql-1aa7_4.MYI' (Error 17)
MASTER> alter table tbl_ drop r;
ERROR 25: Can't create symlink './test/#sql-1aa7_4.MYI' pointing at
'/m/data/4/1/test/#sql-1aa7_4.MYI' (Error 17)
This is the same error as you: it's impossible to alter the synonym.
What happens is that "ALTER TABLE tbl_ ...":
- Creates temporary files (named #sql*) in the same place as the tbl_
files, which is ./ (which is also /m/data/4/1/test btw). Those temporary
files are the result of the ALTER TABLE (the new frm/MYD/MYI files).
- Notices that the tbl_ files are in fact symlinks to files in
/m/data/4/1/test; so it wants to move the temporary files from ./ to
/m/data/4/1/test (ALTER wants, in some way, to preserve the symbolic
links). As the 2 directories are the same, the move fails.
Note that this error occured often on my system but not always,
depending on which table was accessed first (see below), which could
explain why you got it on slave only (on slave all queries are performed
by one unique thread).
Here is another problem:
MASTER> flush tables;
Query OK, 0 rows affected (0.00 sec)
MASTER> alter table tbl_ add t int;
ERROR 25: Can't create symlink './test/#sql-1aa7_4.MYI' pointing at '/m/data/4/1/test/#sql-1aa7_4.MYI' (Error 17)
MASTER> alter table tbl add t int;
ERROR 25: Can't create symlink './test/#sql-1aa7_4.MYI' pointing at '/m/data/4/1/test/#sql-1aa7_4.MYI' (Error 17)
you see, now even the alter on the real table fails.
This is because in the MySQL table cache, the real and synonym are
considered one same table; as MySQL first opened the synonym, it also
uses the synonym when the real table is invoked ("it's the same table"),
leading to the error for the real table too. If now I empty the table
cache:
MASTER> flush tables;
Query OK, 0 rows affected (0.00 sec)
MASTER> alter table tbl add t int;
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
it works.
So the conclusion is: unfortunately, the symlink support in MySQL was
not designed for "synonyming", as far as DDL (Data Definition Language -
CREATE TABLE / DROP TABLE / ALTER TABLE) commands are concerned. It was
designed with the thought that symlinks are to be used to point to a
*different* directory (another partition where there is more room, or
another device to balance disk load). For DDL commands MySQL always
expects a table to exist only once, i.e. to have only one name. Putting,
in the database directory, a symlink and the real table means giving 2
names to one table...
I will add a note about this into our manual soon. I understand this is
is an inconvenience for you; you will be safe if you always do the DDL
commands (ALTER TABLE, in your case) on the real table. It's ok to do
DML commands (INSERT/DELETE/UPDATE/LOADDATA, which fortunately occur
much more often than ALTER TABLE normally) on both tables indifferently.
Guilhem

Description: *************************** 1. row *************************** Master_Host: 10.2.12.224 Master_User: rep Master_Port: 3306 Connect_retry: 60 Master_Log_File: ef224-bin.020 Read_Master_Log_Pos: 409223612 Relay_Log_File: ef242-relay-bin.016 Relay_Log_Pos: 330063332 Relay_Master_Log_File: ef224-bin.020 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_do_db: Replicate_ignore_db: Last_errno: 25 Last_error: Error 'Can't create symlink './ff_recent/#sql-17f7_c.MYI' pointing at '/var/lib/mysql/ff_recent/#sql-17f7_c.MYI' (Error 17)' on query 'ALTER TABLE recentmembers_20 add have_children tinyint unsigned not NULL default 0'. Default database: 'ff_recent' Skip_counter: 0 Exec_master_log_pos: 348632736 Relay_log_space: 390689457 Perror says Error code 17: File exists I have 2 alters submitted on the master in sequential order. On the slave it executes in the same order but produces error 17. Somehow the tempfile already exists, but when I go to the datadir this temp file is not displayed in the directory. Performing a slave stop slave start does not work. Restarting the mysql server on the slave is the only method I found to recover from the error. Version of mySQL: 4.0.15 Kernel: Linux ef242.friendfinderinc.com 2.4.18-4smp #1 SMP Sun Apr 14 13:50:35 PDT 2002 i686 unknown *PATCHED* How to repeat: Apply 2 alters on a master in sequentail order across 53 tables with about 600K rows in each. Wait for the slave to throw the error that the temporary table exists and can't be symlinked Suggested fix: Don't symlink the tables unlink before next execution use a larger random temporary file name not based on the table name.