Bug #89487 ALTER TABLE hangs in "Waiting for tablespace metadata lock" state
Submitted: 31 Jan 2018 18:44 Modified: 20 Mar 2018 21:23
Reporter: Sveta Smirnova (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Locking Severity:S3 (Non-critical)
Version:8.0.4 OS:Any
Assigned to: CPU Architecture:Any
Tags: regression

[31 Jan 2018 18:44] Sveta Smirnova
Description:
ALTER TABLE hangs in "Waiting for tablespace metadata lock" state with version 8.0, but does not hangs with version 5.7

How to repeat:
Copy attached file to t subdirectory of mysql-test directory, then run

./mtr rpl_tablespace_metadata

After you ensure test is started (seeing something like "worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009") wait for about 1 minute and connect to the server:

mysql -h127.0.0.1 -P13000 -uroot

Then run SHOW PROCESSLIST

You will see something like:

mysql> show processlist;
+----+-----------------+-----------------+------+-------------+------+---------------------------------------------------------------+--------------------------------------------------+
| Id | User            | Host            | db   | Command     | Time | State                                                         | Info                                             |
+----+-----------------+-----------------+------+-------------+------+---------------------------------------------------------------+--------------------------------------------------+
|  4 | event_scheduler | localhost       | NULL | Daemon      |  842 | Waiting on empty queue                                        | NULL                                             |
|  8 | root            | localhost       | test | Sleep       |  841 |                                                               | NULL                                             |
| 10 | root            | localhost:53716 | test | Sleep       |  839 |                                                               | NULL                                             |
| 11 | root            | localhost:53718 | test | Sleep       |  841 |                                                               | NULL                                             |
| 18 | root            | localhost:53752 | NULL | Binlog Dump |  840 | Master has sent all binlog to slave; waiting for more updates | NULL                                             |
| 19 | root            | localhost:53780 | test | Sleep       |  839 |                                                               | NULL                                             |
| 20 | root            | localhost:53782 | test | Sleep       |  839 |                                                               | NULL                                             |
| 21 | root            | localhost       | test | Sleep       |  833 |                                                               | NULL                                             |
| 22 | root            | localhost       | test | Sleep       |  833 |                                                               | NULL                                             |
| 23 | root            | localhost       | test | Query       |  832 | Waiting for tablespace metadata lock                          | ALTER TABLE `test`.`test2` RENAME `test`.`test1` |
| 24 | root            | localhost       | test | Sleep       |  833 |                                                               | NULL                                             |
| 66 | root            | localhost:54236 | NULL | Query       |    0 | starting                                                      | show processlist                                 |
+----+-----------------+-----------------+------+-------------+------+---------------------------------------------------------------+--------------------------------------------------+
12 rows in set (0,00 sec)

Test will hang until mysqld is killed after 900 seconds timeout.

Then run same test with version 5.7. It will pass.

Suggested fix:
Don't hang
[31 Jan 2018 18:45] Sveta Smirnova
Maybe it better fits to "Locking" category
[31 Jan 2018 18:45] Sveta Smirnova
test case for MTR

Attachment: rpl_tablespace_metadata.test (application/octet-stream, text), 3.11 KiB.

[31 Jan 2018 18:53] Sveta Smirnova
Shorter test

Attachment: rpl_tablespace_metadata.test (application/octet-stream, text), 1.30 KiB.

[31 Jan 2018 18:54] Sveta Smirnova
If no slave connected no hang happens.
[31 Jan 2018 18:57] Sveta Smirnova
Repeatable with version 8.0.3, so this is not result of the fix of https://bugs.mysql.com/bug.php?id=79820
[31 Jan 2018 18:58] MySQL Verification Team
Hello Sveta,

Thank you for the report and test case.
With the provided test case(1st version) on 5.7 no issues observed but with same test case on 8.0.4 observing as reported.

Thanks,
Umesh
[2 Feb 2018 15:43] Dyre Tjeldvoll
Posted by developer:
 
Problem seems to be that the TABLESPACE MDLs set by LOCK TABLES are not transferred to the new implicit tablespace name in ALTER TABLE ... RENAME. Only the TABLE MDL is transferred to the new name, and this block another rename to the old name since this would try to get MDL on the old tablespace name.

Even simpler repro:

create table test1(
   a int not null,
   b char(2) not null,
   primary key(a, b)
) ENGINE=INNODB;

create table test2(
   a int not null,
   b char(2) not null,
   primary key(a, b)
) ENGINE=INNODB;

LOCK TABLES `test`.`test1` WRITE;
ALTER TABLE `test`.`test1` RENAME `test`.`test1_tmp_16810`;
# This will transfer the Table MDL set by LOCK TABLES to the new table, but
# the Tablespace MDL set on the implicit tablespace test/test1 is not transferred

--echo ### Locks held after ALTER TABLE test.test1 RENAME ###
SELECT * FROM performance_schema.metadata_locks
WHERE OBJECT_TYPE = 'table' AND OBJECT_SCHEMA = 'test' OR OBJECT_TYPE = 'tablespace';

# This will list the TABLE MDL on test.test1_tmp_16810 and the TABLESPACE MDL on test/test1

UNLOCK TABLES;
--echo ### Locks held after UNLOCK TABLES ###
SELECT * FROM performance_schema.metadata_locks
WHERE OBJECT_TYPE = 'table' AND OBJECT_SCHEMA = 'test' OR OBJECT_TYPE = 'tablespace';

# This will show that both MDLs are gone.

The challenge here is that ALTER TABLE ... RENAME TO ... needs to behave differently if the table resides in an implicit tablespace, since this is renamed along with the table. And the current data model for the DD does not make this information available to the server. It is only stored as part of se_private_data which the server layer should not look at.
[6 Feb 2018 8:17] Dyre Tjeldvoll
Posted by developer:
 
I believe this is a regression from when the naming of implicit tablespaces was changed from "innodb_file_per_table.<space_id>" to "<schemaname>/<tablename>". Previously the name would not change when the table was renamed, whereas with the new scheme a rename of the table (in an implicit tablespace) also implies a rename of the tablespace. But the handling of MDL set as part of LOCK TABLES was not changed, so the MDL set on the new tablespace name is incorrectly released when rename commits, and the the MDL on the old name is kept until UNLOCK TABLES is called.

Fix is probably to track the mdl request for the tablespace (as is done for the table mdl), check if the name changed as part of rename, and if so change the duration of the lock on the new name, and release the lock on the old name.
[20 Mar 2018 21:23] Paul DuBois
Posted by developer:
 
Fixed in 8.0.11.

ALTER TABLE could hang in a "Waiting for tablespace metadata lock"
state.