Bug #102715 InnoDB Discard tablespace is not crash safe
Submitted: 24 Feb 2021 3:20 Modified: 9 Mar 2021 2:57
Reporter: Shaohua Wang (OCA)
Status: Can't repeat
Category: MySQL Server: InnoDB storage engine
Severity: S3 (Non-critical)
Version: 8.0
OS: Any
Assigned to:
CPU Architecture: Any

[24 Feb 2021 3:20] Shaohua Wang
Description:
When discarding a tablespace, we call row_mysql_table_id_reassign().
 /* Assign a new space ID to the table definition so that purge
  can ignore the changes. Update the system table on disk. */

But we don't update the system table in 8.0, although we do in 5.7. So purge will still work on those changes, which may cause a crash!

We call row_import_update_discarded_flag() in 5.7, but we don't do this in 8.0. If the server crashes after discarding the tablespace (e.g. while exporting a tablespace), it may crash when accessing the table.

How to repeat:
Code inspection.
[25 Feb 2021 13:01] MySQL Verification Team
Hi Mr. Wang,

To help us act on this, please show us steps to reproduce with the latest MySQL 8.0.23.  Thanks for testing MySQL.
[26 Feb 2021 4:34] Kevin Lewis
The claim that "we don't update the system table in 8.0" is not true. The update is, of course, not done in row_import_update_discarded_flag(), since that 5.7 routine updated the old InnoDB system tables.

The new global DD is updated by:

handler::ha_discard_or_import_tablespace() -> 
ha_innobase::discard_or_import_tablespace() ->
dd_table_discard_tablespace() ->
dd_tablespace_set_state()
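
To illustrate why it matters that the state change reaches the persistent dictionary, here is a minimal toy sketch. It is not the real data dictionary API: ToyDictionary, ToyState, and every method name below are hypothetical and exist only for this example. It assumes a restart rebuilds the in-memory view purely from what was persisted.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: hypothetical names, not the real data dictionary API.
enum class ToyState { kNormal, kDiscarded };

class ToyDictionary {
 public:
  // Register a table in both the persisted store and the in-memory cache.
  void create(const std::string &name) {
    persisted_[name] = ToyState::kNormal;
    cache_[name] = ToyState::kNormal;
  }

  // Update only the in-memory copy; this change is lost on restart.
  void set_state_in_memory(const std::string &name, ToyState s) {
    cache_[name] = s;
  }

  // Update the persisted copy as well; this change survives a restart.
  void set_state_persistent(const std::string &name, ToyState s) {
    persisted_[name] = s;
    cache_[name] = s;
  }

  // Simulate crash and restart: the cache is rebuilt from persisted state.
  void crash_and_restart() { cache_ = persisted_; }

  // State as seen by the running server (throws if the table is unknown).
  ToyState state(const std::string &name) const { return cache_.at(name); }

 private:
  std::map<std::string, ToyState> persisted_;
  std::map<std::string, ToyState> cache_;
};
```

In this toy model, a table marked discarded through set_state_persistent() is still seen as discarded after crash_and_restart(), while one marked only through set_state_in_memory() reverts to normal. The call chain above corresponds to the persistent case.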

The purge system cannot crash on buffer pages from this old discarded tablespace.  Before 8.0.23, the pages were flushed from the buffer pool in the call to:

handler::ha_discard_or_import_tablespace() -> 
ha_innobase::discard_or_import_tablespace() ->
row_discard_tablespace_for_mysql() ->
row_discard_tablespace() ->
fil_discard_tablespace() ->
fil_delete_tablespace() ->
Fil_shard::space_delete()

After 8.0.23, the routine Fil_shard::space_delete() will call fil_space_t::set_deleted() which bumps an internal version number for that tablespace and leaves those pages in cache.  Then those buffers are lazily reused as they get old in the LRU list.  The purge thread sees that the version on those pages is older than the current version and ignores them.  This work was done in WL14100.
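
The version check described above can be sketched with a small toy model. This is not InnoDB code: ToySpace, ToyPage, ToyBufferPool, and toy_walkthrough() are hypothetical names, and the real structures and LRU handling are far more involved. The sketch only assumes that each cached page remembers the version of its tablespace at the time it was cached.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy model only: these types are hypothetical, not InnoDB's real structures.
struct ToySpace {
  std::uint32_t id;
  std::uint32_t version;  // bumped when the tablespace is discarded
};

struct ToyPage {
  std::uint32_t space_id;
  std::uint32_t space_version;  // version of the space when the page was cached
};

class ToyBufferPool {
 public:
  void add_space(std::uint32_t id) { spaces_[id] = ToySpace{id, 0}; }

  // Cache a page stamped with the current version of its space.
  void cache_page(std::uint32_t space_id) {
    pages_.push_back(ToyPage{space_id, spaces_.at(space_id).version});
  }

  // Discard: bump the version and leave the cached pages where they are.
  void discard_space(std::uint32_t space_id) { ++spaces_.at(space_id).version; }

  // A page is stale when its stamp is older than its space's current version.
  bool is_stale(const ToyPage &page) const {
    return page.space_version < spaces_.at(page.space_id).version;
  }

  // What a purge-like scan would look at: only pages that are not stale.
  std::size_t count_live_pages() const {
    std::size_t live = 0;
    for (const ToyPage &page : pages_) {
      if (!is_stale(page)) ++live;
    }
    return live;
  }

  std::size_t cached_pages() const { return pages_.size(); }

 private:
  std::unordered_map<std::uint32_t, ToySpace> spaces_;
  std::vector<ToyPage> pages_;
};

// Walk-through: two pages of space 7 and one of space 8, then discard space 7.
inline std::size_t toy_walkthrough() {
  ToyBufferPool pool;
  pool.add_space(7);
  pool.add_space(8);
  pool.cache_page(7);
  pool.cache_page(7);
  pool.cache_page(8);
  pool.discard_space(7);
  assert(pool.cached_pages() == 3);  // nothing was evicted eagerly
  return pool.count_live_pages();    // only the page of space 8 remains live
}
```

In this sketch the discard is a single version bump regardless of how many pages are cached, and the stale pages would be dropped later by whatever reclaims old buffers.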

So DISCARD TABLESPACE should be crash safe by design in 8.0.  But we would be very interested in any test case that actually does cause a crash.
[9 Mar 2021 2:57] Shaohua Wang
Thank you for your detailed explanation, Kevin!

I've tried to write a test case to reproduce the purge issue, but failed. There is another function, dd_table_discard_tablespace(), that updates the table id and stores it in the DD (see thd->dd_client()->update(table_def) in Sql_cmd_discard_import_tablespace::mysql_discard_or_import_tablespace()).

Would you please check another export/import bug?
https://bugs.mysql.com/bug.php?id=102716
[9 Mar 2021 13:30] MySQL Verification Team
Hi Mr. Wang,

The other report is already filed in our internal bug system with all the details required.