Bug #98421 Failed ALTER in custom SE corrupted data dictionary
Submitted: 28 Jan 2020 20:08 Modified: 4 Feb 2020 12:49
Reporter: Justin Swanhart Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:8.0.16 OS:Any
Assigned to: CPU Architecture:Any

[28 Jan 2020 20:08] Justin Swanhart
Description:
I am working on a new storage engine for MySQL 8.  An ALTER TABLE failed during the rename step and now I can't drop the database:

mysql> drop database test;
ERROR 3604 (HY000): Storage engine can't drop table 'test.#sql2-1ae4-9'
mysql> use test;
Database changed
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| customer       |
| dim_date       |
| lineorder      |
| part           |
| supplier       |
| t1             |
| t10            |
| t11            |
| t2             |
| t3             |
| t4             |
| t5             |
| t6             |
| t9             |
+----------------+
14 rows in set (0.02 sec)

How to repeat:
Not easy to do.  I can provide SE code if necessary.  Must succeed in ALTER table but RENAME fails, which itself is a bug.  I would think DD would fail because the ALTER returns error, but it doesn't fail.  This leaves an orphaned temporary table in the DD.

Now the ALTER on another table fails:
mysql> alter table t11 add column newcol text;

ERROR 1062 (23000): Duplicate entry '5-#sql-4b34_8' for key 'schema_id'
mysql> show create table t11\G
*************************** 1. row ***************************
       Table: t11
Create Table: CREATE TABLE `t11` (
  `c1` decimal(10,5) DEFAULT NULL
) ENGINE=MENSA DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

Suggested fix:
Unknown
[29 Jan 2020 8:58] Justin Swanhart
Note sure if this is related, but I get an assertion failure when I start mysql.  If not related, I will open a separate bug.

2020-01-29T08:55:41.651024Z 1 [ERROR] [MY-013183] [InnoDB] Assertion failure: fsp0fsp.cc:2095:inode thread 140737099269888
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.

Thread 2 "mysqld" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffe8cf0700 (LWP 16284)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffffc150801 in __GI_abort () at abort.c:79
#2  0x000000000c743263 in ut_dbg_assertion_failed (expr=0xe15d3f6 "inode", file=0xe15c0c8 "/home/greenlion/linux/ibis-master/storage/innobase/fsp/fsp0fsp.cc", line=2095)
    at /home/greenlion/linux/ibis-master/storage/innobase/ut/ut0dbg.cc:92
#3  0x000000000c92171e in fseg_inode_get (header=0x7fffd402c03c "\377\377\377", <incomplete sequence \357>, space=4294967279, page_size=..., mtr=0x7fffe8cedaf0, block=0x7fffe8ced988)
    at /home/greenlion/linux/ibis-master/storage/innobase/fsp/fsp0fsp.cc:2095
#4  0x000000000c9248f5 in fseg_alloc_free_page_general (seg_header=0x7fffd402c03c "\377\377\377", <incomplete sequence \357>, hint=273, direction=111 'o', has_done_reservation=1, mtr=0x7fffe8cedaf0, 
    init_mtr=0x7fffe8cedaf0) at /home/greenlion/linux/ibis-master/storage/innobase/fsp/fsp0fsp.cc:2929
#5  0x000000000c73e289 in trx_undo_add_page (trx=0x7fffe80b0470, undo=0x7fffe480f738, undo_ptr=0x7fffe80b07b0, mtr=0x7fffe8cedaf0)
    at /home/greenlion/linux/ibis-master/storage/innobase/trx/trx0undo.cc:761
#6  0x000000000c711dea in trx_undo_report_row_operation (flags=2, op_type=2, thr=0x7fffe4b54210, index=0x7fffe47d5ee8, clust_entry=0x0, update=0x7fffe4b53b70, cmpl_info=1, rec=0x7fffd40249d6 "", 
    offsets=0x7fffe8cee720, roll_ptr=0x7fffe8cee0f0) at /home/greenlion/linux/ibis-master/storage/innobase/trx/trx0rec.cc:2225
---Type <return> to continue, or q <return> to quit---
#7  0x000000000c7b63dd in btr_cur_upd_lock_and_undo (flags=2, cursor=0x7fffe4740978, offsets=0x7fffe8cee720, update=0x7fffe4b53b70, cmpl_info=1, thr=0x7fffe4b54210, mtr=0x7fffe8ceea40, 
    roll_ptr=0x7fffe8cee0f0) at /home/greenlion/linux/ibis-master/storage/innobase/btr/btr0cur.cc:3149
#8  0x000000000c7b710f in btr_cur_update_in_place (flags=2, cursor=0x7fffe4740978, offsets=0x7fffe8cee720, update=0x7fffe4b53b70, cmpl_info=1, thr=0x7fffe4b54210, trx_id=28673, mtr=0x7fffe8ceea40)
    at /home/greenlion/linux/ibis-master/storage/innobase/btr/btr0cur.cc:3431
#9  0x000000000c7b7ac0 in btr_cur_optimistic_update (flags=2, cursor=0x7fffe4740978, offsets=0x7fffe8cee600, heap=0x7fffe8cee6d8, update=0x7fffe4b53b70, cmpl_info=1, thr=0x7fffe4b54210, trx_id=28673, 
    mtr=0x7fffe8ceea40) at /home/greenlion/linux/ibis-master/storage/innobase/btr/btr0cur.cc:3584
#10 0x000000000c692801 in row_upd_clust_rec (flags=0, node=0x7fffe4b53a38, index=0x7fffe47d5ee8, offsets=0x7fffe8cee720, offsets_heap=0x7fffe8cee6d8, thr=0x7fffe4b54210, mtr=0x7fffe8ceea40)
    at /home/greenlion/linux/ibis-master/storage/innobase/row/row0upd.cc:2532
#11 0x000000000c693309 in row_upd_clust_step (node=0x7fffe4b53a38, thr=0x7fffe4b54210) at /home/greenlion/linux/ibis-master/storage/innobase/row/row0upd.cc:2791
#12 0x000000000c6937bc in row_upd (node=0x7fffe4b53a38, thr=0x7fffe4b54210) at /home/greenlion/linux/ibis-master/storage/innobase/row/row0upd.cc:2878
#13 0x000000000c693cc8 in row_upd_step (thr=0x7fffe4b54210) at /home/greenlion/linux/ibis-master/storage/innobase/row/row0upd.cc:3008
#14 0x000000000c61dfa1 in row_update_for_mysql_using_upd_graph (mysql_rec=0x7fffe4b51e00 "\001\267", prebuilt=0x7fffe4b52fb8) at /home/greenlion/linux/ibis-master/storage/innobase/row/row0mysql.cc:2307
#15 0x000000000c61e35d in row_update_for_mysql (mysql_rec=0x7fffe4b51e00 "\001\267", prebuilt=0x7fffe4b52fb8) at /home/greenlion/linux/ibis-master/storage/innobase/row/row0mysql.cc:2395
#16 0x000000000c427164 in ha_innobase::update_row (this=0x7fffe4b50658, old_row=0x7fffe4b51e00 "\001\267", new_row=0x7fffe4b51d18 "\001\267")
    at /home/greenlion/linux/ibis-master/storage/innobase/handler/ha_innodb.cc:8952
#17 0x000000000b0f3d07 in handler::ha_update_row (this=0x7fffe4b50658, old_data=0x7fffe4b51e00 "\001\267", new_data=0x7fffe4b51d18 "\001\267") at /home/greenlion/linux/ibis-master/sql/handler.cc:7696
#18 0x000000000c153496 in dd::Raw_record::update (this=0x7fffe4658510) at /home/greenlion/linux/ibis-master/sql/dd/impl/raw/raw_record.cc:63
#19 0x000000000c20b0cc in dd::Weak_object_impl::store (this=0x7fffe47d5500, otx=0x7fffe8cef4e0) at /home/greenlion/linux/ibis-master/sql/dd/impl/types/weak_object_impl.cc:108
#20 0x000000000c13ebcc in dd::cache::Storage_adapter::store<dd::Collation> (thd=0xf691470, object=0x7fffe47d5538) at /home/greenlion/linux/ibis-master/sql/dd/impl/cache/storage_adapter.cc:315
#21 0x000000000c1899cc in dd::tables::Collations::populate (this=0x7fffe44ae7d0, thd=0xf691470) at /home/greenlion/linux/ibis-master/sql/dd/impl/tables/collations.cc:152
#22 0x000000000c024733 in (anonymous namespace)::repopulate_charsets_and_collations (thd=0xf691470) at /home/greenlion/linux/ibis-master/sql/dd/impl/bootstrap/bootstrapper.cc:644
#23 0x000000000c025dda in dd::bootstrap::restart (thd=0xf691470) at /home/greenlion/linux/ibis-master/sql/dd/impl/bootstrap/bootstrapper.cc:935
#24 0x000000000c22d27b in dd::upgrade_57::restart_dictionary (thd=0xf691470) at /home/greenlion/linux/ibis-master/sql/dd/upgrade_57/upgrade.cc:794
#25 0x000000000c22dbbf in dd::upgrade_57::do_pre_checks_and_initialize_dd (thd=0xf691470) at /home/greenlion/linux/ibis-master/sql/dd/upgrade_57/upgrade.cc:950
#26 0x000000000b06ce74 in bootstrap::handle_bootstrap (arg=0xf691470) at /home/greenlion/linux/ibis-master/sql/bootstrap.cc:347
#27 0x000000000ca6831c in pfs_spawn_thread (arg=0xf5a2620) at /home/greenlion/linux/ibis-master/storage/perfschema/pfs.cc:2836
#28 0x00007fffff1e76db in start_thread (arg=0x7fffe8cf0700) at pthread_create.c:463
#29 0x00007ffffc23188f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
[29 Jan 2020 13:18] MySQL Verification Team
Hi Justin,

How are you ???

First of all, regarding the assertion, you could be running out of disk space or out of inodes. It is hard to say, but you should try to analyse the log or debug InnoDB SE at the point when that inode is allocated.

Next, you are developing a new storage engine, while you report an assertion in InnoDB SE. I truly do not see any connection between these two.

Next, if you are offering to send us the source of your SE, then it would practically mean that you are asking for a free help to develop your storage engine. I will contact the relevant people of whether we are supposed to do it or not.

Next, you think that data dictionary is corrupted. Can you show the proof of that and how have you managed to accomplish that ???

Last, but not least, you are using 8.0.16, while current release is 8.0.19 and there are many bugs that have been fixed interim.
[29 Jan 2020 14:10] Ståle Deraas
Hi Justin,

Does your SE support atomic DDL? If so, to us it might indicate that you have a bug in your SE. 
If your SE is not supporting atomic DDL, such a situation might potentially happen. To clean up the DD, you can try "DROP table IF EXISTS".

Hard to tell if the failure in InnoDB is related to the above.
[29 Jan 2020 18:58] Justin Swanhart
The SE doesn't explicitly support atomic DDL.  Is there documentation on how that works?  I'm using ha_tina.cc as the canonical example right now as I don't yet support anything fancy like transactions and it doesn't seem to do anything special regarding DDL.

I'm not asking for help developing my SE.  My SE naturally is in development (and it is my first SE so I'm learning a lot) and so it has bugs which cause failures in MySQL that are unexpected.  The InnoDB related messages are DD related and happened after a crash (so crash recovery of the DD failed).  While these failures are "sort of" expected since I'm developing, random failures happen in the real world (due to hardware, bugs in kernel, etc) and so I'd like to track down any corruptions to the dictionary or to innodb itself during an unexpected crash.  Thus the bug report.

ALso, not being able to ALTER a completely different table (duplicate schema id) seems to indicate DD corruption.  I'm assuming the DD gets involved without explicit intervention at the layer above the SE interface.
[30 Jan 2020 12:46] MySQL Verification Team
Hi Justin,

Please clear out something for us.

If I understand you correctly, your bug report boils down to being able to track down any corruptions to the dictionary or to innodb itself during an unexpected crash.

Am I correct ????
[30 Jan 2020 12:58] Justin Swanhart
Yes.  In this case, my storage engine wasn't able to drop the table during the RENAME phase of an alter because a file hadn't been removed.

When an ALTER fails in my engine because the table cant be dropped (MySQL crashes as a result) the table is ACTUALLY ALTERED and an orphaned table remains in the DD.

I think you can repeat by introducing a crash in ha_foo:delete_table(...)
[30 Jan 2020 20:46] Justin Swanhart
The same thing is true of ha_foo::create -- if I crash the database during the ha_create() call, then the table is still created in the data dictionary when MySQL restarts.
[30 Jan 2020 20:48] Justin Swanhart
mysql> create table t1(c1 int) engine=mensa;
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql> create table t1(c1 int) engine=mensa;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    8
Current database: test

ERROR 1050 (42S01): Table 't1' already exists
[3 Feb 2020 14:20] Dmitry Lenev
Hello Justin!

What you observe above in case of CREATE TABLE crash is expected behavior for SEs which don't support atomic DDL. Behavior for such SEs in case of crashes during DDL has not improved much as compared to 5.7/.FRM-based data-dictionary.
The workaround for the problem is to do DROP TABLE IF EXISTS on half-created table.

Regarding problem with ALTER TABLE, we really need to get more info to decide if behavior is a bug or not. When you say that there is a failure during rename step of ALTER TABLE, do you mean that there is a crash? Or just error is emitted? Perhaps you can point to exact step at which failure happens?
Could you please clarify?
[3 Feb 2020 17:47] Justin Swanhart
I think the ALTER failure would be hard to reproduce and you can close this bug.
[4 Feb 2020 12:49] MySQL Verification Team
Thank you, Justin !!!!!!