Bug #96239 Undo might not cleanup entries of innodb_ddl_log due to exception
Submitted: 17 Jul 2019 20:39 Modified: 27 Aug 2019 12:12
Reporter: tony zhou Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: auto_inc, crash, undo

[17 Jul 2019 20:39] tony zhou
Description:
DDL will insert entries into innodb_ddl_log. 

In case of normal undo/rollback, we might remove the corresponding entries in primary index and secondary index (i.e., thread_id).  See detail in DDL_Log_Table::remove().

Bu if there is an exception in undo/rollback (e.g., raise "ib_create_table_fail_too_many_trx" error in trx_undo_assign_undo). We might only remove the primary index. the entries in the secondary index are not correctly removed. 

If in any case, the auto inc column (i.e., id in innodb_ddl_log) is also messed up (i.e., does not increased as expected), the follow up DDL might insert into the innodb_ddl_log with the exact same key (i.e., thread_id + id). Hence it will invokes row_ins_sec_index_entry_by_modify() instead of btr_cur_optimistic_insert() in row_ins_sec_index_entry_low(). Server will then crash at "ut_a(!cursor->index->is_committed());" because this should never happen.

How to repeat:
create table t1(c1 int primary key, c2 char(1), c3 int unsigned);
SET DEBUG='+d,ib_create_table_fail_too_many_trx';
ALTER TABLE t1 ADD INDEX (c2), ADD INDEX (c3);
// The innodb_ddl_log entries on the secondary index of "thread_id" are not correctly remove. 

reboot the server. 

create table t2(c1 int primary key, c2 char(1), c3 int unsigned);
SET DEBUG='+d,ib_create_table_fail_too_many_trx';
ALTER TABLE t2 ADD INDEX (c2), ADD INDEX (c3);
// If auto inc messed up (i.e., the same id is used), then server will crash here with stack trace as follow.
"Assertion failure: row0ins.cc:267:!cursor->index->is_committed()"

Suggested fix:
One potential solution might be remove the secondary index entries in innodb_ddl_log at server boot up time (i.e., remove at shutdown time might not work because server might crash and it will not invoke the normal shutdown code path).
[26 Jul 2019 12:12] MySQL Verification Team
Hello Mr. Zhou,

Thank you for your bug report.

We do need some more answers and explanations from you, before we proceed .......

First of all, what do you mean "reboot the server" ??? Re-start MySQL server or reboot the OS. The OS that I use is rebooted by properly closing everything. I do not intend to turn the computer off, since it might have bad consequences for my machine.

Next, when you write that primary key is correctly removed and secondary index has some entries left, please let us know the procedure on how can we determine that .....

Next, how can we determine that auto-increment index is messed up ????

Thanks in advance.
[27 Aug 2019 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".