Bug #100428 assertion at database shutdown
Submitted: 4 Aug 2020 17:14
Modified: 21 Sep 2020 16:24
Reporter: Justin Swanhart
Status: Can't repeat
Category: MySQL Server: Data Dictionary
Severity: S2 (Serious)
Version: 8.0.21
OS: Any (Fedora 32)
Assigned to:
CPU Architecture: Any

[4 Aug 2020 17:14] Justin Swanhart
Description:
I encountered an error at MySQL shutdown:
2020-08-04T17:01:12.363009Z 0 [System] [MY-013172] [Server] Received SHUTDOWN from user <via user signal>. Shutting down mysqld (Version: 8.0.21-debug).
2020-08-04T17:01:14.800934Z 0 [Warning] [MY-010909] [Server] /home/greenlion/warp/build_debug/runtime_output_directory/mysqld: Forcing close of thread 11  user: 'root'.
mysqld: /home/greenlion/warp/sql/dd/impl/cache/shared_multi_map.cc:96: void dd::cache::Shared_multi_map<T>::remove(dd::cache::Cache_element<T>*, dd::cache::Shared_multi_map<T>::Autolocker*) [with T = dd::Schema]: Assertion `(!id_key || m_map<typename T::Id_key>()->is_present(*id_key)) && (!name_key || m_map<typename T::Name_key>()->is_present(*name_key)) && (!aux_key || m_map<typename T::Aux_key>()->is_present(*aux_key))' failed.

I am developing a storage engine, and I don't know whether this bug is caused by my engine doing something wrong. The engine doesn't support transactional DDL, so I have not touched any DD structures. There was one open connection, but no open transaction.

I am running a debug build (-DWITH_DEBUG=1).

How to repeat:
Unknown. I am not familiar with the DD, so I can't determine whether I am doing something wrong or whether this is a DD bug.
[5 Aug 2020 12:46] MySQL Verification Team
Hi Justin,

Thank you for your bug report.

Congratulations on developing a new storage engine!

It is possible, but highly unlikely, that this is a bug in our Data Dictionary or our SE Plugin API. Can you check whether you followed all the instructions regarding that plugin?

It is also irrelevant that your SE does not support transactional DDL.

Are you getting that problem ONLY at shutdown? In that case, we would advise you to look deeper into the initialisation stage of your storage engine.

Let us know how it goes!
[5 Aug 2020 12:51] MySQL Verification Team
Hi Justin,

A few more thoughts.

An error like that could occur if you failed to call some method at the start or end of a DDL operation. Also, look more carefully at the part of your SE that deals with opening and closing tables.

That is all for now, until we get more feedback from you.

Good luck!
[5 Aug 2020 20:29] Justin Swanhart
I didn't do any DDL while the server was up. The table was already created before the server was started. I only encountered this assertion once. It isn't something that happens on a regular basis, and I had not made any changes to the SE when I encountered it. What I am after is the root cause of that assertion: what does the database have to do, or not do, in order to trigger it? Unfortunately, because it is an assertion, even if I were to run the server under GDB all the time, I wouldn't have a stack trace unless you know of a way to get stack traces for assertions that I don't know about.
[6 Aug 2020 7:46] Sivert Sørumgård
Hello Justin, 

The shared DD cache is partitioned on object types, and the error log indicates that this happens in the schema partition (T = dd::Schema). Each partition has four maps hashed on different keys. The assert essentially makes sure that for each key that is non-null, there is an entry in the corresponding map. The DD objects are wrapped in Cache_element<T> objects, and the keys are also stored in that wrapper.
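
To make the invariant concrete, here is a minimal sketch of it, not the actual MySQL code. Element and Partition are hypothetical stand-ins for Cache_element<T> and one cache partition, and only the three key maps named in the assertion are modelled.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for dd::cache::Cache_element<T>: the wrapper stores
// whichever keys the object has; a missing key is modelled as std::nullopt.
struct Element {
  std::optional<unsigned long long> id_key;
  std::optional<std::string> name_key;
  std::optional<std::string> aux_key;
};

// Hypothetical stand-in for one cache partition (assumption: simplified to
// the three key maps that appear in the assertion text).
struct Partition {
  std::map<unsigned long long, const Element *> by_id;
  std::map<std::string, const Element *> by_name;
  std::map<std::string, const Element *> by_aux;

  // Register the element under every key it carries.
  void add(const Element &e) {
    if (e.id_key) by_id[*e.id_key] = &e;
    if (e.name_key) by_name[*e.name_key] = &e;
    if (e.aux_key) by_aux[*e.aux_key] = &e;
  }

  // Same shape as the failing assertion: every non-null key must have an
  // entry in the corresponding map.
  bool keys_present(const Element &e) const {
    return (!e.id_key || by_id.count(*e.id_key) > 0) &&
           (!e.name_key || by_name.count(*e.name_key) > 0) &&
           (!e.aux_key || by_aux.count(*e.aux_key) > 0);
  }

  // Check the invariant first, then drop the element from every map.
  void remove(const Element &e) {
    assert(keys_present(e));
    if (e.id_key) by_id.erase(*e.id_key);
    if (e.name_key) by_name.erase(*e.name_key);
    if (e.aux_key) by_aux.erase(*e.aux_key);
  }
};
```

In this model the assertion can only fail if an element carries a key that was never added to, or has already been erased from, the matching map.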

Unless you have changed the code for creating or dropping schemas, or the DD cache itself, this might be a problem in the DD cache code, but at the same time, I cannot recall having seen issues like this. There is code for making sure the non-null keys are added to the right maps when objects are inserted into the cache, and locking should ensure that adding and removing objects from the cache is atomic.

Since it happens after shutdown is initiated, it is probably encountered while the free list is removed, in Shared_multi_map::evict_all_unused(), called from Shared_multi_map::shutdown(). One possibility is to make the failing assert conditional and skip it during shutdown. There is an assert later in the cache shutdown code that will fail if there are objects left in the cache, after first dumping the objects that are left. This will provide more information about what the offending object is like.
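
A rough sketch of that idea, under stated assumptions: the shutting_down flag and the leftover() helper are hypothetical names invented for illustration, not members of the real Shared_multi_map, and only a name key is modelled.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified element: only a name key is modelled.
struct Element {
  std::optional<std::string> name_key;
};

struct Partition {
  std::map<std::string, const Element *> by_name;
  bool shutting_down = false;  // assumed flag; not a real MySQL member

  bool keys_present(const Element &e) const {
    return !e.name_key || by_name.count(*e.name_key) > 0;
  }

  void remove(const Element &e) {
    // Enforce the invariant in normal operation, skip it during shutdown.
    assert(shutting_down || keys_present(e));
    if (e.name_key) by_name.erase(*e.name_key);
  }

  // Collect the names still registered, so they can be dumped before a
  // final "cache is empty" check.
  std::vector<std::string> leftover() const {
    std::vector<std::string> names;
    for (const auto &entry : by_name) names.push_back(entry.first);
    return names;
  }
};
```

With the check relaxed, shutdown can proceed far enough for the leftover entries to be dumped, which should say more about the offending object than the bare assertion does.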
[6 Aug 2020 12:32] MySQL Verification Team
Thank you, Sivert.

Just changing the status to the correct one.
[7 Sep 2020 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".