| Bug #96412 | Mess usages of latch meta data for InnoDB latches (mutex and rw_lock) | ||
|---|---|---|---|
| Submitted: | 2 Aug 2019 12:06 | Modified: | 5 Aug 2019 12:46 |
| Reporter: | Fungo Wang (OCA) | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: InnoDB storage engine | Severity: | S6 (Debug Builds) |
| Version: | 8.0.17 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | latch, latch meta, mutex, rw_lock | ||
[2 Aug 2019 12:06]
Fungo Wang
[2 Aug 2019 18:24]
MySQL Verification Team
Hi Mr. Wang, Thank you for your bug report. We do require some clarifications. How do you know that LATCH_ID_INDEX_ONLINE_LOG is for a mutex ONLY ??? How do you know that dict_operation_lock can be used for mutex only ??? How do you know that SYNC_INDEX_TREE can not be used for different sync levels ???? How do you know that table->stats_latch and table->stats_latch can not use LATCH_ID_INDEX_TREE ??? Last, but not least, seems that S6 would be a correct severity here, since you are using some debug-only commands.
[4 Aug 2019 8:25]
Fungo Wang
Hi Sinisa,
Let me explain it more clearly (I should have put all these info in my initial bug report).
The current usage of latch_id and sync level will not cause crash, but I think it will make trouble for mysqld diagnose under some circumstances.
latch_meta is used to maintain a relation for latch_id, latch_level, pfs_key and latch_counter, each element in latch_meta vector contains:
(latch_id, latch_name, latch_level, latch_level_name, pfs_key, latch_counter)
InnoDB Mutex and RW lock are implemented differently, and for the latch_meta vector initialization (i.e.. sync_latch_meta_init()), two different macros are used to add mutex and RW lock.
1. LATCH_ADD_MUTEX()
2. LATCH_ADD_RWLOCK()
The defination of these two macros are controlled by #ifdef UNIV_PFS_MUTEX and #ifdef UNIV_PFS_RWLOCK, which decide whether mutex or rw lock is instrumented in P_S.
> How do you know that LATCH_ID_INDEX_ONLINE_LOG is for a mutex ONLY ???
1443 LATCH_ADD_RWLOCK(INDEX_ONLINE_LOG, SYNC_INDEX_ONLINE_LOG,
1444 index_online_log_key);
LATCH_ID_INDEX_ONLINE_LOG is initialized through RW_LOCK, so if you want to use create a mutex, it's not a good idea to use a rw_lock latch_id.
If you check the code to initialzed a mutex, you can see the latch_id is used fetch the corresponding PFS key.
So a pfs_key for rw_lock will be used for a mutex, which will lead confusing if the user check P_S table.
And think in such situation:
the rw_lock P_S instrumentation is disabled (PFS_NOT_INSTRUMENTED), while the mutex P_S instrumentation is not, if a mutex initialized with a rw_lock latch id, you will not see info for such mutex in P_S table.
815 @param[in] id Mutex ID
816 @param[in] filename file where created
817 @param[in] line line number in file where created */
818 void init(latch_id_t id, const char *filename, uint32_t line) UNIV_NOTHROW {
819#ifdef UNIV_PFS_MUTEX
820 pfs_add(sync_latch_get_pfs_key(id));
821#endif /* UNIV_PFS_MUTEX */
822
823 m_impl.init(id, filename, line);
824 }
> How do you know that dict_operation_lock can be used for mutex only ???
I guess you mean "How do you know that SYNC_DICT_OPERATION (a sync level) can be used for mutex only ???"
Different kind of mutex or rw lock used different sync level, thus the debug code can detect sync_level confliction.
1374 LATCH_ADD_MUTEX(SRV_DICT_TMPFILE, SYNC_DICT_OPERATION,
1375 srv_dict_tmpfile_mutex_key);
1464 LATCH_ADD_RWLOCK(DICT_OPERATION, SYNC_DICT, dict_operation_lock_key);
Through the above two initialiaztion, I assume that
SYNC_DICT_OPERATION is for LATCH_ID_SRV_DICT_TMPFILE mutex, and SYNC_DICT is used for LATCH_ID_DICT_OPERATION RW lock.
So I guess dict_operation_lock rw_lock should use sync level SYNC_DICT, instead of SYNC_DICT_OPERATION.
1034 rw_lock_create(dict_operation_lock_key, dict_operation_lock,
1035 SYNC_DICT_OPERATION);
> How do you know that SYNC_INDEX_TREE can not be used for different sync levels ????
As SYNC_INDEX_TREE is a sync level, I guess you mean "How do you know that SYNC_INDEX_TREE can not be used for different rw lock ????"
1486 LATCH_ADD_RWLOCK(INDEX_TREE, SYNC_INDEX_TREE, index_tree_rw_lock_key);
1487
1488 LATCH_ADD_RWLOCK(DICT_TABLE_STATS, SYNC_INDEX_TREE, dict_table_stats_key);
If you check the rw_lock_create_func() function:
241 lock->m_id = sync_latch_get_id(sync_latch_get_name(level));
This will cause both table->stats_latch and new_index->lock initialized latch id (lock->m_id) with LATCH_ID_INDEX_TREE, even if new_index->lock should be new_index->lock->m_id should be LATCH_ID_DICT_TABLE_STATS.
323: rw_lock_create(dict_table_stats_key, table->stats_latch, SYNC_INDEX_TREE);
2447: rw_lock_create(index_tree_rw_lock_key, &new_index->lock, SYNC_INDEX_TREE);
This will not cause too much trouble, but will cause LatchDebug::print_latches and LatchDebug::crash print wrong info if invoked:
537 ib::error(ER_IB_MSG_1163)
538#endif /* UNIV_NO_ERR_MSGS */
539 << "Thread " << os_thread_get_curr_id() << " already owns a latch "
540 << sync_latch_get_name(latch->m_id) << " at level"
541 << " " << latched->m_level << " (" << latch_level_name
542 << " ), which is at a lower/same level than the"
543 << " requested latch: " << level << " (" << in_level_name << "). "
544 << latch->to_string();
The sync_latch_get_name(latch->m_id) will give unexpected name ("INDEX_TREE") for all table->stats_latch (which should be "DICT_TABLE_STATS") .
Hope the above could also answer the 4th question.
> How do you know that table->stats_latch and table->stats_latch can not use LATCH_ID_INDEX_TREE ???
Thanks.
[4 Aug 2019 8:30]
Fungo Wang
A correction of previous comment: This will cause both table->stats_latch and new_index->lock initialized latch id (lock->m_id) with LATCH_ID_INDEX_TREE, even if new_index->lock should be new_index->lock->m_id should be LATCH_ID_DICT_TABLE_STATS. ==== should be ===> This will cause both table->stats_latch and new_index->lock initialized latch id (lock->m_id) with LATCH_ID_INDEX_TREE, even if table->stats_latch->m_id should be LATCH_ID_DICT_TABLE_STATS.
[5 Aug 2019 12:46]
MySQL Verification Team
Thank you Mr. Wang, I have analysed all the info provided by you and I agree with your conclusion. I am changing only the severity level. Verified.
