Bug #96412 | Mess usages of latch meta data for InnoDB latches (mutex and rw_lock) | ||
---|---|---|---|
Submitted: | 2 Aug 2019 12:06 | Modified: | 5 Aug 2019 12:46 |
Reporter: | Fungo Wang (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S6 (Debug Builds) |
Version: | 8.0.17 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | latch, latch meta, mutex, rw_lock |
[2 Aug 2019 12:06]
Fungo Wang
[2 Aug 2019 18:24]
MySQL Verification Team
Hi Mr. Wang, Thank you for your bug report. We do require some clarifications. How do you know that LATCH_ID_INDEX_ONLINE_LOG is for a mutex ONLY ??? How do you know that dict_operation_lock can be used for mutex only ??? How do you know that SYNC_INDEX_TREE can not be used for different sync levels ???? How do you know that table->stats_latch and table->stats_latch can not use LATCH_ID_INDEX_TREE ??? Last, but not least, seems that S6 would be a correct severity here, since you are using some debug-only commands.
[4 Aug 2019 8:25]
Fungo Wang
Hi Sinisa, Let me explain it more clearly (I should have put all these info in my initial bug report). The current usage of latch_id and sync level will not cause crash, but I think it will make trouble for mysqld diagnose under some circumstances. latch_meta is used to maintain a relation for latch_id, latch_level, pfs_key and latch_counter, each element in latch_meta vector contains: (latch_id, latch_name, latch_level, latch_level_name, pfs_key, latch_counter) InnoDB Mutex and RW lock are implemented differently, and for the latch_meta vector initialization (i.e.. sync_latch_meta_init()), two different macros are used to add mutex and RW lock. 1. LATCH_ADD_MUTEX() 2. LATCH_ADD_RWLOCK() The defination of these two macros are controlled by #ifdef UNIV_PFS_MUTEX and #ifdef UNIV_PFS_RWLOCK, which decide whether mutex or rw lock is instrumented in P_S. > How do you know that LATCH_ID_INDEX_ONLINE_LOG is for a mutex ONLY ??? 1443 LATCH_ADD_RWLOCK(INDEX_ONLINE_LOG, SYNC_INDEX_ONLINE_LOG, 1444 index_online_log_key); LATCH_ID_INDEX_ONLINE_LOG is initialized through RW_LOCK, so if you want to use create a mutex, it's not a good idea to use a rw_lock latch_id. If you check the code to initialzed a mutex, you can see the latch_id is used fetch the corresponding PFS key. So a pfs_key for rw_lock will be used for a mutex, which will lead confusing if the user check P_S table. And think in such situation: the rw_lock P_S instrumentation is disabled (PFS_NOT_INSTRUMENTED), while the mutex P_S instrumentation is not, if a mutex initialized with a rw_lock latch id, you will not see info for such mutex in P_S table. 815 @param[in] id Mutex ID 816 @param[in] filename file where created 817 @param[in] line line number in file where created */ 818 void init(latch_id_t id, const char *filename, uint32_t line) UNIV_NOTHROW { 819#ifdef UNIV_PFS_MUTEX 820 pfs_add(sync_latch_get_pfs_key(id)); 821#endif /* UNIV_PFS_MUTEX */ 822 823 m_impl.init(id, filename, line); 824 } > How do you know that dict_operation_lock can be used for mutex only ??? I guess you mean "How do you know that SYNC_DICT_OPERATION (a sync level) can be used for mutex only ???" Different kind of mutex or rw lock used different sync level, thus the debug code can detect sync_level confliction. 1374 LATCH_ADD_MUTEX(SRV_DICT_TMPFILE, SYNC_DICT_OPERATION, 1375 srv_dict_tmpfile_mutex_key); 1464 LATCH_ADD_RWLOCK(DICT_OPERATION, SYNC_DICT, dict_operation_lock_key); Through the above two initialiaztion, I assume that SYNC_DICT_OPERATION is for LATCH_ID_SRV_DICT_TMPFILE mutex, and SYNC_DICT is used for LATCH_ID_DICT_OPERATION RW lock. So I guess dict_operation_lock rw_lock should use sync level SYNC_DICT, instead of SYNC_DICT_OPERATION. 1034 rw_lock_create(dict_operation_lock_key, dict_operation_lock, 1035 SYNC_DICT_OPERATION); > How do you know that SYNC_INDEX_TREE can not be used for different sync levels ???? As SYNC_INDEX_TREE is a sync level, I guess you mean "How do you know that SYNC_INDEX_TREE can not be used for different rw lock ????" 1486 LATCH_ADD_RWLOCK(INDEX_TREE, SYNC_INDEX_TREE, index_tree_rw_lock_key); 1487 1488 LATCH_ADD_RWLOCK(DICT_TABLE_STATS, SYNC_INDEX_TREE, dict_table_stats_key); If you check the rw_lock_create_func() function: 241 lock->m_id = sync_latch_get_id(sync_latch_get_name(level)); This will cause both table->stats_latch and new_index->lock initialized latch id (lock->m_id) with LATCH_ID_INDEX_TREE, even if new_index->lock should be new_index->lock->m_id should be LATCH_ID_DICT_TABLE_STATS. 323: rw_lock_create(dict_table_stats_key, table->stats_latch, SYNC_INDEX_TREE); 2447: rw_lock_create(index_tree_rw_lock_key, &new_index->lock, SYNC_INDEX_TREE); This will not cause too much trouble, but will cause LatchDebug::print_latches and LatchDebug::crash print wrong info if invoked: 537 ib::error(ER_IB_MSG_1163) 538#endif /* UNIV_NO_ERR_MSGS */ 539 << "Thread " << os_thread_get_curr_id() << " already owns a latch " 540 << sync_latch_get_name(latch->m_id) << " at level" 541 << " " << latched->m_level << " (" << latch_level_name 542 << " ), which is at a lower/same level than the" 543 << " requested latch: " << level << " (" << in_level_name << "). " 544 << latch->to_string(); The sync_latch_get_name(latch->m_id) will give unexpected name ("INDEX_TREE") for all table->stats_latch (which should be "DICT_TABLE_STATS") . Hope the above could also answer the 4th question. > How do you know that table->stats_latch and table->stats_latch can not use LATCH_ID_INDEX_TREE ??? Thanks.
[4 Aug 2019 8:30]
Fungo Wang
A correction of previous comment: This will cause both table->stats_latch and new_index->lock initialized latch id (lock->m_id) with LATCH_ID_INDEX_TREE, even if new_index->lock should be new_index->lock->m_id should be LATCH_ID_DICT_TABLE_STATS. ==== should be ===> This will cause both table->stats_latch and new_index->lock initialized latch id (lock->m_id) with LATCH_ID_INDEX_TREE, even if table->stats_latch->m_id should be LATCH_ID_DICT_TABLE_STATS.
[5 Aug 2019 12:46]
MySQL Verification Team
Thank you Mr. Wang, I have analysed all the info provided by you and I agree with your conclusion. I am changing only the severity level. Verified.