Bug #62487 innodb takes 3 minutes to clean up the adaptive hash index at shutdown
Submitted: 21 Sep 2011 6:09 Modified: 5 Jan 2012 18:46
Reporter: Mark Callaghan
Status: Closed
Category: MySQL Server: InnoDB Plugin storage engine  Severity: S5 (Performance)
Version: 5.1.52  OS: Any
Assigned to: Marko Mäkelä  CPU Architecture: Any
Tags: adaptive, hash, INDEX, innodb, shutdown, slow

[21 Sep 2011 6:09] Mark Callaghan
Description:
I augmented the server to print timestamps during shutdown. With a 60G InnoDB buffer pool and 3M+ pages (~240M rows) in the adaptive hash index, shutdown takes 3 minutes just to remove entries from the adaptive hash index. Those 3 minutes wasted here are 3 extra minutes of downtime on every mysqld restart or RPM upgrade.

From the augmented mysqld error log output

110920 22:56:23 [Note] Event Scheduler: Purging the queue. 0 events
110920 22:56:25  InnoDB: Starting shutdown...
110920 22:56:25  InnoDB: shutdown waiting for monitor threads
110920 22:56:28  InnoDB: Shutdown done
110920 22:56:29InnoDB: begin buf_pool_drop_hash_index
110920 22:59:29InnoDB: end buf_pool_drop_hash_index
110920 22:59:34  InnoDB: Shutdown completed; log sequence number 1856874472260
110920 22:59:34 [Note] /data/5152trxlimit/libexec/mysqld: Shutdown complete

mysqld uses all of 1 CPU core when this occurs. The common call stack is:
ha_remove_all_nodes_to_page
btr_search_drop_page_hash_index_low
buf_pool_drop_hash_index
btr_search_disable
innobase_shutdown_for_mysql
innobase_end
ha_finalize_handlerton
plugin_deinitialize
reap_plugins
plugin_shutdown
clean_up
unireg_end
kill_server_thread
start_thread
clone

How to repeat:
1) Run InnoDB with a large buffer pool (60G+)
2) Get 3M pages and 240M rows into the adaptive hash index
3) Shut down the server

In my test, the workload was read-only. I used multi-table sysbench.

Suggested fix:
Make it faster. Why isn't it sufficient to deallocate the memory? buf_pool_drop_hash_index iterates over every page in the buffer pool...

buf_pool_drop_hash_index(void)
/*==========================*/
{
        ibool           released_search_latch;

#ifdef UNIV_SYNC_DEBUG
        ut_ad(rw_lock_own(&btr_search_latch, RW_LOCK_EX));
#endif /* UNIV_SYNC_DEBUG */
        ut_ad(!btr_search_enabled);

        do {
                buf_chunk_t*    chunks  = buf_pool->chunks;
                buf_chunk_t*    chunk   = chunks + buf_pool->n_chunks;

                released_search_latch = FALSE;

                while (--chunk >= chunks) {
                        buf_block_t*    block   = chunk->blocks;
                        ulint           i       = chunk->size;

                        for (; i--; block++) {
                                /* block->is_hashed cannot be modified
                                when we have an x-latch on btr_search_latch;
                                see the comment in buf0buf.h */

                                if (buf_block_get_state(block)
                                    != BUF_BLOCK_FILE_PAGE
                                    || !block->is_hashed) {
                                        continue;
                                }

                                /* To follow the latching order, we
                                have to release btr_search_latch
                                before acquiring block->latch. */
                                rw_lock_x_unlock(&btr_search_latch);
                                /* When we release the search latch,
                                we must rescan all blocks, because
                                some may become hashed again. */
                                released_search_latch = TRUE;

                                rw_lock_x_lock(&block->lock);

                                /* This should be guaranteed by the
                                callers, which will be holding
                                btr_search_enabled_mutex. */
                                ut_ad(!btr_search_enabled);
[21 Sep 2011 6:44] Marko Mäkelä
Mark, I believe the main problem is the call to btr_search_drop_page_hash_index(block), which computes a "fold value" for every record in every page that has ever been in the adaptive hash index.

When we drop an entire hash index, we can certainly drop it without removing each record individually. This change should not be too risky for 5.1 and 5.5.

Side note: I have been working on reducing btr_search_latch contention, and I noticed that we can have block->is_hashed=TRUE for pages even after all AHI references to them have been removed.
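
To illustrate the cost difference, here is a minimal toy sketch in plain C. It is not the actual patch or any InnoDB code; all names (toy_fold, drop_per_record, drop_whole_table, struct arena) are made up for the example, with the arena standing in for the AHI memory heaps and toy_fold standing in for the per-record fold computation. Dropping the index record by record needs a fold computation and a hash lookup per record, while dropping the whole table only needs the cell array cleared and the underlying memory released in one sweep.

/* Toy model, not InnoDB code: compare per-record removal against
   dropping the whole hash table in one sweep. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct node  { struct node *next; unsigned long fold; };
struct table { struct node **cells; size_t n_cells; };
struct arena { struct node *pool; size_t used; size_t cap; };

/* Stand-in for the per-record fold computation. */
static unsigned long toy_fold(unsigned long key)
{
        return key * 2654435761UL;
}

static void insert(struct table *t, struct arena *a, unsigned long key)
{
        struct node *n = &a->pool[a->used++];
        n->fold = toy_fold(key);
        n->next = t->cells[n->fold % t->n_cells];
        t->cells[n->fold % t->n_cells] = n;
}

/* Slow path: recompute each key's fold, look up its cell and unlink the
   entry, one record at a time -- a hash operation per record, which is
   what makes the shutdown path so expensive. */
static void drop_per_record(struct table *t, const unsigned long *keys, size_t n)
{
        size_t i;
        for (i = 0; i < n; i++) {
                unsigned long   fold = toy_fold(keys[i]);
                struct node     **pp = &t->cells[fold % t->n_cells];
                while (*pp) {
                        if ((*pp)->fold == fold) {
                                *pp = (*pp)->next;
                                break;
                        }
                        pp = &(*pp)->next;
                }
        }
}

/* Fast path: all entries live in one arena, so the whole index can be
   dropped by clearing the cell array and resetting the arena -- no
   per-record fold computation at all. */
static void drop_whole_table(struct table *t, struct arena *a)
{
        memset(t->cells, 0, t->n_cells * sizeof *t->cells);
        a->used = 0;
}

int main(void)
{
        enum { N_CELLS = 1024, N_KEYS = 100000 };
        struct table    t = { calloc(N_CELLS, sizeof(struct node *)), N_CELLS };
        struct arena    a = { malloc(N_KEYS * sizeof(struct node)), 0, N_KEYS };
        unsigned long   *keys = malloc(N_KEYS * sizeof *keys);
        size_t          i;

        for (i = 0; i < N_KEYS; i++) {
                keys[i] = i;
                insert(&t, &a, keys[i]);
        }
        drop_per_record(&t, keys, N_KEYS);      /* the expensive way */
        a.used = 0;                             /* reuse the arena */

        for (i = 0; i < N_KEYS; i++) {
                insert(&t, &a, i);
        }
        drop_whole_table(&t, &a);               /* the cheap way */

        printf("dropped %d entries both ways\n", N_KEYS);
        free(t.cells);
        free(a.pool);
        free(keys);
        return 0;
}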
[4 Oct 2011 22:23] Marko Mäkelä
I have ported my patch to the 5.1 plugin now. It should also speed up the following:

SET GLOBAL innodb_adaptive_hash_index=OFF;
[5 Jan 2012 18:46] John Russell
Added to changelog:

The process of deallocating the InnoDB Adaptive Hash Index was made
faster, both during shutdown and when turning off the AHI with the
statement:

SET GLOBAL innodb_adaptive_hash_index=OFF;
[13 Feb 2012 21:57] John Russell
The changelog entry went into 5.1.60, 5.5.18, and 5.6.4.