Bug #82176 Reuse dummy index/dummy table to speedup crash recovery
Submitted: 9 Jul 2016 4:46      Modified: 11 Jul 2016 7:59
Reporter: zhai weixiang (OCA)   Email Updates:
Status: Verified                Impact on me: None
Category: MySQL Server: InnoDB storage engine   Severity: S3 (Non-critical)
Version: 5.7, 5.7.13            OS: Any
Assigned to:                    CPU Architecture: Any

[9 Jul 2016 4:46] zhai weixiang
Description:
While investigating the crash recovery of MySQL 5.7, I found that a noticeable share of CPU time is spent creating the dummy index and dummy table used while parsing/applying redo log records.

Quoted from the output of `perf record` during crash recovery:

+  12.42%  mysqld  mysqld               [.] recv_recover_page_func(unsigned long, buf_block_t*)
+  10.94%  mysqld  mysqld               [.] recv_add_to_hash_table(mlog_id_t, unsigned long, unsigned long, unsigned char*, unsigned char*, unsigned long, unsigned long)
-   6.12%  mysqld  libc-2.15.so         [.] __strcmp_sse42
   - __strcmp_sse42
      - 89.89% ut_allocator<unsigned char>::get_mem_key(char const*) const [clone .isra.31]
         - mem_heap_create_block_func(mem_block_info_t*, unsigned long, unsigned long)
            - 61.40% mem_heap_add_block(mem_block_info_t*, unsigned long)
               - 47.96% dict_mem_index_create(char const*, char const*, unsigned long, unsigned long, unsigned long)
                    mlog_parse_index(unsigned char*, unsigned char const*, unsigned long, dict_index_t**)
                  + recv_parse_or_apply_log_rec_body(mlog_id_t, unsigned char*, unsigned char*, unsigned long, unsigned long, bool, buf_block_t*, mtr_t*)
               - 34.13% dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)
                    mlog_parse_index(unsigned char*, unsigned char const*, unsigned long, dict_index_t**)
                  + recv_parse_or_apply_log_rec_body(mlog_id_t, unsigned char*, unsigned char*, unsigned long, unsigned long, bool, buf_block_t*, mtr_t*)
               + 15.69% mem_heap_strdup(mem_block_info_t*, char const*)
               + 2.22% rec_get_offsets_func(unsigned char const*, dict_index_t const*, unsigned long*, unsigned long, mem_block_info_t**)
            - 15.48% dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)
                 mlog_parse_index(unsigned char*, unsigned char const*, unsigned long, dict_index_t**)
               + recv_parse_or_apply_log_rec_body(mlog_id_t, unsigned char*, unsigned char*, unsigned long, unsigned long, bool, buf_block_t*, mtr_t*)
            + 13.73% btr_cur_parse_update_in_place(unsigned char*, unsigned char*, unsigned char*, page_zip_des_t*, dict_index_t*)
            - 9.39% dict_mem_index_create(char const*, char const*, unsigned long, unsigned long, unsigned long)
                 mlog_parse_index(unsigned char*, unsigned char const*, unsigned long, dict_index_t**)
               + recv_parse_or_apply_log_rec_body(mlog_id_t, unsigned char*, unsigned char*, unsigned long, unsigned long, bool, buf_block_t*, mtr_t*)
      - 10.03% ut_allocator<unsigned char>::get_mem_key(char const*) const [clone .isra.25]
           ut_allocator<unsigned char>::allocate(unsigned long, unsigned char const*, char const*, bool, bool) [clone .constprop.101]
           dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)
           mlog_parse_index(unsigned char*, unsigned char const*, unsigned long, dict_index_t**)
         + recv_parse_or_apply_log_rec_body(mlog_id_t, unsigned char*, unsigned char*, unsigned long, unsigned long, bool, buf_block_t*, mtr_t*)

These costs are unnecessary: the dummy index and dummy table could be reused, and only a few variables need to be reset before each use.

How to repeat:
Kill the server under a heavy workload, restart it, and then check the output of `perf record` during crash recovery.

Suggested fix:
Use something like std::multimap<int, dict_index_t*> to maintain the created dummy indexes, keyed by the number of columns of the dummy index (the dummy table can be reached via dummy_index->table).

A newly created dummy index is added to the container. Then, when one is needed while parsing/applying the redo log, remove it from the container and reset a few fields of the index (and of its table) to zero:

dict_index_t::n_def
dict_index_t::type
dict_index_t::n_nullable
dict_table_t::n_def

Also, it is not necessary to add the system columns again for a reused dummy table; it is enough to restore dict_table_t::n_def += DATA_N_SYS_COLS, since the column objects already exist. A rough sketch of this approach follows.
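A minimal sketch of the suggested cache, not a patch against the InnoDB source: dummy_table_t/dummy_index_t are simplified stand-ins for dict_table_t/dict_index_t that model only the fields listed above, and get_dummy_index()/release_dummy_index()/N_SYS_COLS are hypothetical names chosen for illustration. The real change would live around mlog_parse_index().

#include <cstddef>
#include <map>

// Simplified stand-ins for dict_table_t / dict_index_t; only the fields this
// report says must be reset are modelled here.
struct dummy_table_t {
  size_t n_def = 0;            // stands in for dict_table_t::n_def
};

struct dummy_index_t {
  dummy_table_t* table = nullptr;
  size_t n_def = 0;            // stands in for dict_index_t::n_def
  unsigned type = 0;           // stands in for dict_index_t::type
  size_t n_nullable = 0;       // stands in for dict_index_t::n_nullable
  size_t n_cols = 0;           // column count the pair was originally built for
};

// Cache of idle dummy indexes keyed by the number of columns they were created
// with; the dummy table is reachable through index->table.
static std::multimap<size_t, dummy_index_t*> dummy_index_cache;

// Hypothetical stand-in for DATA_N_SYS_COLS (number of InnoDB system columns).
static const size_t N_SYS_COLS = 3;

// Get a reusable dummy index for n_cols columns, or build a new pair if the
// cache has none.  Reused objects are reset instead of being rebuilt.
dummy_index_t* get_dummy_index(size_t n_cols) {
  auto it = dummy_index_cache.find(n_cols);
  if (it == dummy_index_cache.end()) {
    // Slow path: the equivalent of dict_mem_table_create + dict_mem_index_create.
    dummy_index_t* index = new dummy_index_t();
    index->table = new dummy_table_t();
    index->n_cols = n_cols;
    index->table->n_def = N_SYS_COLS;   // system columns are added only once
    return index;
  }
  dummy_index_t* index = it->second;
  dummy_index_cache.erase(it);
  // Reset the fields listed above; no need to add system columns again, only
  // restore the table's column count.
  index->n_def = 0;
  index->type = 0;
  index->n_nullable = 0;
  index->table->n_def = N_SYS_COLS;
  return index;
}

// Return a dummy index to the cache once the redo record has been applied.
void release_dummy_index(dummy_index_t* index) {
  dummy_index_cache.emplace(index->n_cols, index);
}

The sketch only shows the lookup/reset/return flow; in the server the cache would also need the same lifetime and locking considerations as the other recovery-time structures.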
[11 Jul 2016 7:59] MySQL Verification Team
Hello Zhai,

Thank you for the report and feedback.

Thanks,
Umesh