Bug #74472 ddl like add index is very slow in 5.7.5
Submitted: 21 Oct 2014 6:45 Modified: 29 Apr 2015 12:33
Reporter: zhai weixiang (OCA) Email Updates:
Status: Closed Impact on me:
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:5.7.5 OS:Any
Assigned to: CPU Architecture:Any

[21 Oct 2014 6:45] zhai weixiang
While adding index to a small table, a checkpoint is triggered.....And it takes unexpected long time to finish the DDL.

related backtrace:

1 io_submit(libaio.so.1),os_aio_linux_dispatch(os0file.cc:4450),os_aio_func(os0file.cc:4450),pfs_os_aio_func(os0file.ic:241),fil_io(os0file.ic:241),buf_dblwr_write_block_to_datafile(buf0dblwr.cc:787),buf_dblwr_flush_buffered_writes(buf0dblwr.cc:933),buf_dblwr_add_to_batch(buf0dblwr.cc:1015),buf_flush_write_block_low(buf0flu.cc:986),buf_flush_page(buf0flu.cc:986),buf_flush_try_neighbors(buf0flu.cc:1329),buf_flush_page_and_try_neighbors(buf0flu.cc:1329),buf_do_flush_list_batch(buf0flu.cc:1662),buf_flush_batch(buf0flu.cc:1662),buf_flush_do_batch(buf0flu.cc:1662),buf_flush_lists(buf0flu.cc:1963),log_preflush_pool_modified_pages(log0log.cc:1383),log_make_checkpoint_at(log0log.cc:1383),row_merge_build_indexes(row0merge.cc:4264),ha_innobase::inplace_alter_table(handler0alter.cc:4462),ha_inplace_alter_table(handler.h:3242),mysql_inplace_alter_table(handler.h:3242),mysql_alter_table(sql_table.cc:8603),Sql_cmd_alter_table::execute(sql_alter.cc:315),mysql_execute_command(sql_parse.cc:4815),mysql_parse(sql_parse.cc:5427),dispatch_command(sql_parse.cc:1250),handle_connection(connection_handler_per_thread.cc:298),pfs_spawn_thread(pfs.cc:2137),start_thread(libpthread.so.0),clone(libc.so.6)

How to repeat:
1. start mysql with high concurrency DML workload..

root@test 02:43:45>create table t1 (a int primary key, b int);
Query OK, 0 rows affected (0.03 sec)

root@test 02:43:50>alter table t1 add key(b);
Query OK, 0 rows affected (14.91 sec)
Records: 0  Duplicates: 0  Warnings: 0

Suggested fix:
fix it...
[21 Oct 2014 6:59] zhai weixiang
edit Version
[21 Oct 2014 14:27] MySQL Verification Team

Thank you for reporting this issue. I have several questions, but will start here with the most important one.

Can you repeat the problem ??? If you run your test case 5 (five) times, does it repeat regularly all 5 times.

This could be a simple I/O saturation. Can you try repeating the problem with DMLs being run on the database residing on one disk, while ADD INDEX, or ALTER TABLE, is ran on the table in another database residing on a totally separate disk.

Thank you in advance.
[21 Oct 2014 14:41] MySQL Verification Team
This turned out to be a true bug, hence it is verified. No replies are needed from you.

There were some changes which lead to excessive flushing in 5.7, which resulted in the phenomena that you observed. As the consequence of these changes lead to much slower DDL operations, this is bug is classified as performance enhancement.

Thank you for your contribution.
[21 Oct 2014 14:44] zhai weixiang
Thanks Sinisa.  I almost send my feedback :)  
Can you take a look of another bug with a similar reason:
[30 Dec 2014 7:41] Shaohua Wang
Posted by developer:
Sunny suggested two ways, and preferred #2.
1. Change the order of the pages on the flush list, give priority to the index pages
2. Add a function to flush the pages for a particular index/table
[30 Dec 2014 7:49] Marko Mäkelä
For the record, this problem was also mentioned in
Bug#72115 Very slow create/alter table with partitions and InnoDB
[30 Dec 2014 8:10] Marko Mäkelä
Posted by developer:
We have the strict correctness requirement that before enabling redo logging on some pages, these pages have to be flushed from the buffer pool to the data files.
Currently, we are doing this flushing by invoking log_make_checkpoint_at() in row_merge_build_indexes(). The log checkpoint will fulfill the correctness requirement, but it will also unnecessarily flush changes to unrelated tablespaces and to the redo log.

I think that we need some way to flush all pages that belong to a particular (space_id,index_id). One way of guaranteeing that would be to split the buf_pool->flush_list. Those index trees for which redo logging is temporarily disabled would be removed from the main flush_list, and put to a collection of flush lists that is indexed by (space_id,index_id).

Note that in WL#7277 InnoDB: Bulk Load for Create Index (which caused this performance regression by introducing the log checkpoints), all non-redo-logged pages always belong to some index tree, identified by (space_id,index_id). There exist also some common pages related to page allocation and bookkeeping, but the modifications of these pages are always covered by the redo log. Therefore we do not need to force flushing of all pages in a given space_id. (If we simply flushed all dirty pages belonging to a particular space_id, then we would have to be careful not to break the write-ahead-logging rules for those pages for which redo logging is already enabled.)

It seems to me that we can safely remove the log checkpoint calls and instead flush the relevant pages for all affected (space_id,index_id). This special flushing would be a cut-off point: after this point, the block will be moved to the buf_pool->flush_list, and redo logging must be enabled for all subsequent modifications.
[29 Apr 2015 12:33] Daniel Price
Fixed as of the upcoming 5.7.8, 5.8.0 releases, and here's the changelog entry:

The sorted index build feature introduced in MySQL 5.7.5 caused a
performance regression when adding an index to a small table. The
regression was due to excessive flushing triggered by a forced checkpoint
that occurs after the sorted index build. 

Thank you for the bug report.