Bug #70139 Performance of "ALTER TABLE..." queries
Submitted: 23 Aug 2013 16:03 Modified: 23 Jun 2014 9:17
Reporter: Egor Korobitsin Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB Plugin storage engine Severity:S5 (Performance)
Version:5.1.72 OS:Windows
Assigned to: CPU Architecture:Any
Tags: performance

[23 Aug 2013 16:03] Egor Korobitsin
Description:
The "ALTER TABLE" queries take much more time then it was in earlier versions.

How to repeat:
I've installed MySQL Server 5.6.13 on Windows7 and MySQL Server 5.0.51b on the another Windows7 on another machine. Through Mysql Command Line Cline I've run next queries:

create table test;
use table test;
create table test_table (id int, name varchar(50));
alter table test_table add column additional_column int;
alter table test_table modify column additional_column varchar(40);
alter table test_table drop column additional_column;

These three "alter table" queries take very different time in different versions:
MySQL Server 5.6.13, in seconds:  0.35 0.48, 0.39
MySQL Server 5.0.51b, in seconds: 0.05 0.03 0.03

I've prepared few file with settings of my MySQL Server 5.6.13:

SHOW VARIABLES LIKE 'innodb%'; - http://newproject72.bissnes.net/variables_innodb.csv 
SHOW VARIABLES LIKE '%buffer%'; - http://newproject72.bissnes.net/variables_buffer.csv 
SHOW VARIABLES LIKE 'sync%'; - http://newproject72.bissnes.net/variables_sync.csv 

In both servers the setting "innodb_flush_log_at_trx_commit=1" is set to "1".
[23 Aug 2013 16:47] MySQL Verification Team
This assumed to be already fixed:

http://bugs.mysql.com/bug.php?id=69316

Your MySQL connection id is 4
Server version: 5.6.14 Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 5.6 > use test
Database changed
mysql 5.6 > create table test_table (id int, name varchar(50));
Query OK, 0 rows affected (0.31 sec)

mysql 5.6 > alter table test_table add column additional_column int;
Query OK, 0 rows affected (0.41 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql 5.6 > alter table test_table modify column additional_column varchar(40);
Query OK, 0 rows affected (0.56 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql 5.6 > alter table test_table drop column additional_column;
Query OK, 0 rows affected (0.53 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql 5.6 > exit
Bye

C:\dbs>net start mysqld55
The MySQLD55 service is starting..
The MySQLD55 service was started successfully.

C:\dbs>55

C:\dbs>c:\dbs\5.5\bin\mysql -uroot --port=3550 --prompt="mysql 5.5 > "
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.34 Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 5.5 > use test
Database changed
mysql 5.5 > create table test_table (id int, name varchar(50));
Query OK, 0 rows affected (0.06 sec)

mysql 5.5 > alter table test_table add column additional_column int;
Query OK, 0 rows affected (0.19 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql 5.5 > alter table test_table modify column additional_column varchar(40);
Query OK, 0 rows affected (0.13 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql 5.5 > alter table test_table drop column additional_column;
Query OK, 0 rows affected (0.13 sec)
Records: 0  Duplicates: 0  Warnings: 0
[26 Aug 2013 7:45] Marko Mäkelä
ALGORITHM=COPY seems to be faster (on an empty table) than ALGORITHM=INPLACE.
I ran a test with a non-debug version of 5.5 and trunk, consisting of

create table test_table (id int, name varchar(50)) engine=innodb;
set old_alter_table=1; -- or 0 for trunk
alter table test_table add column additional_column int;
alter table test_table modify column additional_column varchar(40);
alter table test_table drop column additional_column;
... the 3 ALTER TABLE repeated 101 times
drop table test_table;

I ran the server on RAM disk, with page checksums disabled. Here are my results:

5.5: 0m1.141s
trunk: 0m0.402s with old_alter_table=1 (using ALGORITHM=COPY)
trunk: 0m2.277s with old_alter_table=0 (using ALGORITHM=INPLACE)

So, there clearly is some regression from ALGORITHM=INPLACE. The profiler gives me this primary suspect:

-   8,07%  mysqld  lfind
   - lfind
      - 88,27% lf_hash_search
         + 97,48% find_or_create_file(PFS_thread*, PFS_file_class*, char const*, unsigned int, bool)
         + 2,52% find_or_create_table_share(PFS_thread*, bool, TABLE_SHARE const*)

This is the busiest function when using old_alter_table=0 in trunk. The second place is held by the InnoDB internal SQL lexical scanner yylex(), called by que_eval_sql() to create, rename, or drop tables in the InnoDB internal data dictionary.

With old_alter_table=1 (ALGORITHM=COPY) in trunk, the busiest function is the InnoDB yylex(), at 7.74%. With ALGORITHM=INPLACE, the test takes 5 times the time, and the relative share of yylex() drops to 5.43%.

NOTE: I disabled a function in trunk, to work around Bug#69802:

=== modified file 'storage/innobase/dict/dict0stats.cc'
--- storage/innobase/dict/dict0stats.cc	revid:marko.makela@oracle.com-20130814113409-y9ifib5vrvuyhdb4
+++ storage/innobase/dict/dict0stats.cc	2013-08-26 07:22:11 +0000
@@ -173,6 +173,7 @@ dict_stats_persistent_storage_check(
 	bool	caller_has_dict_sys_mutex)	/*!< in: true if the caller
 						owns dict_sys->mutex */
 {
+#if 0
 	/* definition for the table TABLE_STATS_NAME */
 	dict_col_meta_t	table_stats_columns[] = {
 		{"database_name", DATA_VARMYSQL,
@@ -265,6 +266,9 @@ dict_stats_persistent_storage_check(
 	/* else */
 
 	return(true);
+#else
+	return(false);
+#endif
 }
 
 /*********************************************************************//**
[26 Aug 2013 8:55] Marko Mäkelä
I tried disabling the performance_schema in mysql-trunk, and it made little difference.

Now, old_alter_table=1 gives me 0m0.356s and 5.46% in yylex().
With old_alter_table=0, I get 0m2.519s and 5.67% in yylex(),
if looking at the mysqld executable only.

In the system-wide performance statistics, the penalty seems to come from the Linux kernel:

+   6,81%  mysqld  [kernel.kallsyms]    [k] copy_user_generic_unrolled
+   4,54%  mysqld  [kernel.kallsyms]    [k] clear_page
+   3,19%  mysqld  libc-2.13.so         [.] __memcpy_ssse3
+   2,69%  mysqld  [kernel.kallsyms]    [k] page_fault
+   2,39%  mysqld  mysqld               [.] yylex

row_log_allocate() and row_log_free() account for some of this. If I add LOCK=SHARED or LOCK=EXCLUSIVE (to skip the extra logging needed for online operation) to my script, the execution time drops to 2.1 seconds.

The rest is likely explained by row_merge_sort() and row_merge_insert_index_tuples() wanting to operate via a temporary file, even though all data fits in the sort buffer (because it is an empty table).

This is a bug that can be fixed. The performance bug has actually been there since InnoDB Plugin. This sorting code was previously only used for creating indexes, but in 5.6 it will also handle stuff like ADD/DROP COLUMN.
[26 Aug 2013 11:55] Marko Mäkelä
I repeated this regression in the InnoDB Plugin of MySQL 5.1 as well:

0m0.700s with old_alter_table=1
0m2.261s with old_alter_table=0

I used different DDL, which is supported by the InnoDB Plugin and MySQL 5.5:

create table test_table (id int, name varchar(50)) engine=innodb;
set old_alter_table=0; -- or 1
alter table test_table add index i(id), add index n10(name(10)), add index n20(name(20)), add index n30(name(30)), add index n40(name(30));
alter table test_table drop index i, drop index n10, drop index n20, drop index n30, drop index n40;
-- repeat the two ALTER 155 times
drop table test_table;
[26 Aug 2013 12:01] Marko Mäkelä
Note that I had to use multiple ADD INDEX in a single ALTER, because a single CREATE INDEX would actually be faster in the new API than the table-copying.
[23 Jun 2014 9:17] Erlend Dahl
[20 Jun 2014 4:58] Daniel T Price

Fixed as of 5.6.20, 5.7.5 and here's the changelog entry:

In-place "ALTER TABLE" operations requiring a table rebuild would sort
the clustered index even though the primary key order remained unchanged.
This behavior caused unnecessary temporary table usage and I/O.