Bug #57583 fast index create not used during "alter table foo engine=innodb"
Submitted: 19 Oct 2010 21:12 Modified: 2 Oct 2013 17:47
Reporter: Mark Callaghan Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DDL Severity:S4 (Feature request)
Version:5.1.50, 5.5.7 OS:Any
Assigned to: CPU Architecture:Any
Tags: create, fast, INDEX, innodb

[19 Oct 2010 21:12] Mark Callaghan
Description:
I want rebuild all secondary indexes on a table without specifying DDL for each index. It is much easier to run "alter table foo engine=innodb" for each table than to drop/add each index. However when I run "alter table foo engine=innodb" it does not appear to use fast index creation based on the space used by the table and reported in SHOW TABLE STATUS for the index.

How to repeat:
create table rt (i int primary key auto_increment, j float) engine=innodb;
insert into rt values (null, 1);
create index x2 on rt(j);

# run this enough times to put 2M rows in the table
insert into rt select null, rand(0) from rt;

# for me data_length ~62M and index_length ~50M
show table status;

alter table rt engine=innodb;

# for me data_length ~62M and index_length ~50M
show table status;

alter table rt drop index x2;
create index x2 on rt(j);

# for me data_length ~62M and index_length ~#30M
show table status;

Suggested fix:
Provide option to do fast index create in this case.
[19 Oct 2010 21:27] Mark Callaghan
OPTIMIZE TABLE FOO appears to be the same as "alter table foo engine=innodb" meaning it does not do fast index creation
[20 Oct 2010 4:06] Valeriy Kravchuk
Same story with 5.5:

mysql> show table status like 'rt'\G
*************************** 1. row ***************************
           Name: rt
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 1049050
 Avg_row_length: 30
    Data_length: 32047104
Max_data_length: 0
   Index_length: 25739264
      Data_free: 5242880
 Auto_increment: 1376221
    Create_time: 2010-10-20 06:26:32
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options: 
        Comment: 
1 row in set (0.01 sec)

mysql> alter table rt engine=innodb;
Query OK, 1048576 rows affected (52.62 sec)
Records: 1048576  Duplicates: 0  Warnings: 0

mysql> show table status like 'rt'\G
*************************** 1. row ***************************
           Name: rt
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 1049050
 Avg_row_length: 30
    Data_length: 32047104
Max_data_length: 0
   Index_length: 25739264
      Data_free: 5242880
 Auto_increment: 1376221
    Create_time: 2010-10-20 06:58:18
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options: 
        Comment: 
1 row in set (0.01 sec)

mysql> alter table rt drop index x2;
Query OK, 0 rows affected (0.57 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> create index x2 on rt(j);
Query OK, 0 rows affected (20.64 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show table status like 'rt'\G
*************************** 1. row ***************************
           Name: rt
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 1049050
 Avg_row_length: 30
    Data_length: 32047104
Max_data_length: 0
   Index_length: 16302080
      Data_free: 27262976
 Auto_increment: 1376221
    Create_time: 2010-10-20 07:00:04
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options: 
        Comment: 
1 row in set (0.01 sec)

mysql> select version();
+----------------+
| version()      |
+----------------+
| 5.5.7-rc-debug |
+----------------+
1 row in set (0.00 sec)
[1 Oct 2013 16:34] Arnaud Adant
I filed a similar FR that is now closed :

Bug 11872643 - ALTER TABLE ADD / DROP / MODIFY COLUMN PERFORMANCE

I retried in 5.6.14 using this data :

insert into rt values (null, 1);
insert into rt select null, rand(0) from rt;

replace into rt(i,j) select null, rand(0) from rt t1, rt t2, rt t3, rt t4, rt t5, rt t6, rt t7, rt t8, rt t9, rt t10, rt t11, rt t12, rt t13, rt t14, rt t15, rt t16, rt t17, rt t18, rt t19, rt t20, rt t21;

bare table without index :

-rw-rw---- 1 aadant common 62914560 Oct  1 17:38 data/test_rt/rt.ibd

after fast index creation : alter table rt add key x2(j);

-rw-rw---- 1 aadant common 96468992 Oct  1 17:40 data/test_rt/rt.ibd

after  table rebuild : alter table rt engine=InnoDB;

-rw-rw---- 1 aadant common 121634816 Oct  1 17:43 data/test_rt/rt.ibd

after row_format=Compact :  alter table rt row_format=Compact;

-rw-rw---- 1 aadant common 96468992 Oct  1 17:46 data/test_rt/rt.ibd

So it seems that the problem is still theee unless you use this workaround :

alter table rt row_format=Compact;

We only need to use this existing implementation for optimize and alter table engine=InnoDB in 5.6 ...
[1 Oct 2013 19:47] Mark Callaghan
What part of the manual to I need to revisit? Looks like fast index create is used for:
alter table foo row_format = ...

but not used for:
alter table foo engine=innodb;

and not used for:
alter table foo engine=innodb row_format=...
[2 Oct 2013 10:05] Arnaud Adant
>What part of the manual to I need to revisit? Looks like fast index create is >used for:
>alter table foo row_format = ...

http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html

"change row_format property" indicates that it is a full online operation.

Then this page gives details on secondary index creation :

http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-implementation.html

"The B-tree is then built in key-value order, which is more efficient than inserting rows into an index in random order. Because the B-tree nodes are split when they fill, building the index in this way results in a higher fill-factor for the index, making it more efficient for subsequent access."

This is also why innodb_sort_buffer_size matters.

Please note that there is a limitation that is well documented :

http://dev.mysql.com/doc/refman/5.7/en/innodb-create-index-limitations.html

"When running an online ALTER TABLE operation, the thread that runs the ALTER TABLE operation will apply an “online log” of DML operations that were run concurrently on the same table from other connection threads. When the DML operations are applied, it is possible to encounter a duplicate key entry error (ERROR 1062 (23000): Duplicate entry), even if the duplicate entry is only temporary and would be reverted by a later entry in the “online log”. This is similar to the idea of a foreign key constraint check in InnoDB in which constraints must hold during a transaction"

It can occur when the PK or an UK is modified during the online DDL.

>but not used for:
>alter table foo engine=innodb;
>and not used for:
>alter table foo engine=innodb row_format=...

Correct, those operations are not listed here :

http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html
[2 Oct 2013 10:22] Marko Mäkelä
This is not really an InnoDB bug. The attributes are evaluated outside InnoDB.

I just tested this in 5.6 (and it should be the same in 5.7):

create table t(a serial)engine=innodb;
alter table t engine=innodb, lock=none;
ERROR 1846 (0A000): LOCK=NONE is not supported. Reason: COPY algorithm requires a lock. Try LOCK=SHARED.

The method ha_innobase::check_if_supported_inplace_alter() is not even being invoked, so InnoDB is not getting any say in this.

In my opinion, there should be an additional Alter_inplace_info::HA_ALTER_FLAGS for specifying that we simply want to rebuild the table.
[2 Oct 2013 10:27] Arnaud Adant
Other ways to check if online DDL is used : using the lock option and checking the number of rows affected.

alter table rt engine=innodb, lock=none;
ERROR 1846 (0A000): LOCK=NONE is not supported. Reason: COPY algorithm requires a lock. Try LOCK=SHARED.

alter table rt engine=innodb row_format=Compact, lock=none;
ERROR 1846 (0A000): LOCK=NONE is not supported. Reason: COPY algorithm requires a lock. Try LOCK=SHARED.

alter table rt row_format=Compact, lock=none;
Query OK, 0 rows affected (47.25 sec)          <== 0 rows affected
Records: 0  Duplicates: 0  Warnings: 0

alter table rt engine=InnoDB, lock=exclusive;
Query OK, 2097156 rows affected (52.22 sec)
Records: 2097156  Duplicates: 0  Warnings: 0
[2 Oct 2013 17:47] Mark Callaghan
Thanks for even more details. But http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html says that inplace will be used when row_format is changed. In these cases row_format is specified but the new value is the same as the old value. Maybe the docs should be updated to make that clear.
[2 Oct 2013 18:50] Arnaud Adant
Mark,

Yes, you read correctly. The alter table row_format should be immediate according to the manual. Marko's comment is about that.

But ... since it actually does a rebuild, it makes sense to do it every time.
Just like alter table engine=InnoDB is useful even if the engine is already InnoDB. I raised the question internally and I will wait for the reply before filing a documentation bug.
[10 Feb 2014 11:54] Marko Mäkelä
Note: the WL#5534 (ALGORITHM=INPLACE) code reports "0 rows affected", no matter how many rows there are in the table, even when the storage engine is rebuilding the table. It is an easy way to check if ALGORITHM=COPY was used if you do not want to add an explicit ALGORITHM=INPLACE clause (or LOCK=NONE, which implies it).

However, if the table is empty, both ALGORITHM=COPY and ALGORITHM=INPLACE will report "0 rows affected".