Bug #24250 ALTER TABLE ADD PRIMARY KEY on InnoDB is so slow as to be unusable
Submitted: 13 Nov 2006 10:57 Modified: 4 Feb 2007 17:10
Reporter: Gisbert Selke (Basic Quality Contributor) Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:5.1.12 OS:Microsoft Windows (Win XP SP 2)
Assigned to: Assigned Account CPU Architecture:Any

[13 Nov 2006 10:57] Gisbert Selke
Description:
I have created an unindexed table of 1 million records (6 integer fields,
row length: 24 bytes) both with MyISAM and with InnoDB (details follow
below). I then added a primary key on one of the fields (key length: 4
bytes). The MyISAM table ALTER TABLE finished in 38 seconds, the InnoDB table took 8 hours 14 minutes 3 seconds. That is a factor of 780 more, and it makes
InnoDB tables practically unusable. I do appreciate that adding a
primary key involves several time-consuming tasks (checking uniqueness,
possibly redoredering records), and that InnoDB tables are more
complicated than MyISAM, but this does not explain the vast difference.

A mysqladmin PROCESSLIST showed that the system spends a great deal of
time in "copy to tmp table", although I have not watched this
systematically. According to the Windows Task Manager, the mysql daemon
used typically less than 10% of CPU and around 40 MB of memory.

(Needless to say, trying to insert the records into a table for which
indexes are already defined makes things even worse. I killed the insert
after more than 10 hours.)

As for the setup: This is a developer machine under Win XP SP2, 640 MB
physical memory. At the time of these tests, there were other processes
open, but they were not active and used under 1% of CPU. Also, there was
still more than 200 MB of free system memory, so swapping/thrashing
should not be the cause of the problem.

I know this looks like severity S5 (performance), but in fact it makes InnoDB tables unusable, so I think it is rather S1. -- This bug may be related to #24169.

How to repeat:
Here is the script together with the relevant timings:

-- *** Setup ***
use test;
-- cretae auxiliary table:
create table numbers (i int not null primary key);

-- procedure to fill auxiliary table:
drop procedure if exists numberfiller;
delimiter //
create procedure numberfiller(in n int)
begin
  declare i int;
  set i=1;
  truncate numbers;
  while i<=n do
    insert into test.numbers values(i);
    set i=i+1;
  end while;
end;
//
delimiter ;

-- fill the table: (timing is irreleveant)
call numberfiller(1000);

-- create a base table to aid in getting reproducible results:
drop table if exists tracking_base;
create table tracking_base (
  id int not null,
  day date not null,
  ad int not null,
  clicks int not null,
  impressions int not null,
  client int not null
) engine=myisam;

-- create the MyISAM version of the target table:
drop table if exists tracking_id_myisam;
create table tracking_id_myisam (
  id int not null,
  day date not null,
  ad int not null,
  clicks int not null,
  impressions int not null,
  client int not null
) engine=myisam;

-- create the InnoDB version of the target table:
drop table if exists tracking_id_inno;
create table tracking_id_inno (
  id int not null,
  day date not null,
  ad int not null,
  clicks int not null,
  impressions int not null,
  client int not null
) engine=innodb;

-- *** Here the timings start ***

mysql> insert into tracking_id_myisam   select * from tracking_base;
Query OK, 1000000 rows affected (4.39 sec)
Records: 1000000  Duplicates: 0  Warnings: 0

mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into tracking_id_inno     select * from tracking_base;
Query OK, 1000000 rows affected (3 min 4.10 sec)
Records: 1000000  Duplicates: 0  Warnings: 0

-- Note: the timings above are not "clean" due to buffering effects.

mysql> commit;
-- The commit returned immediately, however, the server still did some
-- background work as was clear from disk activity. Further tests
-- were started only after disk activity subsided.

-- *** Here is the timing for MyISAM ***
mysql> alter table tracking_id_myisam   add primary key (id);
Query OK, 1000000 rows affected (38.06 sec)
Records: 1000000  Duplicates: 0  Warnings: 0

-- *** Here is the timing for InnoDB ***
mysql> alter table tracking_id_inno     add primary key (id);
Query OK, 1000000 rows affected (8 hours 14 min 3.41 sec)
Records: 1000000  Duplicates: 0  Warnings: 0
-- *** End of timings ***

Just in case this is relevant, here's the server-related part of my
mysql config file, as reported by "my_print_defaults mysqld":
--port=3306
--basedir=c:/bin/MySQL/
--datadir=c:/bin/MySQL/Data/
--default-character-set=latin1
--default-storage-engine=MyISAM
--sql-mode=STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
--max_connections=20
--query_cache_size=8M
--table_cache=256
--tmp_table_size=32M
--thread_cache_size=8
--myisam_max_sort_file_size=100G
--myisam_max_extra_sort_file_size=100G
--myisam_sort_buffer_size=9M
--key_buffer_size=16M
--read_buffer_size=64K
--read_rnd_buffer_size=256K
--sort_buffer_size=2M
--innodb_data_file_path=ibdata1:10M:autoextend:max:4000M
--innodb_additional_mem_pool_size=2M
--innodb_flush_log_at_trx_commit=1
--innodb_log_buffer_size=1M
--innodb_buffer_pool_size=10M
--innodb_log_file_size=10M
--innodb_thread_concurrency=8
--innodb_file_per_table
--safe-user-create
--local-infile
--default-collation=latin1_german1_ci
[13 Nov 2006 15:47] Heikki Tuuri
Gisbert,

in MySQL-5.1.xx, certain engines support adding an index without rebuilding the entire table. Our developers are working on adding this feature to InnoDB.

Though, note that since the primary key is the clustering key of an InnoDB table, a full rebuild of of the table is required also in the future. For an InnoDB table you should decide the primary key early on, before inserting a lot of data into the table.

You should also check that your my.cnf is as recommended at http://dev.mysql.com/doc/refman/5.0/en/innodb-configuration.html 

I am changing this report as an InnoDB feature request.

Regards,

Heikki
[13 Nov 2006 16:28] Gisbert Selke
Sorry, but I don't think this is only a feature request. As my original posting says:

Needless to say, trying to insert the records into a table for which
indexes are already defined makes things even worse. I killed the insert
after more than 10 hours.

That means in practice that there is no way for me to get a table of 1 million records up in InnoDB, so this table type is completely unusable with 5.1.12 except for very small tables.

Please, note that there are other entries that indicate similar problems.
[14 Nov 2006 13:12] Heikki Tuuri
Gisbert,

you should also tune the InnoDB buffer pool size and InnoDB log file sizes according to the instructions in the manual.

Regards,

Heikki
[14 Nov 2006 13:45] Heikki Tuuri
Gisbert,

I populated tracking_base with rows where id increases from 1 to 1000000. I used the same buffer pool size and InnoDB log file size as you Gisbert.

I tested on a 2 GHz AMD Windows XP computer. In my computer InnoDB processed the ALTER in 36 seconds.

What could be the reason for bad performance in your test? Were the id's in a random order in your test?

Regards,

Heikki

C:\mysql-noinstall-5.1.12-beta-win32\mysql-5.1.12-beta-win32\bin>mysql -uroot te
st
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.1.12-beta-community-nt-log MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> grant all on all to 'ODBC'@'localhost';
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use near 'all t
o 'ODBC'@'localhost'' at line 1
mysql> grant all on *.* to 'ODBC'@'localhost';
Query OK, 0 rows affected (0.09 sec)

mysql> drop table if exists tracking_id_inno;
Query OK, 0 rows affected, 1 warning (0.02 sec)

mysql> create table tracking_id_inno (
    ->   id int not null,
    ->   day date not null,
    ->   ad int not null,
    ->   clicks int not null,
    ->   impressions int not null,
    ->   client int not null
    -> ) engine=innodb;
Query OK, 0 rows affected (0.25 sec)

mysql>
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into tracking_id_inno     select * from tracking_base;
Query OK, 1000000 rows affected (34.97 sec)
Records: 1000000  Duplicates: 0  Warnings: 0

mysql> commit;
Query OK, 0 rows affected (0.09 sec)

mysql> alter table tracking_id_inno     add primary key (id);
Query OK, 1000000 rows affected (35.91 sec)
Records: 1000000  Duplicates: 0  Warnings: 0

mysql>
[14 Nov 2006 14:24] Gisbert Selke
Heikki, thank you for your quick feedback!
I will test again with tuned parameters (both from the manual ond hints we got from a MySQL performancae and optimizing workshop).
As for the resulkts: I just noticed that I forgot to post the part of the script that did the filling. I will augment later this week (I'll be out of office for two days). Sorry about this!
Your timing result looks marvellous. I would be more than happy to see this! (Btw, you *are* using 5.1.12?)
[14 Nov 2006 14:54] Heikki Tuuri
Gisbert,

yes, it is 5.1.12-beta-community-nt.

Note that if the id's are in a random order, then there will be lots of random disk I/O and the operation will be MUCH slower. Fortunately, usually generated id's are in an ascending order.

Regards,

Heikki
[22 Dec 2006 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[22 Dec 2006 11:20] Gisbert Selke
I will run the tests again with the current beta release, and on different machines.

Btw, I wouldn't count on primary keys coming sorted on input. That would assume a lot about the environment in which the database is used.
[5 Feb 2007 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".