Bug #24495 | "create index" gets slower the more indexes there are | ||
---|---|---|---|
Submitted: | 22 Nov 2006 3:37 | Modified: | 27 Sep 2008 10:42 |
Reporter: | P Eger | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: DDL | Severity: | S5 (Performance) |
Version: | 5.1.14-BK, 5.1.12-beta,5.0.24a | OS: | Linux (Linux, Windows Server 2003 x64) |
Assigned to: | CPU Architecture: | Any |
[22 Nov 2006 3:37]
P Eger
[22 Nov 2006 4:14]
P Eger
I have confirmed this behaviour in Mysql 5.0.24a, same OS as above. This bug actually makes it quicker to create the indexes before the data is inserted, which is massively counterintuitive!
[22 Nov 2006 12:45]
Valeriy Kravchuk
Thank you for a problem report. Verified almost as described with 5.1.14-BK on Linux: openxs@suse:~/dbs/5.1> bin/mysql -uroot test Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.1.14-beta-debug Source distribution Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> create table t401(c0 BINARY(16) not null,c1 int,c2 int,c3 BINARY(16),c4 -> smallint,c5 tinyint unsigned,c6 smallint,c7 int,c8 smallint,c9 tinyint -> unsigned,c10 tinyint unsigned,c11 tinyint unsigned,c12 int,c13 tinyint -> unsigned,c14 varchar(7),c15 int,c16 int,c17 BINARY(16),c18 tinyint unsig ned,c19 -> int,c20 int,c21 smallint,c22 varbinary(256) COMMENT 'C_GENERIC_URL',c23 -> smallint,c24 int,c25 smallint,c26 varchar(17),c27 tinyint unsigned,c28 i nt,c29 -> varbinary(256) COMMENT 'C_GENERIC_URL',c30 tinyint unsigned,c31 varchar( 20),c32 -> varbinary(51) COMMENT 'C_GENERIC_TEXT',c33 smallint,c34 varchar(50)) ENG INE = -> MyISAM; Query OK, 0 rows affected (0.02 sec) mysql> alter table t401 modify c0 int not null auto_increment primary key; Query OK, 0 rows affected (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> insert into t401(c1, c2, c4) values (rand(), rand(), rand()); Query OK, 1 row affected (0.00 sec) mysql> insert into t401(c1, c2, c4) select rand(), rand(), rand() from t401; Query OK, 1 row affected (0.01 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql> insert into t401(c1, c2, c4) select rand(), rand(), rand() from t401; Query OK, 2 rows affected (0.01 sec) Records: 2 Duplicates: 0 Warnings: 0 ... mysql> insert into t401(c1, c2, c4) select rand(), rand(), rand() from t401; Query OK, 131072 rows affected (4.07 sec) Records: 131072 Duplicates: 0 Warnings: 0 mysql> create index t401_i01 on t401(c1); Query OK, 262144 rows affected (10.13 sec) Records: 262144 Duplicates: 0 Warnings: 0 mysql> create index t401_i02 on t401(c2); Query OK, 262144 rows affected (11.92 sec) Records: 262144 Duplicates: 0 Warnings: 0 mysql> create index t401_i04 on t401(c4); Query OK, 262144 rows affected (13.98 sec) Records: 262144 Duplicates: 0 Warnings: 0 So, yes, each next index is created slower than previous one. It may be expected behaviour, because (see http://dev.mysql.com/doc/refman/5.1/en/create-index.html) table is really copied when CREATE INDEX is executed (as CREATE INDEX is mapped to ALTER TABLE). But, in any case, it is counter-intuitive and different from other RDBMSes, so, I still think it is a performance problem to be eventually solved (and/or explicitely documented).
[23 Nov 2006 1:44]
P Eger
Glad you can reproduce. This is a pretty big performance limiter for recreating a database/table from scratch. The sequence of 1) create new table 2) insert data bulk 3) create all indexes, is pretty standard method of bulk loading data into a table optimally. Can you recommend another method to eliminate the overhead of maintaining the indexes on every IUD? "load data infile" is a possible option, but may be difficult/impossible since we are accessing through jdbc and this would require shell access. Also, http://dev.mysql.com/doc/refman/5.0/en/insert-speed.html seems to imply that "myisamchk -rq /path/to/db/tbl_name" will recreate all indexes fast after doing a "myisamchk --keys-used=0 -rq". Is this implemented different/faster in myisamchk then in mysqld? Again, shelling to external scripts would be very sub-optimal for us (complexity, portability, error handling, etc), but if that is the only option i would give it a shot.
[28 Nov 2006 13:31]
Sergei Golubchik
The workaround is adding all indexes in one statement: ALTER TABLE xxx ADD INDEX ..., ADD INDEX ..., ADD INDEX ...
[28 Nov 2006 18:45]
Mark Matthews
Just a quick comment "LOAD DATA [LOCAL] INFILE" works fine with the JDBC driver, and in fact is a very efficient way of bulk loading tables, compared to crafting the INSERTs yourself.
[6 Dec 2006 3:56]
P Eger
Thanks guys, i have implemented both of these suggestions to good effect. alter table with a comma separated list does the trick for index creation & load data infile is far superior to even properly batched insert() statements. Still don't like the slow create index though ;-)
[27 Sep 2008 10:41]
Konstantin Osipov
This can only be fixed if the engine supports online add/drop index. MyISAM does support that, other engines move in that direction. There is no server bug, since the framework is in place.
[27 Sep 2008 10:42]
Konstantin Osipov
Please file a feature request against the engine you use to support online add/drop index. The server framework is in place starting from 5.1