Bug #17229 import procedure is slower in 5.0 by 50% than in 4.0
Submitted: 8 Feb 2006 9:59
Modified: 5 Oct 2007 22:24
Reporter: Vadim Tkachenko
Status: Duplicate
Impact on me: None
Category: MySQL Server
Severity: S5 (Performance)
Version: 4.0.26, 5.0.18
OS: Linux (RedHat AS 3.0)
Assigned to: Marc ALFF
CPU Architecture: Any

[8 Feb 2006 9:59] Vadim Tkachenko
Description:
Importing a big file into 4.0 takes 13m17s:
time mysql test < importtab.sql
real    13m17.845s
user    0m28.470s
sys     0m6.940s    

In 5.0 it takes 19m59s:

time mysql test < importtab.sql
real    19m59.276s
user    1m10.720s
sys     0m7.330s
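
The "slower by 50%" figure in the title follows from the two wall-clock times above; a quick arithmetic check (a sketch, not part of the original report):

```python
# Wall-clock import times reported above, in seconds.
t_40 = 13 * 60 + 17.845   # 4.0: 13m17.845s
t_50 = 19 * 60 + 59.276   # 5.0: 19m59.276s

# Relative regression of 5.0 vs. 4.0.
slowdown = (t_50 - t_40) / t_40
print(f"5.0 is {slowdown:.1%} slower than 4.0")  # → 5.0 is 50.3% slower than 4.0
```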

How to repeat:
1. Create the table (note: an AUTO_INCREMENT column cannot also have a DEFAULT value, so the default is dropped here):
CREATE TABLE IF NOT EXISTS `importtab` (
    `uID` int(11) NOT NULL AUTO_INCREMENT,
    `rID` int(11) NOT NULL default '0',
    `aF` int(11) default NULL,
    PRIMARY KEY (`uID`, `rID`)
);
2. Fill the table:
a) INSERT INTO importtab (rID, aF) VALUES (FLOOR((RAND() * 100)), FLOOR((RAND() * 100)));
b) INSERT INTO importtab (rID, aF) SELECT FLOOR((RAND() * 100)), aF FROM importtab;
Repeat step b) 26 times; each run doubles the row count, giving 2^26 = 67,108,864 rows.
3. Remove the AUTO_INCREMENT attribute:
alter table importtab change uID uID  int(11) NOT NULL default '0';
4. Dump the table:
mysqldump -l --add-locks --extended-insert --quick --all --disable-keys test importtab > importtab.sql
5. Import the dump into both 4.0 and 5.0:
time mysql test < importtab.sql
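
The population in step 2 grows the table by INSERT ... SELECT-ing it into itself, doubling the row count on every run of b). A small sketch of that growth (assumed arithmetic, not part of the report):

```python
# Step a) seeds one row; each run of step b) inserts a copy of every
# existing row, so the table doubles on each iteration.
rows = 1
for _ in range(26):
    rows *= 2
print(rows)  # → 67108864, i.e. 2**26
```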

mysqld params:
--port=3306 \
--socket=/tmp/mysql.sock \
--user=root --max_connections=3000 \
--max_connect_errors=10 \
--table_cache=2048 \
--max_allowed_packet=1048576 \
--binlog_cache_size=1048576 \
--max_heap_table_size=67108864 \
--sort_buffer_size=512K \
--join_buffer_size=1048576 \
--key_buffer=1G \
--thread_cache=16 \
--thread_concurrency=16 \
--thread_stack=196K \
--query_cache_size=0 \
--ft_min_word_len=4 \
--default_table_type=MYISAM \
--transaction_isolation=REPEATABLE-READ \
--tmp_table_size=64M \
--skip-locking \
--server-id=1 \
--skip-innodb

I'll place importtab.sql on an FTP server, so you can just use it.
[5 Oct 2007 22:24] Marc ALFF
Please see bug#29921, which is another report about the same performance issue.

Normally, the earlier bug report should be kept open when duplicates are found.
However, significant work had already been done under
- bug#29921
- bug#30237
- bug#30333
- bug#30625
before the present report was identified as a duplicate,
so bug#17229 will be closed instead.

Most of the performance regression has already been fixed (see related bugs),
please retest with a current 5.0 release.

Please use bug#29921 for tracking / documenting current performance results.