Bug #8995 Extremely efficient code for "Repair with KeyCache"
Submitted: 6 Mar 2005 16:03 Modified: 6 Mar 2005 18:52
Reporter: Mike MySQLAB Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S3 (Non-critical)
Version:4.1.10 OS:Windows (Win XP)
Assigned to: Sergei Golubchik CPU Architecture:Any

[6 Mar 2005 16:03] Mike MySQLAB
Description:
I'm not sure if you want to log this as a bug or as a Feature Request.
But the code used to rebuild indexes with "Alter Table enable keys" is extremely inefficient to the point of being useless on large MyISAM tables (hundreds of millions of rows). 

Even with:
myisam_max_sort_file_size=100g
myisam_max_extra_sort_file_size=100g
myisam_sort_buffer_size=80M
key_buffer_size=64M
bulk_insert_buffer_size=32M

did not speed it up any. 

Running on an AMD 3500+ with multiple 7200RPM drives with MyISAM tables.

How to repeat:
Create a CSV text file with 500 million rows of data. 

1) Alter table x disable keys
2) Load Data Infile ...
3) Alter table x enable keys

Step 2 takes only 6 hours so I have no problem with that. Step 3 took over 49 hours before I killed it. For the first 7 hours step 3 used "repair with filesort" then for some reason it switched to "repair with keycache".

The table had 4 indexes:

Primary (Integer AutoInc, Not NULL);
Main_Index (Text17, Text2, Date1)
Date_Index (Date1)
X_Index (Text15, Text15, Float(8,4))

My complaint is I had to cancel the "alter table enable keys" after 49 hours which means with the 450 million rows it had previously loaded, the "alter table enable keys" was operating at less than 2500 keys/second, provided it had theoretically completed when I killed it. It could have gone on for another 49 hours, I'm not sure. But the efficiency of the code to rebuild indexes is definitely questionable. Why should it take over 8x longer to build the keys than to load the data? (49hrs/6hours=8 hours)

Suggested fix:
Look at ways of optimizing "Alter table enable keys" so rebuilding keys is faster for large tables. It needs to be at least 100x faster than it is now. (49hrs/100=30 minutes).

There is no way MySQL can handle tables with hundreds of millions of rows unless someone builds themselves a super server with 4-8 processors, and 8gb or more of RAM. even then it could take hours to rebuild the indexes.

The inefficient code used to rebuild indexes is producing a great deal of down time when maintenaning large tables. I would like to see if fixed.
[6 Mar 2005 18:52] Sergei Golubchik
You should avoid "Repair with keycache" - because as you noticed it's too slow for large tables.

If MySQL started with "Repair by sorting" and then switched to "Repair with keycache" it means the first repair method failed for some reason and MySQL falled back to the slow and safe one. In this case error log should contain an warning "Warning: Enabling keys got errno %d, retrying".

Knowing the error (most probably out of disk space) you should be able to fix it and ensure that MySQL always uses "repair by sorting"