Description:
This is similar to bug #779, but for version 4.0.15.
I am following the advice of the manual when bulk loading a large table (I have 108 2 GB text files to load) by turning off index builds before LOAD DATA INFILE. The load ends normally (124 GB and 24+ hours later). The mysamchk build of the indices fails:
myisamchk: warning: Can't change size of indexfile, error: 22
myisamchk: error: 22 for record at pos 121201294124
myISAM-table 'mytable' is not fixed because of errors
It processes the table to the end (according to the incrementing row counter) and then immediately after fails with Error 22. Trying to "Fix" the table following the advice of the error message just repeats the Error 22 problem after many more hours of processing.
The command I'm issuing is:
myisamchk -O sort_buffer_size=512M -O key_buffer_size=768M -O read_buffer_size=512M -O write_buffer_size=512M -rq d:\mysql\data\mydb\mytable
I have 4GB of memory on the machine.
If I run the LOAD DATA INFILE command without shutting down the index builds, the load completes successfully, but takes 4+ DAYS. The resulting file sizes are:
mytable.myd 124,650,416,636
mytable.myi 53,546,828,800
Once loaded, the table works beautifully for me, but I'd like to decrease the load times.
How to repeat:
This problem only occurs with VERY large tables and indexes. I have a very wide (100+ columns, 15+ columns indexed) denormalized table with 140,000,000 rows. The data is imported from another application, which creates 100+ 2GB tab delimited files.
After I shut down the index builds with myisamchk, following the directions in the manual, the database load completes successfully.
Issuing the mysiamchk command to rebuild the indices results in Error 22 documented above. Note: This process works fine for me with smaller tables - it is only with very large tables that I encounter it. I have had the problem quite consistently on two different Windows 2003 Server networks, with different table structures (though always large volume, wide tables with lots of indices).
Suggested fix:
I would love to hear of one. The only thing that works for me is to load the tables while building the indices concurrently, but this is very time-consuming.