Bug #17077 MyISAM: controlling data files fragmentation
Submitted: 2 Feb 2006 17:47 Modified: 20 Jun 2008 11:39
Reporter: David Lombardi Email Updates:
Status: Verified Impact on me:
Category:MySQL Server: MyISAM storage engine Severity:S4 (Feature request)
Version:5.x OS:Any
Assigned to: CPU Architecture:Any

[2 Feb 2006 17:47] David Lombardi
Extend the table options for MyISAM tables so that it should be possible to define the file size on creation and on file expansion.
It should be possible to tailor such setting for both table files (MyD) and index files (MyI).
As an example, ORACLE table/index extents management clauses sport such facility, although they are used to control table fragmentation inside a tablespace/datafile, in MyISAM they could/should be used to directly control file sizes.

Also, MIN_ROWS and AVG_ROW_LENGTH could be used to optionally determine the file size on table creation.

For environments experiencing several INSERTs occurring on different tables, it is likely to have very fragmented table files (both MyD and MyI files), causing a performance drop when SELECT/UPDATE are issued on those tables.
Since MyISAM is a very fast engine, it is reasonable to employ it when real-time like performance are needed, however without a mean for controlling/limiting the data files fragmentation on disk, its speed could be seriously compromized.

David Lombardi

How to repeat:

Suggested fix:
[20 Jun 2008 6:19] MySQL Verification Team
thank you for a reasonable feature request.  workaround is to run OPTIMIZE TABLE
on those tables that are highly fragmented.  for more detailed information and testcase, please see the bug I filed: bug #36708
[20 Jun 2008 11:39] David Lombardi
Hello Shane,
I checked your bug report (#36708) and - although I did not try the test scripts - I think you are referring to a data/record fragmentation, for which OPTIMIZE RECORD may provide an answer.

However, my post referred to the filesystem level fragmentation. That is, the data can be laid out contiguously within the MYD file, but still be fragmented with respect to the physical location on the disk.

In our case, this is exactly what happens: we structured the database to avoid any logical fragmentation, but in doing that we incurred in a very critical files fragmentation.

Actually, both fragmentations are performance critical and MyISAM engine should provide some means to control them. My post just focused on the filesystem fragmentation because we were able to design around the data fragmentation and avoid it altogether, but in general applications can't be this lucky.

Again, the rationale for providing this feature is still the same: MyISAM is well suited for performance critical applications and fragmentation seriously impair the DB performance.

Now, controlling file fragmentation is relatively simple: file pre-allocation usually works well on dedicated DB servers.

Regarding data fragmentation, I checked the article http://forge.mysql.com/wiki/MySQL_Internals_MyISAM#MyISAM_Dynamic_Data_File_Layout, and - with much surprise - I realized that the whole design is devoted to disk space optimization sacrificing performance.
So this might be not simple at all, in particular because it MyISAM dynamic record management design is going the opposite way to keep the records together.