Bug #24143 Heavy file fragmentation with multiple ndbd on single fs
Submitted: 9 Nov 2006 14:04
Modified: 7 Dec 2006 4:47
Reporter: Stewart Smith
Status: Closed
Category: MySQL Cluster: Disk Data
Severity: S5 (Performance)
Version: 5.1
Assigned to: Stewart Smith
CPU Architecture: Any

[9 Nov 2006 14:04] Stewart Smith
Heavy fragmentation of NDB disk data files (log files and data files) occurs during mysql-test-run.pl and in normal use, at least when multiple data nodes share one file system. It is possibly not a problem when each data node is on a different file system, and probably not a problem with only one data node.

How to repeat:
./mysql-test-run.pl --do-test=ndb_dd_basic

Then run xfs_bmap on the files created to see how many extents each one has.
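
On a file system without xfs_bmap, the extent count can be approximated with the generic Linux FIEMAP ioctl. The following is only a sketch of that alternative, not what was used for the numbers in this report. The function name count_extents is made up for illustration, and it assumes a Linux kernel and a file system that supports FIEMAP.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, FIEMAP_* constants */

/* Hypothetical helper: returns the number of extents mapped for `path`,
 * or -1 on error (for example, if the file cannot be opened or the
 * file system does not support FIEMAP). */
long count_extents(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct fiemap fm;
    memset(&fm, 0, sizeof(fm));
    fm.fm_start = 0;
    fm.fm_length = FIEMAP_MAX_OFFSET;  /* cover the whole file */
    fm.fm_flags = FIEMAP_FLAG_SYNC;    /* flush delayed allocation first */
    fm.fm_extent_count = 0;            /* 0 = only count, return no records */

    int rc = ioctl(fd, FS_IOC_FIEMAP, &fm);
    close(fd);

    return rc < 0 ? -1 : (long)fm.fm_mapped_extents;
}
```

A heavily fragmented tablespace or log file shows up as a large return value; a well-allocated one should report only one or two extents.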

Suggested fix:
Use xfsctl(XFS_IOC_RESVSP64) (a wrapper around ioctl) to tell the file system to reserve an amount of space for the file. This searches the file system's free-extent btree, which is ordered by size, for the best fit, instead of using the standard allocation algorithm, which gets it horribly wrong in this case.

This will only fix the problem on XFS (on Linux or IRIX). Other file systems need to implement their own functionality for this.

posix_fallocate may do this (and we call it immediately afterwards), but in current glibc it does not, so you still end up with fragmentation.
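
A rough sketch of the suggested approach, not the committed patch. The function name reserve_file_space and the build macro HAVE_XFS_XFS_H are assumptions made for illustration; xfsctl(), platform_test_xfs_fd() and XFS_IOC_RESVSP64 come from the XFS development headers (xfs/xfs.h). Without those headers the code simply falls back to posix_fallocate.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#ifdef HAVE_XFS_XFS_H   /* hypothetical macro set when the XFS headers exist */
#include <xfs/xfs.h>    /* xfsctl(), platform_test_xfs_fd(), XFS_IOC_RESVSP64 */
#endif

/* Hypothetical helper: reserve `len` bytes for `fd`, starting at offset 0.
 * Returns 0 on success or an error number on failure (the
 * posix_fallocate convention). */
int reserve_file_space(int fd, off_t len)
{
#ifdef HAVE_XFS_XFS_H
    /* Only issue the XFS ioctl once we know the file really is on XFS. */
    if (platform_test_xfs_fd(fd)) {
        struct xfs_flock64 fl = {0};
        fl.l_whence = 0;   /* offsets are relative to the start of the file */
        fl.l_start = 0;
        fl.l_len = len;
        /* A failure here is not fatal: the reservation is an optimisation,
         * and posix_fallocate below still allocates the space. */
        (void)xfsctl(NULL, fd, XFS_IOC_RESVSP64, &fl);
    }
#endif
    /* Called immediately after the reservation, as described above.
     * This also extends the file size to `len`. */
    return posix_fallocate(fd, 0, len);
}
```

The reservation is issued before posix_fallocate so that, on XFS, the space is already allocated by the time posix_fallocate runs.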
[9 Nov 2006 14:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:


ChangeSet@1.2317, 2006-11-10 01:08:35+11:00, stewart@willster.(none) +2 -0
  BUG#24143 Heavy file fragmentation with multiple ndbd on single fs
  If we have the XFS headers (at build time) we can use XFS specific ioctls
  (once testing the file is on XFS) to better allocate space.
  This dramatically improves performance of mysql-test-run cases as well:
  number of extents for ndb_dd_basic tablespaces and log files
  BEFORE this patch: 57, 13, 212, 95, 17, 113 
  WITH this patch  :  ALL 1 or 2 extents
  (results are consistent over multiple runs. BEFORE always has several files
  with lots of extents).
  As for timing of test run:
  ndb_dd_basic                   [ pass ]         107727
  real    3m2.683s
  user    0m1.360s
  sys     0m1.192s
  ndb_dd_basic                   [ pass ]          70060
  real    2m30.822s
  user    0m1.220s
  sys     0m1.404s
  (results are again consistent over various runs)
  similar for other tests (BEFORE and AFTER):
  ndb_dd_alter                   [ pass ]         245360
  ndb_dd_alter                   [ pass ]         211632
[13 Nov 2006 1:18] Stewart Smith
Approved by Jonas; pushed to 5.1-ndb.
[7 Dec 2006 4:47] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at


Documented bugfix in 5.1.14 changelog.