Bug #71146 Manual does not explain when REPAIR or OPTIMIZE TABLE uses repair by sorting
Submitted: 14 Dec 2013 11:34 Modified: 16 Dec 2013 5:26
Reporter: Valeriy Kravchuk Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:any OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Tags: Optimize, Repair

[14 Dec 2013 11:34] Valeriy Kravchuk
Description:
OPTIMIZE (or REPAIR) table may work differently for MyISAM table while repairing it. It may do "Repair by sorting" (this is fast but uses more disk space in tmpdir and may not fix some problems) or "Repair with keycache" (that is slower but may fix more problems). Manual pages:

http://dev.mysql.com/doc/refman/5.5/en/repair-table.html
http://dev.mysql.com/doc/refman/5.5/en/optimize-table.html

do not clearly explain how the decision is made (it is somewhat explained for myisamchk though), it only gives some hints like:

"If you use the EXTENDED option, MySQL creates the index row by row instead of creating one index at a time with sorting. This type of repair is like that done by myisamchk --safe-recover. "

How to repeat:
Try to predict, using the manual, how OPTIMIZE or REPAIR TABLE will work based on the manual and current values of server variables.

Suggested fix:
I see the following function is used (it is applied for each key) to determine if sorting can be used:

static my_bool mi_too_big_key_for_sort(MI_KEYDEF *key, ha_rows rows)
{
  uint key_maxlength=key->maxlength;
  if (key->flag & HA_FULLTEXT)
  {
    uint ft_max_word_len_for_sort=FT_MAX_WORD_LEN_FOR_SORT*
                                  key->seg->charset->mbmaxlen;
    key_maxlength+=ft_max_word_len_for_sort-HA_FT_MAXBYTELEN;
  }
  return (key->flag & HA_SPATIAL) ||
          (key->flag & (HA_BINARY_PACK_KEY | HA_VAR_LENGTH_KEY | HA_FULLTEXT) &&
          ((ulonglong) rows * key_maxlength > myisam_max_temp_length));
}

I see a lot of details to document here, from the way key->maxlength is determined to where myisam_max_temp_length comes from etc.
[16 Dec 2013 5:26] MySQL Verification Team
Hello Valeriy,

Thank you for the bug report.

Thanks,
Umesh