Bug #50723 InnoDB CHECK TABLE fatal semaphore wait timeout possibly too short for big table
Submitted: 29 Jan 2010 10:01
Modified: 15 May 2012 21:43
Reporter: Oli Sennhauser
Status: Closed
Category: MySQL Server: InnoDB storage engine
Severity: S1 (Critical)
Version: 5.0, 5.1
OS: Any
Assigned to: Marko Mäkelä
CPU Architecture: Any
Tags: big table, CHECK TABLE, fatal semaphore wait timeout, innodb
Triage: Triaged: D1 (Critical) / R3 (Medium) / E4 (High)

[29 Jan 2010 10:01] Oli Sennhauser
Description:
In file row/row0mysql.c, function row_check_table_for_mysql, the fatal semaphore wait timeout is enlarged for the CHECK TABLE command by a hard-coded 2 hours:

/* Enlarge the fatal lock wait timeout during CHECK TABLE. */
mutex_enter(&kernel_mutex);
srv_fatal_semaphore_wait_threshold += 7200; /* 2 hours */
mutex_exit(&kernel_mutex);

By my calculations, InnoDB tables of roughly 200 to 350 Gbyte or more can no longer be checked with this command, depending on scan throughput:

7200 s * 50 Mbyte/s = 351 Gbyte
7200 s * 30 Mbyte/s = 211 Gbyte

We already have customers coming close to these values...
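
The estimate above can be reproduced with a small helper. This is only a sketch: max_checkable_gbyte is a hypothetical name, not an InnoDB function, and it assumes that CHECK TABLE scans the table sequentially at a constant throughput and that 1 Gbyte = 1024 Mbyte.

```c
/* Hypothetical helper, not part of InnoDB.
   Returns the largest table size (in Gbyte, 1 Gbyte = 1024 Mbyte) that a
   sequential scan at mbyte_per_sec can cover within timeout_sec seconds.
   Assumes constant throughput for the whole scan. */
double max_checkable_gbyte(double timeout_sec, double mbyte_per_sec)
{
    double scanned_mbyte = timeout_sec * mbyte_per_sec;

    return scanned_mbyte / 1024.0;
}
```

Calling max_checkable_gbyte(7200, 50) and max_checkable_gbyte(7200, 30) gives values close to the two figures quoted above. Other timeouts or throughputs can be plugged in to see where the limit would move.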

How to repeat:
Theoretical only. I did not try to reproduce it.

Suggested fix:
We have 2 suggestions:

Better one:
<axel> oli: I'd rather request that check table will *not* block any important mutexes, so there is no need to enlarge srv_fatal_semaphore_wait_threshold at all

Worse one:
Make srv_fatal_semaphore_wait_threshold configurable and do NOT hard code it there.
[22 Oct 2010 15:46] Shane Bester
Perhaps the fix for bug #55716 will help in this case? That is, bigger tables could be checked before hitting the long semaphore wait.
[8 Mar 2011 2:39] James Day
Bug #56855 is similar, watchdog thread timeout during OPTIMIZE TABLE.
[15 May 2012 21:43] John Russell
Added to changelog for 5.1.64, 5.5.25, 5.6.6: 

The CHECK TABLE statement could fail for a large InnoDB table due to
a timeout value of 2 hours. For typical storage devices, the issue
could occur for tables that exceeded approximately 200 or 350 GB,
depending on I/O speed. The fix relaxes the locking performed on the
table being checked, which makes the timeout less likely. It also
makes InnoDB recognize the syntax CHECK TABLE QUICK, which avoids the
possibility of the timeout entirely.