Bug #10133 Emergency fast shutdown options for power loss
Submitted: 25 Apr 2005 3:59 Modified: 26 Apr 2005 21:43
Reporter: James Day
Status: Verified
Category: MySQL Server: General Severity: S4 (Feature request)
Version: 4,5 OS: Linux (Linux, all)
Assigned to: CPU Architecture: Any

[25 Apr 2005 3:59] James Day
Description:
When mains/utility power is lost and there is limited or uncertain battery and standby power available, it is highly desirable to have very urgent shutdown commands to avoid data loss or tablespace corruption.

How to repeat:
Encounter loss of power and consider what you'd like to have been able to do to avoid corrupt database tables and many hours or days of rebuild time.

Suggested fix:
I'd like to see two capabilities:

1. Start an urgent shutdown now.
2. Shut down immediately (within one second or less, ideally).

1. Start an urgent shutdown now.

This alerts the database server that it is soon going to lose power. It should go read-only, terminate all non-super queries, reject new non-super queries and immediately flush all data records to disk, starting with the most fragile engines (MyISAM before InnoDB, for example). If time is available, it should complete index writes and a normal shutdown. It should also tell mysqld_safe that the server must not be restarted even if it appears to have crashed - I've seen mysqld_safe restart a server which crashed during a deliberate mysqladmin shutdown operation.
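
Until something like option 1 exists, part of it can be approximated from outside the server with statements that are already available. The sketch below only builds the ordered list of statements; it does not connect to anything. The function name `urgent_shutdown_plan`, the `ENGINE_PRIORITY` table and the `privileged_users` parameter are hypothetical, and the "fragility" ordering is an assumption taken from the MyISAM-before-InnoDB example above. It does not cover the mysqld_safe part of the request, and FLUSH TABLES is at best a rough stand-in for "flush all data records to disk".

```python
# Hypothetical helper: builds the SQL an operator script might send to
# approximate option 1 using statements that already exist in the server.

# Assumption: lower number = more fragile engine = flushed first.
ENGINE_PRIORITY = {"MyISAM": 0, "InnoDB": 1}


def urgent_shutdown_plan(processlist, tables, privileged_users=("root",)):
    """Return an ordered list of SQL statements.

    processlist: iterable of (thread_id, user) pairs, e.g. taken from
                 SHOW PROCESSLIST.
    tables:      iterable of (table_name, engine) pairs.
    """
    # 1. Stop accepting writes from accounts without the SUPER privilege.
    plan = ["SET GLOBAL read_only = 1"]

    # 2. Terminate connections that do not belong to privileged users.
    for thread_id, user in processlist:
        if user not in privileged_users:
            plan.append("KILL %d" % int(thread_id))

    # 3. Flush tables, most fragile engine first. sorted() is stable, so
    #    tables of the same engine keep their input order; engines not
    #    listed in ENGINE_PRIORITY go last.
    ordered = sorted(
        tables, key=lambda t: ENGINE_PRIORITY.get(t[1], len(ENGINE_PRIORITY))
    )
    for name, _engine in ordered:
        # Naive quoting: assumes plain table names without backticks.
        plan.append("FLUSH TABLES `%s`" % name)

    return plan
```

For example, `urgent_shutdown_plan([(5, "app"), (6, "root")], [("orders", "InnoDB"), ("log", "MyISAM")])` puts the read-only switch first, kills only thread 5, and flushes `log` before `orders`.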

2. Shut down immediately.

The server must exit now, doing only the most desperate of tasks to avoid data loss. Corrupted indexes and InnoDB rollbacks are acceptable consequences and should be chosen instead of taking more time. It should also tell mysqld_safe that the server must not be restarted even if it appears to have crashed or is killed with kill -9. This option is likely to be followed by kill -9 within seconds at most, because that can be less bad than loss of power with a running database server.
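
The "request, then kill -9 within seconds" pattern described above can be sketched with Python's standard `subprocess` module. This is only an illustration of the escalation: the function name `stop_with_deadline` and its parameters are hypothetical, and `SIGTERM` is a placeholder for whatever mechanism would trigger option 2, since no such mechanism exists yet. It also does nothing about mysqld_safe restarting the server afterwards.

```python
import signal
import subprocess


def stop_with_deadline(proc, grace_seconds=2.0, first_signal=signal.SIGTERM):
    """Ask a child process to stop, then force-kill it after a deadline.

    proc: a subprocess.Popen object for the process to stop.
    Returns the exit code of the process (negative signal number on POSIX
    if it was terminated by a signal).
    """
    # Placeholder for "shut down immediately": send the first signal.
    proc.send_signal(first_signal)
    try:
        # Give the process a short, bounded amount of time to exit.
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # Deadline passed: Popen.kill() sends SIGKILL on POSIX,
        # the equivalent of kill -9.
        proc.kill()
        return proc.wait()
```

The grace period is the only tunable here; in the EPO case it would have to fit inside the few seconds of UPS time that can be negotiated.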

Background

Power loss happens. With an uninterruptible power supply there can be some notice, but the timeline for getting a somewhat clean shutdown can be limited by the requirement in some places for an emergency power off (EPO) shutdown to protect human life and the data center from electrical faults. EPO normally requires immediate cutting of all power, including all UPS and generator power. It is possible to negotiate a few seconds of UPS uptime on some servers to reduce the great negative effects of instant power loss, but the response must be very fast. Option 2 provides for this case.

If the power loss is not an EPO shutdown the safest route can be to immediately shut down one database slave, to maximise the chance that at least one recent copy of the database is available for an emergency restore approach after power is restored. Option 1 provides for this.

In practice, I'd use option 1 on one server, option 2 on another and try to keep service going with the rest until low battery power alarms tripped 1 and then 2 for the remaining servers.
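
The staged use of the two options described above can be written down as a small decision function. This is a sketch only: the name `plan_power_loss`, the three action labels and the two-stage `alarm_level` are assumptions made for illustration, with alarm level 1 tripping option 1 and alarm level 2 tripping option 2 on the servers that were still running.

```python
URGENT = "option 1 (urgent shutdown)"
IMMEDIATE = "option 2 (immediate shutdown)"
KEEP_SERVING = "keep serving"


def plan_power_loss(servers, alarm_level=0):
    """Map each server name to an action after mains power is lost.

    servers:     list of server names; the first gets option 1 and the
                 second gets option 2 as soon as power is lost.
    alarm_level: 0 = no low-battery alarm yet,
                 1 = first alarm (remaining servers start option 1),
                 2 = second alarm (remaining servers use option 2).
    """
    if alarm_level not in (0, 1, 2):
        raise ValueError("alarm_level must be 0, 1 or 2")

    actions = {}
    for index, name in enumerate(servers):
        if index == 0:
            actions[name] = URGENT        # preserve one recent copy
        elif index == 1:
            actions[name] = IMMEDIATE
        elif alarm_level == 0:
            actions[name] = KEEP_SERVING  # run on battery while it lasts
        elif alarm_level == 1:
            actions[name] = URGENT
        else:
            actions[name] = IMMEDIATE
    return actions
```

With three servers and no alarm yet, the third server keeps serving; it moves to option 1 and then option 2 as the alarm level rises.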

There aren't any good options here in the EPO case. There's going to be trouble, and the best that can be done is to try to avoid major database corruption.

Making InnoDB and the other engines more resistant to corruption is also good, but these shutdown options can help to give them a better chance of survival.
[25 Apr 2005 8:16] James Day
Changed severity from S3 to S4.
[26 Apr 2005 21:43] Guilhem Bichot
Put in discussion with Mr Day (some of these features are in our TODO).