Bug #56285 ABORT BACKUP crashes the cluster
Submitted: 26 Aug 2010 10:39
Modified: 12 Oct 2010 15:39
Reporter: Valenti Jove
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 7.1.5, 7.0, 7.1 -bzr
OS: Linux
Assigned to: Jonas Oreland
CPU Architecture: Any

[26 Aug 2010 10:39] Valenti Jove
Description:
If you start a native cluster backup and abort it immediately after it is launched, the whole cluster goes down. If you wait a couple of minutes before aborting, the backup is aborted without problems.

My configuration is a cluster with 6 data nodes, 2 MySQL servers, and 2 management servers. The MySQL version is 7.1.5, although this also happened with 7.1.4b.

How to repeat:
Start a native backup using: "START BACKUP id NOWAIT"
Just after starting the backup, do: "ABORT BACKUP id"
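
The two steps above can also be scripted so that the abort follows the start with no manual delay. The following is only a minimal sketch in Python, assuming the `ndb_mgm` client is on the PATH; the helper names (`mgm_command`, `repro_commands`, `run_repro`) and the backup ID are hypothetical and not part of the original report. It builds the client invocations and, if asked, runs them back to back. Going by the later comments in this report, the crash is expected only when the data nodes run ndbmtd.

```python
import shlex
import subprocess

NDB_MGM = "ndb_mgm"  # assumption: the management client is on the PATH


def mgm_command(statement, connect_string=None):
    """Build the argv that runs one statement in the ndb_mgm client."""
    argv = [NDB_MGM]
    if connect_string:
        # Optional: point the client at a specific management server.
        argv.append("--ndb-connectstring=" + connect_string)
    argv += ["-e", statement]
    return argv


def repro_commands(backup_id, connect_string=None):
    """Return the two commands from 'How to repeat', in order."""
    return [
        mgm_command("START BACKUP %d NOWAIT" % backup_id, connect_string),
        mgm_command("ABORT BACKUP %d" % backup_id, connect_string),
    ]


def run_repro(backup_id, connect_string=None):
    """Run both statements back to back. Use on a test cluster only."""
    for argv in repro_commands(backup_id, connect_string):
        # Echo each command before running it, shell-quoted for readability.
        print("+ " + " ".join(shlex.quote(arg) for arg in argv))
        subprocess.run(argv, check=False)
```

Calling `repro_commands(1)` only builds the command lines, which is useful for checking them before anything is executed; `run_repro(1)` actually invokes the client and should only be pointed at a cluster that can be allowed to go down.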

Suggested fix:
My guess is that there is a period after the backup starts during which aborting it is not safe. Don't allow the backup to be aborted until it is safe to do so.
[26 Aug 2010 10:42] Valenti Jove
Log messages.

Attachment: abort.log (application/octet-stream, text), 14.31 KiB.

[26 Aug 2010 11:12] Jonas Oreland
ndb_error_reporter
[27 Aug 2010 19:42] Sveta Smirnova
Thank you for the report.

Verified almost as described.

Test case for MTR will be attached.
[27 Aug 2010 19:42] Sveta Smirnova
test case

Attachment: ndb_bug56285.test (application/octet-stream, text), 2.75 KiB.

[12 Oct 2010 14:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/120566

3848 Jonas Oreland	2010-10-12
      ndb - bug#56285 - send ABORT_BACKUP_ORD to correct block (iff using ndbmtd)
[12 Oct 2010 14:15] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.47-ndb-7.0.20 (revid:jonas@mysql.com-20101012141145-q8j8dsx26vd9m9sq) (version source revid:jonas@mysql.com-20101012141145-q8j8dsx26vd9m9sq) (merge vers: 5.1.47-ndb-7.0.20) (pib:21)
[12 Oct 2010 14:17] Jonas Oreland
pushed to 7.0.20 and 7.1.9

DOCS: abort backup was "missed" when introducing ndbmtd
[12 Oct 2010 15:39] Jon Stephens
Documented bugfix in the NDB-7.0.20 and 7.1.9 changelogs as follows:

      Aborting a native backup in the ndb_mgm client using ABORT 
      BACKUP did not work correctly when using ndbmtd, in some cases 
      leading to a crash of the cluster.

Closed.