Bug #45646 ndb_restore accesses transporter without needed locks
Submitted: 22 Jun 2009 8:01 Modified: 24 Jun 2009 10:53
Reporter: Jonas Oreland
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S3 (Non-critical)
Version: mysql-5.1-telco-6.* OS: Any
Assigned to: Jonas Oreland CPU Architecture: Any

[22 Jun 2009 8:01] Jonas Oreland
Description:
The ndb_restore tool when reporting progress
uses signals that go to the cluster log.

However, when sending these signals, the transporter is accessed without locks,
which means that corrupted messages can be sent if ClusterMgr is sending
heartbeats at the same time.

This is a fairly theoretical case (but the new asserts for ATC do fire)

How to repeat:
read code

Suggested fix:
use locks
[22 Jun 2009 8:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76782

2948 Jonas Oreland	2009-06-22
      ndb - bug#45646
        Add a "has_lock" argument to send_event_report
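
A minimal sketch of the "has_lock" idea, for readers who have not seen the patch. It is not the NDB code: the Transporter class, write_signal, count_corrupted, run_demo and the signature of send_event_report shown here are all hypothetical stand-ins. The only detail taken from this report is that send_event_report gained a "has_lock" argument. The assumption is that a caller which already holds the transporter lock passes true, and any other caller (such as the ndb_restore progress report) passes false so that the function takes the lock itself.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the shared transporter (not the NDB class).
struct Transporter {
    std::mutex mutex;          // guards `wire`
    std::vector<char> wire;    // bytes "sent" so far

    // Writes one signal word by word. The caller must hold `mutex`;
    // otherwise words from two senders could interleave.
    void write_signal(char tag, std::size_t words) {
        for (std::size_t i = 0; i < words; ++i) wire.push_back(tag);
        wire.push_back('\n');  // end-of-signal marker
    }
};

// Assumed shape of the fix: `has_lock` says whether the caller already
// holds the transporter lock. If not, it is taken here.
void send_event_report(Transporter& t, bool has_lock, char tag,
                       std::size_t words) {
    assert(words > 0);
    std::unique_lock<std::mutex> guard(t.mutex, std::defer_lock);
    if (!has_lock) guard.lock();
    t.write_signal(tag, words);
}   // guard unlocks on scope exit only if it took the lock

// Counts signals whose words come from more than one sender.
std::size_t count_corrupted(const Transporter& t) {
    std::size_t bad = 0;
    char first = 0;
    bool mixed = false;
    for (char c : t.wire) {
        if (c == '\n') {
            if (mixed) ++bad;
            first = 0;
            mixed = false;
        } else if (first == 0) {
            first = c;
        } else if (c != first) {
            mixed = true;
        }
    }
    return bad;
}

// One thread plays the heartbeat sender (already holding the lock),
// the other plays ndb_restore reporting progress (not holding it).
std::size_t run_demo(std::size_t rounds) {
    Transporter t;
    std::thread heartbeat([&t, rounds] {
        for (std::size_t i = 0; i < rounds; ++i) {
            std::lock_guard<std::mutex> held(t.mutex);
            send_event_report(t, /*has_lock=*/true, 'H', 4);
        }
    });
    std::thread restore([&t, rounds] {
        for (std::size_t i = 0; i < rounds; ++i)
            send_event_report(t, /*has_lock=*/false, 'R', 8);
    });
    heartbeat.join();
    restore.join();
    return count_corrupted(t);
}
```

With the lock taken on every path, run_demo should report zero corrupted signals. Passing has_lock=true from a thread that does not actually hold the lock would reintroduce the race described above.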
[22 Jun 2009 9:13] Bugs System
Pushed into 5.1.35-ndb-7.0.7 (revid:jonas@mysql.com-20090622090741-pwen4ub432by62mr) (version source revid:jonas@mysql.com-20090622090741-pwen4ub432by62mr) (merge vers: 5.1.35-ndb-7.0.7) (pib:11)
[24 Jun 2009 10:53] Jon Stephens
Documented bugfix in the NDB-7.0.7 changelog as follows:

        The signals used by ndb_restore to send progress information
        about backups to the cluster log accessed the cluster
        transporter without using any locks. Because of this, it was
        theoretically possible that these signals could be interfered
        with by heartbeat signals if both were sent at the same time,
        causing the ndb_restore messages to be corrupted.
[28 Aug 2009 9:00] Jon Stephens
Also documented in NDB-6.3.26 changelog per Geert mail.