Bug #56524 node crash with online alter table, traffic and adding columns
Submitted: 3 Sep 2010 5:26
Modified: 3 Sep 2010 16:25
Reporter: Jonas Oreland
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: mysql-5.1-telco-6.3
OS: Any
Assigned to: Jonas Oreland
CPU Architecture: Any

[3 Sep 2010 5:26] Jonas Oreland
Description:
For each DML operation, the data node allocates a bitmask of changed columns (used by various triggers).

This bitmask is allocated in 32-bit words.

If an online ALTER TABLE ... ADD COLUMN adds columns so that the number of words
in the bitmask increases, and a transaction has performed DML prior to the DDL,
and after the DDL either
1) performs another DML, or
2) commits with replication turned on,

the data node would crash, as it did not consider that the bitmask of the "before-image" was smaller than the current bitmask.
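
The word-count arithmetic explains why the repro further down uses exactly 32 columns before the ALTER. The following is a minimal sketch, assuming one bit per column packed into 32-bit words; the name `changemask_words` is hypothetical and is not taken from the NDB source.

```cpp
#include <cstdint>

// Hypothetical helper (not NDB code): number of 32-bit words needed to hold
// one "changed" bit per column, rounding up to a whole word.
constexpr std::uint32_t changemask_words(std::uint32_t no_of_columns) {
    return (no_of_columns + 31) / 32;
}

// 32 columns still fit in a single word ...
static_assert(changemask_words(32) == 1, "32 columns fit in one word");
// ... but adding a 33rd column pushes the mask to two words.
static_assert(changemask_words(33) == 2, "a 33rd column needs a second word");
```

Under this assumption, a mask allocated while the table had 32 columns is one word long, while code that sizes the mask from the table after the ALTER expects two.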

---

Note: This is a regression introduced in 6.3.11, 3 Apr 2008 by http://bugs.mysql.com/bug.php?id=35208 :(

How to repeat:
mysql_1> create table t1 (
  col0 int, col1 int, col2 int, col3 int, 
  col4 int, col5 int, col6 int, col7 int,
  col8 int, col9 int, col10 int, col11 int,
  col12 int, col13 int, col14 int, col15 int, 
  col16 int, col17 int, col18 int, col19 int, 
  col20 int, col21 int, col22 int, col23 int,
  col24 int, col25 int, col26 int, col27 int,
  col28 int, col29 int, col30 int, col31 int,
  primary key(col0)) engine = ndb;

mysql_2> begin;
insert into t1 (col0) values (1);

mysql_1> alter online table t1 add column col32 int COLUMN_FORMAT DYNAMIC;

mysql_2> commit; // crash

Suggested fix:
Use the stored size of the bitmask, instead of the current number of columns in the table,
when accessing the bitmask for the before-image.
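
A rough illustration of the idea, not the actual patch: `StoredMask`, `mark_changed` and `column_changed` are hypothetical names, and treating a column beyond the stored size as "not changed" is an assumption made only for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a before-image change mask. It carries its own
// size (in 32-bit words), fixed when the mask was allocated.
struct StoredMask {
    std::vector<std::uint32_t> words;
};

// Sets the bit for a column the mask already covers.
inline void mark_changed(StoredMask& mask, std::uint32_t column) {
    const std::uint32_t word = column / 32;
    assert(word < mask.words.size()); // writers must stay inside the stored size
    mask.words[word] |= (1u << (column % 32));
}

// Bounds the lookup by the stored size of the mask, not by the table's
// current column count, so a column added later is never read out of range.
inline bool column_changed(const StoredMask& mask, std::uint32_t column) {
    const std::uint32_t word = column / 32;
    if (word >= mask.words.size()) {
        // Assumption for this sketch: the column did not exist when the
        // mask was allocated, so it is reported as unchanged.
        return false;
    }
    return ((mask.words[word] >> (column % 32)) & 1u) != 0;
}
```

With a one-word mask allocated before the ALTER, `column_changed(mask, 32)` returns false instead of reading a second word that was never allocated.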
[3 Sep 2010 5:36] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/117482

3280 Jonas Oreland	2010-09-03
      ndb - bug#56524 - sizeof(CopyTuple::Changemask) might not be equal to cols, due to online add column
[3 Sep 2010 6:06] Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.47-ndb-6.3.38 (revid:jonas@mysql.com-20100903053017-2rp1nn0d3ydj2nob) (version source revid:jonas@mysql.com-20100903053017-2rp1nn0d3ydj2nob) (merge vers: 5.1.47-ndb-6.3.38) (pib:21)
[3 Sep 2010 6:06] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.47-ndb-7.0.19 (revid:jonas@mysql.com-20100903053551-49rz7b9smjj9nq1u) (version source revid:jonas@mysql.com-20100903053551-49rz7b9smjj9nq1u) (merge vers: 5.1.47-ndb-7.0.19) (pib:21)
[3 Sep 2010 6:51] Jonas Oreland
pushed to 6.3.38, 7.0.19 and 7.1.8
[3 Sep 2010 16:25] Jon Stephens
Documented bugfix in the NDB-6.3.38, 7.0.19, and 7.1.8 changelogs, as follows:

        An online ALTER TABLE ADD COLUMN operation that changed the table
        schema such that the number of 32-bit words used for the bitmask
        allocated to each DML operation increased during a transaction in
        DML which was performed prior to DDL which was followed by either 
        another DML operation or--if using replication--a commit, led to 
        data node failure.

        This was because the data node did not take into account that the 
        bitmask for the before-image was smaller than the current bitmask, 
        which caused the node to crash.

Closed.