Bug #45768 GCOV testing hits assert: NdbBlob::preCommit assert(theState == Active);
Submitted: 25 Jun 2009 19:44 Modified: 28 Oct 2009 17:09
Reporter: Jonathan Miller Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:mysql-5.1-telco-7.0 OS:Linux
Assigned to: Assigned Account CPU Architecture:Any
Tags: assert, core, GCOV

[25 Jun 2009 19:44] Jonathan Miller
Description:
While testing code coverage, the mtr test ndb.ndb_dd_alter hits and assert in the  NdbBlob::preCommit (this=0x13584820) at NdbBlob.cpp:2950

2950      assert(theState == Active);
(gdb) l
2945    {
2946      DBUG_ENTER("NdbBlob::preCommit");
2947      DBUG_PRINT("info", ("this=%p op=%p con=%p", this, theNdbOp, theNdbCon));
2948      if (theState == Invalid)
2949        DBUG_RETURN(-1);
2950      assert(theState == Active);
2951      assert(isKeyOp());
2952      if (isInsertOp() || isUpdateOp() || isWriteOp()) {
2953        if (theHeadInlineUpdateFlag) {
2954            // add an operation to update head+inline
2950      assert(theState == Active);
(gdb) l
2945    {
2946      DBUG_ENTER("NdbBlob::preCommit");
2947      DBUG_PRINT("info", ("this=%p op=%p con=%p", this, theNdbOp, theNdbCon));
2948      if (theState == Invalid)
2949        DBUG_RETURN(-1);
2950      assert(theState == Active);
2951      assert(isKeyOp());
2952      if (isInsertOp() || isUpdateOp() || isWriteOp()) {
2953        if (theHeadInlineUpdateFlag) {
2954            // add an operation to update head+inline

How to repeat:
1) Build using AMD-GCOV build script

2) ./mysql-test-run.pl --do-test=ndb_dd_alter

Suggested fix:
Should not assert
[25 Jun 2009 19:45] Jonathan Miller
Full output:

ndb.ndb_dd_alter                         [ fail ]
        Test ended at 2009-06-25 21:23:57

CURRENT_TEST: ndb.ndb_dd_alter
mysqltest: At line 223: query 'ALTER TABLE test.t1 ADD INDEX a2_i (a2), ADD INDEX a3_i (a3)' failed: 2013: Lost connection to MySQL server during query

The result from queries just before the failure was:
< snip >
1       2.2345  20000001        0       1       23457   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
2       3.2345  20000002        0       1       23458   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
3       4.2345  20000003        0       1       23459   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
4       5.2345  20000004        0       1       23460   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
5       6.2345  20000005        0       1       23461   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
6       7.2345  20000006        0       1       23462   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
7       8.2345  20000007        0       1       23463   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
8       9.2345  20000008        0       1       23464   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
9       10.2345 20000009        0       1       23465   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
10      11.2345 20000010        0       1       23466   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
11      12.2345 20000011        0       1       23467   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
12      13.2345 20000012        0       1       23468   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
13      14.2345 20000013        0       1       23469   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
14      15.2345 20000014        0       1       23470   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
15      16.2345 20000015        0       1       23471   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
16      17.2345 20000016        0       1       23472   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
17      18.2345 20000017        0       1       23473   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
18      19.2345 20000018        0       1       23474   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
19      20.2345 20000019        0       1       23475   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field
20      21.2345 20000020        0       1       23476   2006-01-01      07:04:00        1971-05-28 16:55:03    abc     abcdefg LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL       Text Field

More results from queries before failure can be found in /data0/ACRT_CLONES/builds/clone-mysql-5.1-telco-7.0-2009-06-19.7237/mysql-test/var/log/ndb_dd_alter.log

 - saving '/data0/ACRT_CLONES/builds/clone-mysql-5.1-telco-7.0-2009-06-19.7237/mysql-test/var/log/ndb.ndb_dd_alter/' to '/data0/ACRT_CLONES/builds/clone-mysql-5.1-telco-7.0-2009-06-19.7237/mysql-test/var/log/ndb.ndb_dd_alter/'
 - found 'core.16946' (0/3)

Trying 'dbx' to get a backtrace
[25 Jun 2009 19:48] Jonathan Miller
Back trace output from mtr

Attachment: test_bt.txt (text/plain), 21.27 KiB.

[20 Aug 2009 6:33] Jonas Oreland
see bug#41674
[25 Aug 2009 16:12] Frazer Clement
Reproduced same assertion.

Problem appears to be :
  1) Copying alter table exhausts space on Ndb when copying large blobs
     (Temp table is full message is seen in MySQLD error log)
  2) Error and Success path call to ha_enable_transaction() from copy_data_between_tables causes an attempt to commit the transaction.
  3) execute(Commit) path attempts to perform normal blob pre-commit steps, despite the error on the transaction.
  4) Blob pre-commit hits assertion failure due to unexpected state (Prepared rather than Active)

Now considering whether just NdbApi, or also ha_ndbcluster are at fault
[12 Oct 2009 16:13] Frazer Clement
Following appear to be duplicates : 
  Bug#45768
  Bug#41674
  Bug#34583
[16 Oct 2009 15:24] Frazer Clement
Proposed patch + testcases for 6.2

Attachment: bug45768.patch (text/x-patch), 8.14 KiB.

[16 Oct 2009 15:29] Frazer Clement
Patch is against 6.2

Patch requires fix to bug#48113 (Ndb error 1601 Out extents, tablespace full) to be applied first.

Bug#48040 (Ndb : TC trigger infinite loop in abort scenario) is exposed by the testcases here, but the testcase does not fail as a result of this testcase.
[23 Oct 2009 15:49] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/88004

3027 Frazer Clement	2009-10-23
      Bug#45768 : GCOV testing hits assert: NdbBlob::preCommit assert(theState == Active) 
      modified:
        mysql-test/suite/ndb/r/ndb_dd_alter.result
        mysql-test/suite/ndb/t/ndb_dd_alter.test
        storage/ndb/src/ndbapi/NdbBlob.cpp
        storage/ndb/test/ndbapi/testBlobs.cpp
        storage/ndb/test/run-test/daily-basic-tests.txt
[27 Oct 2009 10:45] Frazer Clement
Pushed to 
6.2.19
6.3.28
7.0.9
7.1.0
[28 Oct 2009 13:22] Jon Stephens
Documented bugfix in the NDB-6.2.19, 6.3.28, and 7.0.9 changelogs, as follows:

        When a copying ALTER TABLE operation exhausted the available
        space on a data node while copying large BLOB columns, this
        could lead to failure of the data node and a Table is full error
        on the SQL node which issued the ALTER TABLE statement. An
        example of such a statement would be an ALTER TABLE that changed
        an INT column to a BLOB column.

Closed.
[28 Oct 2009 17:09] Jon Stephens
This bug is a duplicate of BUG#34583.

Marked as Duplicate, tagged changelog entry with correct bug ID.
[29 Oct 2009 9:32] Jon Stephens
To account for BUG#48040 which was also found and fixed in the course of fixing this bug, updated changelog entry to read:

        When a copying operation exhausted the available
        space on a data node while copying large BLOB columns, this
        could lead to failure of the data node and a Table is full error
        on the SQL node which was executing the operation. Examples of 
        such operations could include an ALTER TABLE that changed an
        INT column to a BLOB column, or a bulk insert of BLOB data that 
        failed due to running out of space or to a duplicate key error.

Closed.