Bug #10948 Crash with ALTER/DROP table for NDB tables (HPUX and PPC32)
Submitted: 30 May 2005 1:16 Modified: 13 Jun 2005 17:46
Reporter: Stewart Smith Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S1 (Critical)
Version:4.1, 5.0, 5.1, 5.1-2325 tree OS:HP/UX (HPUX 32bit, PPC 32bit)
Assigned to: Stewart Smith CPU Architecture:Any

[30 May 2005 1:16] Stewart Smith
Description:
mysqld can segfault during ALTER/DROP table. This seems to be a race condition as it seems to be timing dependent and is currently only reproducable on my laptop.

How to repeat:
pasting the following into a mysql client will generate errors, and later a crash.

use test;
CREATE TABLE t1 (
  a INT NOT NULL,
  b INT NOT NULL
) ENGINE=ndbcluster;
CREATE TABLE t1 (   a INT NOT NULL,   b INT NOT NULL ) ENGINE=ndbcluster;
INSERT INTO t1 VALUES (9410,9412);
ALTER TABLE t1 ADD COLUMN c int not null;
SELECT * FROM t1;
DROP TABLE t1;
CREATE DATABASE mysqltest;
USE mysqltest;
CREATE TABLE t1 (
  a INT NOT NULL,
  b INT NOT NULL
) ENGINE=ndbcluster;
CREATE TABLE t1 (   a INT NOT NULL,   b INT NOT NULL ) ENGINE=ndbcluster;
RENAME TABLE t1 TO test.t1;
SHOW TABLES;
DROP DATABASE mysqltest;
USE test;
SHOW TABLES;
DROP TABLE t1;

Suggested fix:
don't crash.
[30 May 2005 1:17] Stewart Smith
Have observed multiple times.

Currently rebuilding latest pull to attempt to reproduce there
[30 May 2005 7:51] Stewart Smith
could reproduce crash at the first ALTER statement.

Appears to be the code in ha_ndbcluster::rename_table is not checking for the (currently unsupported) case of blob/no primary key tables and creating the event anyway. When we get an event, we then crash. i think.

I was unable to crash mysqld with this check added.

Am currently testing and will produce a sane looking patch.
[31 May 2005 3:02] Stewart Smith
This is possibly a duplicate of 9826 (which is a problem in the 5.0 tree)
[2 Jun 2005 16:16] Stewart Smith
Appears to be because of heap corruption rather than a race condition.
[3 Jun 2005 7:37] Stewart Smith
This was caused by heap corruption.

Thanks to a bit of time, valgrind and thinking, i've found the problem.

in storage/ndb/src/ndbapi/DictCache.cpp, we have:

 Ndb_local_table_info *
 Ndb_local_table_info::create(NdbTableImpl *table_impl, Uint32 sz)
 {
  Uint32 tot_size= sizeof(NdbTableImpl *) + ((sz+7) & ~7); // round to Uint64
  void *data= malloc(tot_size);
  if (data == 0)
    return 0;
  memset(data, 0, tot_size);
  new (data) Ndb_local_table_info(table_impl);
  return (Ndb_local_table_info *) data;
}

This (is meant to) allocate enough memory for the Ndb_local_table_info plus an arbitrary size for local info. In our case, we have (from ha_ndbcluster.cc):

struct Ndb_local_table_statistics {
  int no_uncommitted_rows_count;
  ulong last_count;
  ha_rows records;
};

But the tot_size calculation (in ::create) is incorrect. It asks for sizeof(NdbTableImpl *) and not  sizeof(Ndb_local_table_info). This is a difference of 8bytes (on 32bit machines) due to the definition of Ndb_local_table info:

class Ndb_local_table_info {
public:
  static Ndb_local_table_info *create(NdbTableImpl *table_impl, Uint32 sz=0);
  static void destroy(Ndb_local_table_info *);
  NdbTableImpl *m_table_impl;
  Uint64 m_local_data[1];
private:
  Ndb_local_table_info(NdbTableImpl *table_impl);
  ~Ndb_local_table_info();
};

However, since m_local_data is going to point to the local data (that's the extra sz allocated in ::create), you would think this would be okay - as the "real" size of Ndb_local_table_info is just sizeof(NdbTableImpl*).

On PPC at least, you're wrong.

This simple C program:
#include <stdio.h>

struct a {
void *a;
unsigned long long b[1];
};

int main()
{ printf("%u\n",sizeof(struct a)); }

Will output 16 on PPC and 12 on x86. It appears that on PPC we are getting things aligned to 64bits (possibly due to the 64bit sized member, i'm not sure).

See http://www-128.ibm.com/developerworks/power/library/pa-dalign/ for an explanation of PPC alignment issues.

so a malloc(sizeof(void*)+extra_bits) is wrong.
with a malloc(sizeof(struct a)+N) you wast sizeof(long long), but it's better than crashing
or, you can do malloc(sizeof(struct a)-sizeof(long long)+N)

The GCC flag -Wpadded would have generated a warning for this. It is possible that it could be a good idea to enable this warning.

Patch to come.
[3 Jun 2005 9:08] Stewart Smith
Problem also exists in 4.1 tree (and 5.0, 5.1). Quite possibly has been causing problems elsewhere (HPUX is notable, see BUG#9826).
[4 Jun 2005 8:14] Stewart Smith
Martin has informed me that Sparc is not affected. I don't think any other platforms would have this problem.
[4 Jun 2005 8:25] Stewart Smith
Reviewed by Martin and Tomas.
[7 Jun 2005 2:50] Stewart Smith
pushed to 4.1 and 5.0.
[13 Jun 2005 17:46] Jon Stephens
Please do not submit the same bug more than once. An existing
bug report already describes this very problem. Even if you feel
that your issue is somewhat different, the resolution is likely
to be the same. Because of this, we hope you add your comments
to the original bug instead.

Thank you for your interest in MySQL.

Additional info:

Duplicate of Bug#9826 (fixed/documented/closed).