Bug #13316 Create Index on Large Table crashes Cluster Storage Node
Submitted: 19 Sep 2005 9:05
Modified: 24 Sep 2005 3:30
Reporter: Stewart Burnett
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 4.1.14
OS: Various
Assigned to: Jonas Oreland
CPU Architecture: Any

[19 Sep 2005 9:05] Stewart Burnett
Description:
After creating a table with more than 100 columns, adding an additional index crashes a Cluster storage node. Each repeated attempt crashes the other storage node.

This behaviour has been seen on:

IBM X345 (i386) SLES9.1 MySQL Max 4.1.9 & 4.1.14
IBM P710 (Power PC) SLES9.1 MySQL Max 4.1.11
IBM P630 (Power PC) SLES9.1 MySQL Max 4.1.14

Storage node trace logs and related files are available.

The config.ini (from the X345 server) is:

[NDBD DEFAULT]
NoOfReplicas=2

[MYSQLD DEFAULT]
[NDB_MGMD DEFAULT]
[TCP DEFAULT]

# Management Server
[NDB_MGMD]
HostName=10.0.0.10
DataDir=/usr/local/mysql/cluster

# Storage Engines
[NDBD]
HostName=10.0.0.10
DataDir=/usr/local/mysql/cluster

[NDBD]
HostName=10.0.0.20
DataDir=/usr/local/mysql/cluster

[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]

The other servers use a separate management (mgm) server. The P630 is a 2x8 node cluster.

How to repeat:
Create a large table:

create table test(
ai bigint auto_increment,
c001 int(11) not null,
c002 int(11) not null,
c003 int(11) not null,
c004 int(11) not null,
c005 int(11) not null,
c006 int(11) not null,
c007 int(11) not null,
c008 int(11) not null,
c009 int(11) not null,
c010 int(11) not null,
c011 int(11) not null,
c012 int(11) not null,
c013 int(11) not null,
c014 int(11) not null,
c015 int(11) not null,
c016 int(11) not null,
c017 int(11) not null,
c018 int(11) not null,
c019 int(11) not null,
c020 int(11) not null,
c021 int(11) not null,
c022 int(11) not null,
c023 int(11) not null,
c024 int(11) not null,
c025 int(11) not null,
c026 int(11) not null,
c027 int(11) not null,
c028 int(11) not null,
c029 int(11) not null,
c030 int(11) not null,
c031 int(11) not null,
c032 int(11) not null,
c033 int(11) not null,
c034 int(11) not null,
c035 int(11) not null,
c036 int(11) not null,
c037 int(11) not null,
c038 int(11) not null,
c039 int(11) not null,
c040 int(11) not null,
c041 int(11) not null,
c042 int(11) not null,
c043 int(11) not null,
c044 int(11) not null,
c045 int(11) not null,
c046 int(11) not null,
c047 int(11) not null,
c048 int(11) not null,
c049 int(11) not null,
c050 int(11) not null,
c051 int(11) not null,
c052 int(11) not null,
c053 int(11) not null,
c054 int(11) not null,
c055 int(11) not null,
c056 int(11) not null,
c057 int(11) not null,
c058 int(11) not null,
c059 int(11) not null,
c060 int(11) not null,
c061 int(11) not null,
c062 int(11) not null,
c063 int(11) not null,
c064 int(11) not null,
c065 int(11) not null,
c066 int(11) not null,
c067 int(11) not null,
c068 int(11) not null,
c069 int(11) not null,
c070 int(11) not null,
c071 int(11) not null,
c072 int(11) not null,
c073 int(11) not null,
c074 int(11) not null,
c075 int(11) not null,
c076 int(11) not null,
c077 int(11) not null,
c078 int(11) not null,
c079 int(11) not null,
c080 int(11) not null,
c081 int(11) not null,
c082 int(11) not null,
c083 int(11) not null,
c084 int(11) not null,
c085 int(11) not null,
c086 int(11) not null,
c087 int(11) not null,
c088 int(11) not null,
c089 int(11) not null,
c090 int(11) not null,
c091 int(11) not null,
c092 int(11) not null,
c093 int(11) not null,
c094 int(11) not null,
c095 int(11) not null,
c096 int(11) not null,
c097 int(11) not null,
c098 int(11) not null,
c099 int(11) not null,
c100 int(11) not null,
c101 int(11) not null,
c102 int(11) not null,
c103 int(11) not null,
c104 int(11) not null,
c105 int(11) not null,
c106 int(11) not null,
c107 int(11) not null,
c108 int(11) not null,
c109 int(11) not null,
primary key (ai),
unique key tx1 (c002, c003, c004, c005));

Then add an index:

create index tx2 
on test (
    c010,
    c011,
    c012,
    c013);

At this point one of the storage nodes will crash. Restart the storage node, drop the table, re-create it, and then create the index again; this time the other storage node will fail.

Occasionally both storage nodes fail, but I cannot reliably reproduce this.

Suggested fix:
There is a workaround: if the index is created as part of the initial table creation, there does not appear to be a problem. That is not so handy with 500k rows already in the table; if there is a workaround for a populated table, I'd be most grateful.
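
A minimal sketch of that workaround, for illustration only. The column list is cut down to the eight indexed columns rather than the full 109, and the table name test_inline is hypothetical. The second index is declared with a KEY clause inside CREATE TABLE, so no separate CREATE INDEX is issued afterwards:

```sql
-- Shortened illustration: the real table has columns c001 through c109.
-- test_inline is a hypothetical name, not part of the original report.
create table test_inline(
ai bigint auto_increment,
c002 int(11) not null,
c003 int(11) not null,
c004 int(11) not null,
c005 int(11) not null,
c010 int(11) not null,
c011 int(11) not null,
c012 int(11) not null,
c013 int(11) not null,
primary key (ai),
unique key tx1 (c002, c003, c004, c005),
-- tx2 defined up front instead of via a later "create index tx2 on ..."
key tx2 (c010, c011, c012, c013));
```

This only covers a table that is still empty; it does not address adding the index to a table that already holds data.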
[19 Sep 2005 10:02] Hartmut Holzgraefe
Verified. The node error log showed the following entry:

Date/Time: Monday 19 September 2005 - 11:44:59
Type of error: error
Message: Internal program error (failed ndbrequire)
Fault ID: 2341
Problem data: SimulatedBlock.cpp
Object of reference: DBDICT (Line: 346) 0x0000000c
ProgramName: libexec/ndbd
ProcessID: 3676
TraceFile: /usr/local/mysql-4.1.14/cluster/ndb_2_trace.log.1
Version 4.1.14
***EOM***
[22 Sep 2005 6:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/30186
[22 Sep 2005 7:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/30191
[22 Sep 2005 7:21] Jonas Oreland
Pushed into 4.1.15 and 5.0.14
[24 Sep 2005 3:30] Paul DuBois
Noted in 4.1.15, 5.0.14 changelogs.