Bug #37419 Table disappears with ALTER TABLE
Submitted: 16 Jun 2008 2:15 Modified: 17 Dec 2009 11:27
Reporter: Adam Dixon Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:mysql-5.1-telco-6.2 OS:Any
Assigned to: Martin Skold CPU Architecture:Any
Tags: 6.2.15

[16 Jun 2008 2:15] Adam Dixon
Description:
Running ALTER Tables on a cluster can cause the table to disappear along with a number of different types of errors.

(every now and then);
ERROR 1412 (HY000) at line 2: Table definition has changed, please retry transaction

I eventually managed to get;
ERROR 1296 (HY000) at line 2: Got error 4243 'Index not found' from NDBCLUSTER

I also got this error in the error log, but all the time the tables were still 'there' and usable.
080606 13:51:26 [Warning] NDB_SHARE: ./test/t2 already exists use_count=25. Moving away for safety, but possible memleak.

Customer has also seen beyond the Index Not Found error, a 'Table does not exist' error.

See How to Repeat.

How to repeat:
/* Table Creation */
DROP TABLE if exists t2;
CREATE TABLE t2
(
page_id int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (page_id)
)
ENGINE= ndbcluster;

/* Default Values in table */
insert into t2 (page_id) values (1);
insert into t2 (page_id) values (2);
insert into t2 (page_id) values (3);
insert into t2 (page_id) values (4);
insert into t2 (page_id) values (5);
insert into t2 (page_id) values (6);

Run Process a), b) and c) simultaneously

Process a) on api host 1:

flush tables;
alter table t2 add test_column int(11) not null default '1';
alter table t2 add unique key indx_page_def_test (page_id, test_column);
alter table t2 add test_column2 int(11) not null default '1';
alter table t2 drop key indx_page_def_test;
alter table t2 add unique key indx_page_def_test (page_id, test_column, test_column2);

flush tables;
alter table t2 drop test_column;
alter table t2 drop key indx_page_def_test;
alter table t2 add test_column int(11) not null default '1';
alter table t2 drop test_column2;
alter table t2 add unique key indx_page_def_test (page_id, test_column);
alter table t2 add test_column2 int(11) not null default '1';
alter table t2 drop key indx_page_def_test;
alter table t2 add unique key indx_page_def_test (page_id, test_column, test_column2);

flush tables;
alter table t2 drop test_column;
alter table t2 drop key indx_page_def_test;
alter table t2 add test_column int(11) not null default '1';
alter table t2 drop test_column2;
alter table t2 add unique key indx_page_def_test (page_id, test_column);
alter table t2 add test_column2 int(11) not null default '1';
alter table t2 drop key indx_page_def_test;
alter table t2 add unique key indx_page_def_test (page_id, test_column, test_column2);

flush tables;
alter table t2 drop test_column;
alter table t2 drop key indx_page_def_test;
alter table t2 add test_column int(11) not null default '1';
alter table t2 drop test_column2;
alter table t2 add unique key indx_page_def_test (page_id, test_column);
alter table t2 add test_column2 int(11) not null default '1';
alter table t2 drop key indx_page_def_test;
alter table t2 add unique key indx_page_def_test (page_id, test_column, test_column2);

flush tables;
alter table t2 drop test_column;
alter table t2 drop key indx_page_def_test;
alter table t2 add test_column int(11) not null default '1';
alter table t2 drop test_column2;
alter table t2 add unique key indx_page_def_test (page_id, test_column);
alter table t2 add test_column2 int(11) not null default '1';
alter table t2 drop key indx_page_def_test;
alter table t2 add unique key indx_page_def_test (page_id, test_column, test_column2);

flush tables;
alter table t2 drop test_column;
alter table t2 drop key indx_page_def_test;
alter table t2 add test_column int(11) not null default '1';
alter table t2 drop test_column2;
alter table t2 add unique key indx_page_def_test (page_id, test_column);
alter table t2 add test_column2 int(11) not null default '1';
alter table t2 drop key indx_page_def_test;
alter table t2 add unique key indx_page_def_test (page_id, test_column, test_column2);

Process b) on api host 1 (repeatedly)

select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;

Process c) on api host 2 (repeatedly)

select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;
select * from t2;

Suggested fix:
In addition to any bug, document expected or potential errors expected to receive.
[26 Aug 2008 4:52] Feelora Gee
hi,guys, how this bug going on?
[28 Aug 2008 7:43] Martin Skold
This is because of interaction with the autodiscover functionality.
Table must be locked for DDL operations in all connected mysqlds
while alter table operation is executing.
[28 Aug 2008 7:48] Martin Skold
Workaround is to set cluster in single user mode during alter table
as documentation specifies:
http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-limitations-multiple-nodes.html
http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-single-user-mode.html
[6 Nov 2008 13:32] Feelora Gee
This problem will reproduce in single user mode.
[14 Nov 2008 11:33] Bernd Ocklin
Feelora Gee,

The above test case should fail in single user mode for host 1 since it will block traffic on machines b) and c). Do you have a test case for us that demonstrates table disappearing in single user mode?
[16 Nov 2008 23:00] Adam Dixon
Going by how Tomas has described usage of this above, online alter operations should not need single user mode. The test case does include a non-online operation in drop column. So this is not possible to run in this case.

I could not repeat this while adding only indexes/columns with null default.
[12 Aug 2009 15:59] jon stritar
I'm experiencing this bug as well. How do people recommend updating schemas while maintaining availability? Is this just not possible in a high traffic cluster?
[20 Aug 2009 10:06] Sveta Smirnova
Bug #39033 was marked as duplicate of this one.
[8 Oct 2009 7:28] Bernd Ocklin
Should be fixed in latest 6.2./6.3 versions with global schema lock. Can you re-test?
[3 Dec 2009 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[3 Dec 2009 7:46] Sveta Smirnova
Adam,

please re-test this bug.
[7 Dec 2009 4:02] Adam Dixon
Using 7.0.9 the only error I can get is on a 'selecting' api node the incorrect information in file error, it is a temporary error.

This test includes non-online operations, as I stated above.
[7 Dec 2009 7:32] Jonas Oreland
Hi Adam,

can you be be a bit elaborate on the result of testing a newer version.
especially, is there still a bug, and if yes, it it same as $subject
if yes, no, what is the new bug...

/Jonas
[7 Dec 2009 23:54] Adam Dixon
Hello Jonas,
I do not think we have a bug here. The test case also includes non-online operations, as well as the test case no longer produces the numerous other errors seen in the original report.

So I do not think we have a bug here.
[17 Dec 2009 11:27] Martin Skold
Closed due to inability to reproduce anymore.