Bug #12118 cluster alter table data loss
Submitted: 22 Jul 2005 16:16 Modified: 25 Aug 2005 10:03
Reporter: Fabio Coatti Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S1 (Critical)
Version:4.1.13 OS:Linux (Linux (AMD64))
Assigned to: Jonas Oreland CPU Architecture:Any

[22 Jul 2005 16:16] Fabio Coatti
Description:
I've tried do execute alter table on mysql cluster, but I've got data losses: only data inserted after alter table is kept through shutdown 

system: two dual amd64, both with storage and api node, i386 machine with mgm node.

The system and configuration is the same of bug 11776

How to repeat:
description on what I've done:
1) created ndbcluster table
2) insterted data into it
3) shutdown from mgm node
4) restarted cluster
5) all fine, data is OK
7) issued "alter table" adding varchar types to a table; the change is visible to both mysql api nodes that I'm using (after a first failed select on the second node, I've made the alter table on the first api node). No one else was using the cluster.
8) inserted rows using both nodes, all OK
9) shutdown/restart
10) only data inserted after issuing the alter table is present, the rows inserted in step 2 is completely lost.

I've repeated the same procedure, using a innodb/myisam as temporary table:
7a) issued alter table converting my table to innodb (and myisam)
8a) issued alter table adding the field as before
9a)  issued alter table converting my table back to ndbcluster

but the behaviuor is the same is before,
10a) inserted rows using both nodes, all OK
11a) shutdown/restart
12a) only data inserted after issuing the alter table is present, the rows inserted in step 2 is completely lost.

So if this is not a bug I wander how I can alter a table...
[22 Jul 2005 17:27] Jonathan Miller
Did you put the cluster into single user mode as required for doing alter table?
[23 Jul 2005 23:19] Fabio Coatti
To be honest no, I' havent seen this as a requirement reading the documentation; during tests I reckon that api node 2 was turned off (probably) and I'm sure was not used. Anyway, I've just repeated the tests issuing single user mode command, but the result is exactly the same: only rows inserted after alter table has survived, prior data get lost.

If you need some more testing, just drop me a note with all steps desired, no problems.
[2 Aug 2005 9:15] Martin Skold
Transfer to Jonas since he as already started on solving this problem.
[3 Aug 2005 12:32] Martin Skold
A schema upgrade should always be acompanied with
a backup both before and after the upgrade to ensure 
data is never lost. This is not a fix to the problem, but 
ensures that no data is lost at least.

The problem occurs if no LCP has been done between
the ALTER TABLE and the system restart.
A workaround is to ensure one never restarts cluster until
after next LCP (specified with TimeBetweenLocalCheckpoints) 
after the ALTER TABLE is completed.
[4 Aug 2005 8:14] Fabio Coatti
Is there any way to force a LCP? I understand that on a cluster with a normal workload the LCP can occur quite often, but there are situation where one should be sure to have all data safe, and this issue can wipe out all data in a table, if something goes wrong at the wrong time. Maybe the alter table is issued in a off-load timeframe and the LCP is not occurring in, say, hours...
[18 Aug 2005 12:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/28448
[22 Aug 2005 18:51] Jonas Oreland
Pushed into 4.1.15 and 5.0.12
[25 Aug 2005 10:03] Jon Stephens
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Bugfix documented in 4.1.15 and 5.0.13 changelogs. Closed.