Bug #21047 DN restart during data inserts causes inserts to fail 'distribution changed'
Submitted: 13 Jul 2006 20:57
Modified: 2 Aug 2006 0:15
Reporter: Jonathan Miller
Status: Not a Bug
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.1.12
OS: Linux (Linux 32 Bit OS)
Assigned to: Jon Stephens
CPU Architecture: Any

[13 Jul 2006 20:57] Jonathan Miller
Description:
Restarting a data node (DN) during data insert operations causes the inserts to fail with:
ERROR 1297 (HY000): Got temporary error 1204 'Temporary failure, distribution changed' from NDBCLUSTER

This can cause data to be missing.

How to repeat:
Start a cluster with 2 replicas and 2 data nodes.

mysql> create database foo;
mysql> use foo;
mysql> create table foo1 (c1 int auto_increment key) engine=ndb;
mysql> delimiter |
mysql> CREATE PROCEDURE dorepeat(p1 INT) BEGIN SET @x = 0; REPEAT INSERT INTO foo1 VALUES (NULL); SET @x = @x + 1; UNTIL @x > p1 END REPEAT; END|
mysql> delimiter ;
mysql> CALL dorepeat(100000);

ndb_mgm -c host:port -e "2 restart"

ERROR 1297 (HY000): Got temporary error 1204 'Temporary failure, distribution changed' from NDBCLUSTER
mysql> select count(*) from foo1;
+----------+
| count(*) |
+----------+
|    79681 |
+----------+
1 row in set (0.00 sec)

Suggested fix:
Restarting data nodes should not affect ongoing processes.
[13 Jul 2006 21:45] Jonathan Miller
I have reproduced this several times.
[1 Aug 2006 14:37] Jonas Oreland
1204 is expected behavior during a node restart
  (and can happen for insert/update/delete).
  This needs to be handled by the application.

(In your case, check for errors after each insert.)
[2 Aug 2006 0:15] Jon Stephens
This is a long-documented limitation. See http://dev.mysql.com/doc/refman/5.0/en/mysql-cluster-limitations.html, under Transaction Handling:

[quote]
# Node Start, Stop, or Restart:: Starting, stopping, or restarting a node may give rise to temporary errors causing some transactions to fail. These include the following cases:

    * When first starting a node, it is possible that you may see Error 1204 Temporary failure, distribution changed and similar temporary errors.
    * The stopping or failure of any data node can result in a number of different node failure errors. (However, there should be no aborted transactions when performing a planned shutdown of the cluster.)

In either of these cases, any errors that are generated must be handled within the application. This should be done by retrying the transaction.
[/quote]
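
As an illustration of the "retry the transaction" advice quoted above, here is a minimal Python sketch of an application-side retry loop. It is not taken from the MySQL documentation. `DatabaseError`, `run_with_retry` and `make_flaky_insert` are hypothetical names invented for this example, and the fake insert stands in for a real driver call so that the sketch runs without a cluster. The only values taken from this report are error 1297 and its message text.

```python
import time

# MySQL-level error code reported in this bug for temporary NDB failures
# (it wraps NDB error 1204, 'Temporary failure, distribution changed').
TEMPORARY_MYSQL_ERRNO = 1297


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL error number."""

    def __init__(self, errno, msg):
        super().__init__(msg)
        self.errno = errno


def run_with_retry(transaction, max_attempts=5, delay=0.0):
    """Call transaction() until it succeeds or max_attempts is used up.

    Only errors with errno 1297 are retried; any other error propagates
    immediately. Returns a tuple (result, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction(), attempt
        except DatabaseError as exc:
            # Give up on non-temporary errors or once the last attempt has failed.
            if exc.errno != TEMPORARY_MYSQL_ERRNO or attempt == max_attempts:
                raise
            time.sleep(delay)  # optional pause while the restarting node rejoins


def make_flaky_insert(failures):
    """Build a fake insert that fails `failures` times with error 1297, then succeeds."""
    state = {"remaining": failures, "rows": 0}

    def insert():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise DatabaseError(
                TEMPORARY_MYSQL_ERRNO,
                "Got temporary error 1204 'Temporary failure, "
                "distribution changed' from NDBCLUSTER",
            )
        state["rows"] += 1
        return state["rows"]

    return insert


if __name__ == "__main__":
    rows, attempts = run_with_retry(make_flaky_insert(2))
    print("inserted rows:", rows, "attempts:", attempts)
```

In a real application the callable passed to `run_with_retry` would execute the whole transaction through whichever client library is in use, and the error check would be adapted to that library's exception type. The stored procedure in this report aborts on the first error, so a client-side loop of this kind would have to issue the inserts itself rather than call `dorepeat`.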
[24 Nov 2006 10:02] Roland Bouman
A restart and an update give me this:

mysql> update ndbtemp set name = sha1(name) limit 40000;
ERROR 1297 (HY000): Got temporary error 499 'Scan take over error, restart scan transaction' from NDB

I am assuming this is the same or similar limitation?