Bug #17325 Node restart and 4 replicas fail (array index out of range)
Submitted: 11 Feb 2006 15:35
Modified: 14 Feb 2006 10:20
Reporter: Jonathan Miller
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.1.7
OS: Linux (Linux 32 Bit OS)
Assigned to: Tomas Ulin
CPU Architecture: Any

[11 Feb 2006 15:35] Jonathan Miller
Description:
I had set up the following:
* 5 hosts
* 4 DN
* 1 MGM
* 4 Replicas

I started the cluster and began loading a 1 million row database. Shortly into the load I closed the network port on one of the DNs (DN ID 2). The DN went down as expected and DN ID 3 took over. The load program continued. I opened the port again and restarted DN ID 2.

After DN ID 2 recovered I shut the port down again. DN ID 2 went down as expected because it could not contact the arbitrator. DN ID 3 quickly failed as well, with the following message:

Node 3: Forced node shutdown completed. Initiated by signal 0. Caused by error 2304: 'Array index out of range(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

The other 2 data nodes stayed up and I was still able to access cluster data.

I restarted the 2 failed DNs and then tried to recreate the database, at which point I found that I could access the data but could not drop the database:

mysql> use TPCB
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+----------------+
| Tables_in_TPCB |
+----------------+
| account        |
| branch         |
| history        |
| teller         |
| trans          |
+----------------+
5 rows in set (0.00 sec)

mysql> select * from account;
+-------+------+---------+---------------+
| aid   | bid  | balance | filler        |
+-------+------+---------+---------------+
|  2803 |  281 |    0.00 | Going fishing |
|   965 |   97 |    0.00 | Going fishing |
|  4332 |  434 |    0.00 | Going fishing |
|  1848 |  185 |    0.00 | Going fishing |
|  6286 |  629 |    0.00 | Going fishing |
|  6678 |  668 |    0.00 | Going fishing |
|  7519 |  752 |    0.00 | Going fishing |
|  7929 |  793 |    0.00 | Going fishing |
|  7940 |  794 |    0.00 | Going fishing |
|  7941 |  795 |    0.00 | Going fishing |
.
.
.
+-------+------+---------+---------------+
18222 rows in set (0.48 sec)

mysql> DROP DATABASE TPCB;
ERROR 1051 (42S02): Unknown table 'branch,teller,trans,history,account'
mysql>

Config.ini:

[DB DEFAULT]
NoOfReplicas: 4
IndexMemory: 500M
DataMemory: 1300M
BackupMemory: 64M
MaxNoOfLocalOperations: 300000
MaxNoOfTables: 200
StopOnError: 1
MaxNoOfConcurrentScans: 100
DataDir: /space/run
#DiskPageBufferMemory: 500M
DiskPageBufferMemory: 4M
MaxNoOfConcurrentOperations: 300000
NoOfFragmentLogFiles: 50
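
For reference, the replica setting above determines how the data nodes are grouped: in NDB the number of node groups is the number of data nodes divided by NoOfReplicas. The following is a minimal sketch of that arithmetic, assuming the config fragment is read with Python's configparser; the function name node_groups and the shortened CONFIG_TEXT are illustrative only, and the data node count (4) comes from the setup list in this report, not from the fragment itself.

```python
import configparser

# Shortened copy of the [DB DEFAULT] fragment from this report (illustrative).
CONFIG_TEXT = """\
[DB DEFAULT]
NoOfReplicas: 4
IndexMemory: 500M
DataMemory: 1300M
"""


def node_groups(config_text, data_nodes):
    """Return the number of node groups for a given data node count.

    Hypothetical helper: node groups = data nodes / NoOfReplicas.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    replicas = parser.getint("DB DEFAULT", "NoOfReplicas")
    if data_nodes % replicas != 0:
        raise ValueError(
            "data node count must be a multiple of NoOfReplicas"
        )
    return data_nodes // replicas


# 4 data nodes with 4 replicas form a single node group.
print(node_groups(CONFIG_TEXT, 4))
```

With a single node group every data node holds a full copy of the data, which would be consistent with the cluster data staying accessible while two of the four nodes were down.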

Node ID3: Error Log:
Time: Saturday 11 February 2006 - 15:53:48
Status: Temporary error, restart node
Message: Array index out of range (Internal error, programming error or missing error message, please report a bug)
Error: 2304
Error data: dbtc/DbtcMain.cpp
Error object: DBTC (Line: 6498) 0x0000000e
Program: /home/ndbdev/jmiller/builds/libexec/ndbd
Pid: 15155
Trace: /space/run/ndb_3_trace.log.1
Version: Version 5.1.7 (beta)
***EOM***

Node ID2: Error Log:
Time: Saturday 11 February 2006 - 15:52:35
Status: Temporary error, restart node
Message: Array index out of range (Internal error, programming error or missing error message, please report a bug)
Error: 2304
Error data: dbtc/DbtcMain.cpp
Error object: DBTC (Line: 6498) 0x0000000e
Program: /home/ndbdev/jmiller/builds/libexec/ndbd
Pid: 4464
Trace: /space/run/ndb_2_trace.log.2
Version: Version 5.1.7 (beta)
***EOM***
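
Both entries point at the same place (DBTC, dbtc/DbtcMain.cpp, line 6498). When comparing error logs from several nodes, a small script can pull that out. This is only a sketch, assuming each entry is a run of "Key: value" lines terminated by ***EOM*** as in the logs above; parse_entries and crash_site are hypothetical helper names, not part of any NDB tool.

```python
import re

# One entry copied from the Node ID3 error log in this report.
SAMPLE = """\
Time: Saturday 11 February 2006 - 15:53:48
Status: Temporary error, restart node
Message: Array index out of range (Internal error, programming error or missing error message, please report a bug)
Error: 2304
Error data: dbtc/DbtcMain.cpp
Error object: DBTC (Line: 6498) 0x0000000e
Program: /home/ndbdev/jmiller/builds/libexec/ndbd
Pid: 15155
Trace: /space/run/ndb_3_trace.log.1
Version: Version 5.1.7 (beta)
***EOM***
"""


def parse_entries(text):
    """Split an error log into a list of dicts, one per ***EOM*** entry."""
    entries = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line == "***EOM***":
            if current:
                entries.append(current)
            current = {}
        elif ": " in line:
            # Split on the first ": " only; values may contain colons.
            key, value = line.split(": ", 1)
            current[key] = value
    return entries


def crash_site(entry):
    """Return (source file, block, line number), or None if not parseable."""
    match = re.match(r"(\w+) \(Line: (\d+)\)", entry.get("Error object", ""))
    if match is None:
        return None
    return entry.get("Error data"), match.group(1), int(match.group(2))


for entry in parse_entries(SAMPLE):
    print(entry["Error"], crash_site(entry))
```

Running crash_site over the entries from both nodes and comparing the results is a quick way to confirm that they failed at the same source line.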

How to repeat:
Set up the configuration as above. Start a load on the cluster and stop the network port on one DN.

I have not yet tried to repeat this.
[13 Feb 2006 13:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/2517
[13 Feb 2006 14:21] Tomas Ulin
Pushed into 5.0.19 and 5.1.7.
The bug exists in 4.1 as well, but it will not be fixed there.
[14 Feb 2006 10:20] Jon Stephens
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Documented fix in 5.0.19 and 5.1.7 changelogs. Bug closed.