Bug #27339 cluster stuck in start phase 100 after cluster --initial restart
Submitted: 21 Mar 2007 15:43
Modified: 28 Feb 2008 11:15
Reporter: Oli Sennhauser
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 5.1.16
OS: Any
Assigned to:
CPU Architecture: Any

[21 Mar 2007 15:43] Oli Sennhauser
Description:
My 4-node cluster (2 machines, 2 node groups) does not get past start phase 100 after a rolling restart (following a memory resize and ndbd --initial).

How to repeat:
1. Created a 2-node cluster (on 2 machines)
2. Inserted some data (8 rows in a test table)
3. Took a backup (BACKUP)
4. Shut down the cluster
5. Added 2 additional nodes (now 4 nodes on 2 machines)
6. Started all 4 nodes with --initial
7. Added the following schema (tables and indexes): http://www.shinguz.ch/MySQL/FoodMart.tar.bz2
8. Loaded the data

--> Then I got an error because DataMemory was too small:

2007-03-21 15:26:40 [MgmSrvr] INFO     -- Node 10: Data usage is 99%(1022 32K pages of total 1024)
2007-03-21 15:26:40 [MgmSrvr] INFO     -- Node 10: Index usage is 15%(315 8K pages of total 2080)
2007-03-21 15:26:40 [MgmSrvr] INFO     -- Node 11: Data usage is 99%(1022 32K pages of total 1024)
2007-03-21 15:26:40 [MgmSrvr] INFO     -- Node 11: Index usage is 15%(315 8K pages of total 2080)
2007-03-21 15:26:40 [MgmSrvr] INFO     -- Node 12: Data usage is 100%(1024 32K pages of total 1024)
2007-03-21 15:26:40 [MgmSrvr] INFO     -- Node 12: Index usage is 15%(318 8K pages of total 2080)
2007-03-21 15:26:40 [MgmSrvr] INFO     -- Node 13: Data usage is 100%(1024 32K pages of total 1024)
2007-03-21 15:26:40 [MgmSrvr] INFO     -- Node 13: Index usage is 15%(318 8K pages of total 2080)
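For reference, the page counts in these log lines can be converted to sizes: 1024 pages of 32K is 32 MiB of data memory per node, and 2080 pages of 8K is 16.25 MiB of index memory. The following is a minimal Python sketch that parses such lines; the function name parse_usage and the regular expression are illustrative assumptions based only on the line format shown above, not part of any MySQL Cluster tool.

```python
import re

# Assumed line format, taken from the log excerpt in this report:
#   "... Node 10: Data usage is 99%(1022 32K pages of total 1024)"
USAGE_RE = re.compile(
    r"Node (?P<node>\d+): (?P<kind>Data|Index) usage is (?P<pct>\d+)%"
    r"\((?P<used>\d+) (?P<page_kb>\d+)K pages of total (?P<total>\d+)\)"
)


def parse_usage(line):
    """Return usage figures for one log line, or None if it does not match."""
    m = USAGE_RE.search(line)
    if m is None:
        return None
    used = int(m["used"])
    total = int(m["total"])
    page_kb = int(m["page_kb"])
    return {
        "node": int(m["node"]),
        "kind": m["kind"],
        "used_mib": used * page_kb / 1024,    # pages * KiB per page -> MiB
        "total_mib": total * page_kb / 1024,
        "free_pages": total - used,
    }


# Two lines copied from the report above.
LOG = """\
2007-03-21 15:26:40 [MgmSrvr] INFO     -- Node 10: Data usage is 99%(1022 32K pages of total 1024)
2007-03-21 15:26:40 [MgmSrvr] INFO     -- Node 10: Index usage is 15%(315 8K pages of total 2080)
"""

for log_line in LOG.splitlines():
    info = parse_usage(log_line)
    if info is not None:
        print(info["node"], info["kind"], info["used_mib"], "of", info["total_mib"], "MiB")
```

Data memory is exhausted on nodes 12 and 13 (0 free pages) while index memory is only about 15% used, so only the data memory setting needs to grow.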

9. Changed the parameter (memory resize)
10. Stopped the mgmt node
11. Started the mgmt node
12. Restarted node 1 (id=10)
13. Restarted node 2 (id=11)
14. Restarted node 3 (id=12)

--> did not pass start phase 100 after 5 minutes

15. Tried to stop node 12

--> Got the following error:

ndb_mgm> 12 stop
Shutdown failed.
*  2002: Stop failed
*        Operation not allowed while nodes are starting or stopping.

16. Killed the ndbd process and its angel process with kill -9
17. Started it again with ndbd -c master --initial

--> hangs in start phase 100...
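During a rolling restart like the one in steps 12 to 14, it helps to confirm that every data node has reached the started state before restarting the next one. Below is a minimal Python sketch that scans status text for nodes that are not yet started. The status line format ("Node N: starting (Last completed phase P)") is an assumption about what ndb_mgm -e "ALL STATUS" prints, and both the helper name nodes_not_started and the sample text are hypothetical, not output captured from this cluster.

```python
import re

# Assumed status line format (not taken from this report), e.g.:
#   "Node 12: starting (Last completed phase 100)"
#   "Node 10: started"
STATUS_RE = re.compile(
    r"^Node (?P<node>\d+): (?P<state>\w+)"
    r"(?: \(Last completed phase (?P<phase>\d+)\))?",
    re.MULTILINE,
)


def nodes_not_started(status_text):
    """Map node id -> last completed start phase (or None) for every
    node whose state is anything other than 'started'."""
    pending = {}
    for m in STATUS_RE.finditer(status_text):
        if m["state"] != "started":
            pending[int(m["node"])] = int(m["phase"]) if m["phase"] else None
    return pending


# Hypothetical sample modelled on the situation described in this report.
SAMPLE = """\
Node 10: started
Node 11: started
Node 12: starting (Last completed phase 100)
Node 13: started
"""

print(nodes_not_started(SAMPLE))
```

A restart script could poll such a check and refuse to move on to the next node while the returned mapping is non-empty.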

Seems to be similar to bug #19645: http://bugs.mysql.com/bug.php?id=19645

Suggested fix:
no idea!
[21 Mar 2007 15:59] Oli Sennhauser
After a complete cluster restart (shutdown) everything went fine...
[28 Jan 2008 11:15] Valeriy Kravchuk
Is this problem repeatable with 5.1.22?
[29 Feb 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".