Bug #48861 | Restoring backups aborts the ndbd's leaving processes 'hanging' | | |
---|---|---|---|
Submitted: | 18 Nov 2009 10:48 | Modified: | 6 Dec 2009 10:08 |
Reporter: | Geert Vanderkelen | Status: | Closed |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S3 (Non-critical) |
Version: | mysql-5.1.39-ndb-6.3.28b | OS: | Linux |
Assigned to: | Jonas Oreland | CPU Architecture: | Any |
Tags: | Backup, crash, ndbd, restore | | |
[18 Nov 2009 10:48]
Geert Vanderkelen
[18 Nov 2009 11:04]
Geert Vanderkelen
Actually happens just with a restore (obfuscating table name):
_____________________________________________________
Processing data in table: ******/def/NDB$BLOB_5_4(6) fragment 0
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 266: Time-out in NDB, probably caused by deadlock
Temporary error: 4010: Node failure caused abort of transaction
Unknown: 4009: Cluster Failure
Cannot start transaction

Verified using MySQL Cluster 6.3.28 (on Linux, debug build)
[18 Nov 2009 15:01]
Geert Vanderkelen
Same backup restores fine (except for some temporary redo buffer errors) in MySQL Cluster 7.0.9b.
[26 Nov 2009 12:10]
Geert Vanderkelen
This is an ndb_restore parallelism problem. With -p 1 the backup restores just fine; with -p 64 the same restore fails again.
Workaround: use "-p 1" when restoring.
[30 Nov 2009 9:22]
Jonas Oreland
proposed patch
Attachment: bug48861.patch (application/octet-stream, text), 4.42 KiB.
[30 Nov 2009 10:52]
Geert Vanderkelen
Applying the patch to 6.3.28 does indeed fix the problem.
[30 Nov 2009 11:07]
Jonas Oreland
Docs: When performing tasks generating lots of IO (such as using ndb_restore) an internal memory buffer could overflow, causing signal 6. The patch removes the internal buffer totally, since it's useless.
[30 Nov 2009 11:11]
Jonas Oreland
to be pushed to 6.2.19, 6.3.29 and 7.0.10
[30 Nov 2009 11:15]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/92054
3046 Jonas Oreland 2009-11-30
ndb - bug#48861 - remove memory channels internal storage (which can overflow) and link the request in a linked list instead
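The commit message above describes the fix as dropping the memory channel's internal storage and linking requests in a linked list instead. As a rough illustration of why that removes the overflow, the following C++ sketch contrasts a fixed-capacity ring of pointers with an intrusive linked queue. It is not the NDB code: `Request`, `FixedChannel`, and `LinkedChannel` are hypothetical names, locking between threads is omitted, and returning `false` on a full ring is an assumption made for the demonstration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical request type; the only requirement is an embedded link pointer.
struct Request {
    int id = 0;
    Request* next = nullptr;  // intrusive link: queue storage lives in the request itself
};

// Fixed-capacity ring of request pointers, shown for contrast.
// A burst of more than Capacity pending requests cannot be stored.
template <std::size_t Capacity>
class FixedChannel {
    static_assert(Capacity > 0, "capacity must be positive");
public:
    // Returns false when the ring is full (the overflow condition).
    bool write(Request* r) {
        assert(r != nullptr);
        if (count_ == Capacity) return false;
        slots_[(head_ + count_) % Capacity] = r;
        ++count_;
        return true;
    }
    // Removes and returns the oldest request, or nullptr when empty.
    Request* read() {
        if (count_ == 0) return nullptr;
        Request* r = slots_[head_];
        head_ = (head_ + 1) % Capacity;
        --count_;
        return r;
    }
private:
    Request* slots_[Capacity] = {};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};

// FIFO that chains requests through Request::next, so it has no capacity limit.
class LinkedChannel {
public:
    // Appends a request; there is no internal buffer that could fill up.
    void write(Request* r) {
        assert(r != nullptr);
        r->next = nullptr;
        if (tail_ != nullptr) tail_->next = r;
        else head_ = r;
        tail_ = r;
        ++count_;
    }
    // Removes and returns the oldest request, or nullptr when empty.
    Request* read() {
        if (head_ == nullptr) return nullptr;
        Request* r = head_;
        head_ = r->next;
        if (head_ == nullptr) tail_ = nullptr;
        r->next = nullptr;
        --count_;
        return r;
    }
    std::size_t size() const { return count_; }
private:
    Request* head_ = nullptr;
    Request* tail_ = nullptr;
    std::size_t count_ = 0;
};
```

With a capacity of 4, a burst of 8 writes to `FixedChannel` leaves 4 requests unstored, while `LinkedChannel` accepts all 8 because each request carries its own link.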
[30 Nov 2009 12:21]
Bugs System
Pushed into 5.1.39-ndb-6.3.29 (revid:jonas@mysql.com-20091130113057-vvfogdxcst814cjn) (version source revid:jonas@mysql.com-20091130113057-vvfogdxcst814cjn) (merge vers: 5.1.39-ndb-6.3.29) (pib:13)
[30 Nov 2009 12:21]
Bugs System
Pushed into 5.1.39-ndb-7.0.10 (revid:jonas@mysql.com-20091130115355-vmqycis77g5pd0yt) (version source revid:jonas@mysql.com-20091130115355-vmqycis77g5pd0yt) (merge vers: 5.1.39-ndb-7.0.10) (pib:13)
[30 Nov 2009 12:22]
Bugs System
Pushed into 5.1.39-ndb-7.1.0 (revid:jonas@mysql.com-20091130121550-lltariazkvytcjox) (version source revid:jonas@mysql.com-20091130121550-lltariazkvytcjox) (merge vers: 5.1.39-ndb-7.1.0) (pib:13)
[1 Dec 2009 13:02]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/92273
3167 Martin Skold 2009-12-01 [merge]
Merge
modified:
storage/ndb/src/common/debugger/EventLogger.cpp
storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp
storage/ndb/src/kernel/blocks/ndbfs/AsyncIoThread.hpp
storage/ndb/src/kernel/blocks/ndbfs/MemoryChannel.hpp
storage/ndb/src/kernel/blocks/pgman.cpp
storage/ndb/src/kernel/blocks/pgman.hpp
storage/ndb/src/mgmsrv/MgmtSrvr.cpp
storage/ndb/src/ndbapi/NdbOperationDefine.cpp
storage/ndb/src/ndbapi/NdbOperationSearch.cpp
storage/ndb/test/ndbapi/testBlobs.cpp
storage/ndb/test/run-test/daily-basic-tests.txt
storage/ndb/test/run-test/daily-devel-tests.txt
[1 Dec 2009 13:33]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/92279
3244 Martin Skold 2009-12-01 [merge]
Merge
modified:
storage/ndb/src/common/debugger/EventLogger.cpp
storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp
storage/ndb/src/kernel/blocks/ndbfs/AsyncIoThread.hpp
storage/ndb/src/kernel/blocks/ndbfs/MemoryChannel.hpp
storage/ndb/src/kernel/blocks/pgman.cpp
storage/ndb/src/kernel/blocks/pgman.hpp
storage/ndb/src/mgmsrv/MgmtSrvr.cpp
storage/ndb/src/ndbapi/NdbOperationDefine.cpp
storage/ndb/src/ndbapi/NdbOperationSearch.cpp
storage/ndb/test/ndbapi/testBlobs.cpp
storage/ndb/test/run-test/daily-basic-tests.txt
storage/ndb/test/run-test/daily-devel-tests.txt
[1 Dec 2009 14:02]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/92287
3170 Martin Skold 2009-12-01 [merge]
Merge
modified:
storage/ndb/src/common/debugger/EventLogger.cpp
storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp
storage/ndb/src/kernel/blocks/ndbfs/AsyncFile.hpp
storage/ndb/src/kernel/blocks/ndbfs/MemoryChannel.hpp
storage/ndb/src/kernel/blocks/pgman.cpp
storage/ndb/src/kernel/blocks/pgman.hpp
storage/ndb/src/mgmsrv/MgmtSrvr.cpp
storage/ndb/src/ndbapi/NdbOperationDefine.cpp
storage/ndb/src/ndbapi/NdbOperationSearch.cpp
storage/ndb/test/ndbapi/testBlobs.cpp
storage/ndb/test/run-test/daily-basic-tests.txt
storage/ndb/test/run-test/daily-devel-tests.txt
[1 Dec 2009 14:22]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/92291
3040 Martin Skold 2009-12-01 [merge]
Merge
modified:
storage/ndb/src/kernel/blocks/ndbfs/AsyncFile.hpp
storage/ndb/src/kernel/blocks/ndbfs/MemoryChannel.hpp
storage/ndb/src/kernel/blocks/pgman.cpp
storage/ndb/src/kernel/blocks/pgman.hpp
storage/ndb/src/ndbapi/NdbOperationDefine.cpp
storage/ndb/src/ndbapi/NdbOperationSearch.cpp
storage/ndb/test/ndbapi/testBlobs.cpp
storage/ndb/test/run-test/daily-basic-tests.txt
[6 Dec 2009 10:08]
Jon Stephens
Documented bugfix in the NDB-6.2.19, 6.3.29, and 7.0.10 changelogs, as follows:

When performing tasks that generated large amounts of I/O (such as using ndb_restore), an internal memory buffer could overflow, causing data nodes to fail with signal 6. Subsequent analysis showed that this buffer was not actually required, so this fix removes it.

Closed.