| Bug #49263 | ndbd crashes with wrong error message when Undo Files path is invalid | ||
|---|---|---|---|
| Submitted: | 1 Dec 2009 14:23 | Modified: | 9 Dec 2009 15:04 |
| Reporter: | Geert Vanderkelen | | |
| Status: | Closed | | |
| Category: | MySQL Cluster: Disk Data | Severity: | S3 (Non-critical) |
| Version: | mysql-5.1-telco-6.3 | OS: | Any |
| Assigned to: | Jonas Oreland | CPU Architecture: | Any |
| Tags: | crash, disk data, ndbd, undo | ||
[1 Dec 2009 14:49]
Geert Vanderkelen
Verified using 6.3bzr and 7.0bzr (pull from 20091201).
[8 Dec 2009 15:31]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/93223
3188 Jonas Oreland 2009-12-08
ndb - bug#49263 - reasonable error message when failing to recreate DD object during node/system restart
[8 Dec 2009 15:46]
Jonas Oreland
Pushed to 6.3.29 and 7.0.10
[9 Dec 2009 15:02]
Jon Stephens
Documented bugfix in the NDB-6.3.29 and 7.0.10 changelogs as follows:
When the FileSystemPathUndoFiles configuration parameter was set
to a nonexistent path, the data nodes shut down with the
generic error 2341 (Internal program error). Now in such cases,
the error reported is error 2815 (File not found).
Closed.

Description:
When FileSystemPathUndoFiles is set to a nonexistent path, the data nodes exit with a bogus error message (and errno). The actual error can be read from the trace files.

Time: Tuesday 1 December 2009 - 15:12:17
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: dbdict/Dbdict.cpp
Error object: DBDICT (Line: 3527) 0x0000000a
Program: /data1/mysql/ndb-6.3bzr/libexec/ndbd
Pid: 9657
Trace: /data2/users/geert/cluster/master/ndb_3_trace.log.1
Version: mysql-5.1.39 ndb-6.3.29-GA

The trace contains this:

--------------- Signal ----------------
r.bn: 250 "DBDICT", r.proc: 3, r.sigId: 217147 gsn: 717 "CREATE_FILE_REF" prio: 1
s.bn: 260 "LGMAN", s.proc: 3, s.sigId: 217145 length: 5 trace: 0 #sec: 0 fragInf: 0
 H'00000003 H'01040003 H'000005e5 H'00000aff H'00000002
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 217146 gsn: 164 "CONTINUEB" prio: 1
s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 217144 length: 1 trace: 0 #sec: 0 fragInf: 0
 Scanning the memory channel again with no delay
--------------- Signal ----------------
r.bn: 260 "LGMAN", r.proc: 3, r.sigId: 217145 gsn: 260 "FSOPENREF" prio: 1
s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 217144 length: 4 trace: 0 #sec: 0 fragInf: 0
 UserPointer: 565248
 ErrorCode: 2815, File not found
 OS ErrorCode: 2
--------------- Signal ----------------

How to repeat:
Basic cluster configuration with two data nodes (one should be enough):

[NDBD DEFAULT]
Datadir=/var/lib/cluster
NoOfReplicas=2
DataMemory=260M
IndexMemory=30M
FileSystemPathUndoFiles=/var/lib/cluster/UNDO

Create the log file group and tablespace:

CREATE LOGFILE GROUP lg_1
  ADD UNDOFILE 'undo_1.log'
  INITIAL_SIZE 16M
  UNDO_BUFFER_SIZE 2M
  ENGINE NDBCLUSTER;

CREATE TABLESPACE ts_1
  ADD DATAFILE 'data_1.dat'
  USE LOGFILE GROUP lg_1
  INITIAL_SIZE 32M
  ENGINE NDBCLUSTER;

Shut down the cluster, then alter config.ini, changing this:

FileSystemPathUndoFiles=/var/lib/cluster/UNDO_FOO

Start ndb_mgmd, start ndbd, and see it exit with an "Internal program error".

Suggested fix:
Exiting with an error is fine, but it would be nicer to show at least the following error:

ErrorCode: 2815, File not found

Or even a message saying that the undo path is incorrect?
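Until a data node reports the bad path itself, a check before startup can catch this kind of misconfiguration. The following is only a rough sketch, not part of MySQL Cluster: the function name find_missing_undo_paths is made up for illustration, it assumes config.ini can be read by Python's configparser, and it assumes the script runs on the data node host, where the path must exist.

```python
import configparser
import os

UNDO_KEY = "FileSystemPathUndoFiles"


def find_missing_undo_paths(config_text):
    """Return (section, path) pairs whose undo-file directory does not exist.

    Hypothetical helper for illustration; it only inspects the local
    filesystem, so run it on the host where ndbd will start.
    """
    # strict=False tolerates repeated sections (e.g. several [NDBD] blocks);
    # interpolation=None keeps any '%' characters in paths literal.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(config_text)

    missing = []
    for section in parser.sections():
        # configparser matches option names case-insensitively by default.
        if parser.has_option(section, UNDO_KEY):
            path = parser.get(section, UNDO_KEY)
            if not os.path.isdir(path):
                missing.append((section, path))
    return missing


if __name__ == "__main__":
    sample = (
        "[NDBD DEFAULT]\n"
        "Datadir=/var/lib/cluster\n"
        "NoOfReplicas=2\n"
        "FileSystemPathUndoFiles=/var/lib/cluster/UNDO_FOO\n"
    )
    for section, path in find_missing_undo_paths(sample):
        print(f"[{section}] {UNDO_KEY} points to a missing directory: {path}")
```

Run against the altered config.ini from the steps above, this would flag /var/lib/cluster/UNDO_FOO before ndbd is started, unless that directory happens to exist on the machine running the check.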