Bug #20121 Missing error message for out of diskspace
Submitted: 29 May 2006 10:42
Modified: 26 Feb 2007 1:38
Reporter: Serge Kozlov
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 5.1
OS: Linux (FC4)
Assigned to: Stewart Smith
CPU Architecture: Any

[29 May 2006 10:42] Serge Kozlov
Description:
The drive has very little free space (50MB in my case) and an application tries to insert data. When the disk fills up, both data nodes fail with the following error:

Current byte-offset of file-pointer is: 568

Time: Monday 29 May 2006 - 12:32:22
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: dblqh/DblqhMain.cpp
Error object: DBLQH (Line: 11172) 0x0000000e
Program: /home/ndbdev/skozlov/builds/libexec/ndbd
Pid: 11081
Trace: /space/run/ndb_2_trace.log.1
Version: Version 5.1.11 (beta)
***EOM***

How to repeat:
1. Make sure the disk that holds FileSystemPath has only 50M of free space.
2. Start a cluster.
3. Run ./load_tpcb.pl ndb16 3306 root BLANK ndb
4. Wait until the script stops with an error:
Loading accounts table -- Please wait
DBD::mysql::st execute failed: Got temporary error 4010 'Node failure caused abort of transaction' from NDBCLUSTER at ./load_tpcb.pl line 400.
insert into account Error: Got temporary error 4010 'Node failure caused abort of transaction' from NDBCLUSTER at ./load_tpcb.pl line 400.
5. Look at error_log.
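
To confirm the precondition in step 1 before starting the cluster, a small helper along these lines may be useful. It is a minimal sketch that is not part of the original report: it assumes a C++17 compiler, and `free_bytes` and `is_nearly_full` are hypothetical names chosen for illustration.

```cpp
#include <cstdint>
#include <filesystem>
#include <string>

// Returns the number of bytes available to an unprivileged process on the
// filesystem that holds `path`. Throws std::filesystem::filesystem_error
// if the path cannot be queried.
std::uintmax_t free_bytes(const std::string& path) {
  return std::filesystem::space(path).available;
}

// True when the filesystem holding `path` has at most `limit_mb` MiB free,
// which is the situation step 1 asks for (limit_mb = 50 in the report).
bool is_nearly_full(const std::string& path, std::uintmax_t limit_mb) {
  return free_bytes(path) <= limit_mb * 1024 * 1024;
}
```

Calling `is_nearly_full` with the directory configured as FileSystemPath and a limit of 50 would indicate whether the test is likely to hit the out-of-space condition.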
[29 May 2006 10:47] Serge Kozlov
trace, log files and config.ini

Attachment: bug20121.tar.gz (application/gzip, text), 82.99 KiB.

[2 Nov 2006 6:06] Stewart Smith
haven't really tested, but this should make the error nicer:

===== storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp 1.125 vs edited =====
--- 1.125/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp     2006-09-26 22:08:18 +10:00
+++ edited/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp    2006-11-02 17:04:13 +11:00

@@ -11342,7 +11342,13 @@

 void Dblqh::execBACKUP_FRAGMENT_REF(Signal* signal)
 {
-  ndbrequire(false);
+  BackupFragmentRef * ref= (BackupFragmentRef*)signal->getDataPtr();
+  char buf[255];
+  BaseString::snprintf(buf, sizeof(buf),
+                       "Unable to store fragment during LCP.");
+  progError(__LINE__,
+            ref->errorCode,
+            buf);
 }

 void Dblqh::execBACKUP_FRAGMENT_CONF(Signal* signal)
[7 Feb 2007 3:35] Stewart Smith
The NDBFS error codes returned in FSREF (pasted below) are mostly NDBD_EXIT codes.

In BACKUP, filePtr.p->errorCode is set from the FSAPPENDREF error, which in turn is passed back to the caller by setting the errorCode in BACKUP_FRAGMENT_REF from the file error code.

So this should give us the correct NDBD exit code for the appropriate error (i.e. I think this patch is correct).

  enum NdbfsErrorCodeType {
    fsErrNone=0,
    fsErrEnvironmentError=NDBD_EXIT_AFS_ENVIRONMENT,
    fsErrTemporaryNotAccessible=NDBD_EXIT_AFS_TEMP_NO_ACCESS,
    fsErrNoSpaceLeftOnDevice=NDBD_EXIT_AFS_DISK_FULL,
    fsErrPermissionDenied=NDBD_EXIT_AFS_PERMISSION_DENIED,
    fsErrInvalidParameters=NDBD_EXIT_AFS_INVALID_PARAM,
    fsErrUnknown=NDBD_EXIT_AFS_UNKNOWN,
    fsErrNoMoreResources=NDBD_EXIT_AFS_NO_MORE_RESOURCES,
    fsErrFileDoesNotExist=NDBD_EXIT_AFS_NO_SUCH_FILE,
    fsErrReadUnderflow = NDBD_EXIT_AFS_READ_UNDERFLOW,
    fsErrFileExists = FS_ERR_BIT |  12,
    fsErrInvalidFileSize = FS_ERR_BIT |  13,
    fsErrOutOfMemory = FS_ERR_BIT |  14,
    fsErrMax
  };
  /**
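
The propagation described in this comment can be modelled roughly as follows. This is only an illustrative sketch: `FileRecord`, `FragmentRef`, `ShutdownReport`, the three functions, and the value of `kExitDiskFull` are hypothetical stand-ins, not the real NDB kernel types, signals, or exit codes.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the BACKUP file record and the
// BACKUP_FRAGMENT_REF signal payload.
struct FileRecord  { std::uint32_t errorCode = 0; };
struct FragmentRef { std::uint32_t errorCode = 0; };

// Placeholder value, NOT the real NDBD_EXIT_AFS_DISK_FULL.
constexpr std::uint32_t kExitDiskFull = 1001;

// Step 1: the file layer reports a failed append; BACKUP records the code.
void on_append_failed(FileRecord& file, std::uint32_t fsError) {
  file.errorCode = fsError;
}

// Step 2: BACKUP copies the file error code into the REF sent to LQH.
FragmentRef make_fragment_ref(const FileRecord& file) {
  return FragmentRef{file.errorCode};
}

// Step 3: LQH reports the received code with a descriptive message
// instead of failing on a bare ndbrequire(false).
struct ShutdownReport {
  std::uint32_t exitCode;
  std::string message;
};

ShutdownReport on_fragment_ref(const FragmentRef& ref) {
  return {ref.errorCode, "Unable to store fragment during LCP."};
}
```

In this model the code set at the file layer reaches the final report unchanged, which is the property the comment relies on when it concludes that the exit code should be correct.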
[7 Feb 2007 10:56] Stewart Smith
new patch

Attachment: bug20121.patch (text/x-patch), 1.38 KiB.

[14 Feb 2007 5:49] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/19835

ChangeSet@1.2435, 2007-02-14 16:49:40+11:00, stewart@willster.(none) +1 -0
  BUG#20121 missing err msg for ENOSPC
  
  getting BACKUP_FRAGMENT_REF in LQH from BACKUP would bail on
  ndbrequire(false) instead of having good error message.
  
  Can re-use error code from BACKUP as it's a FsRef error code,
  which is NDBD_EXIT... except when it isn't.
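
The "except when it isn't" case refers to codes built from FS_ERR_BIT rather than from an NDBD_EXIT value. One possible way to tell the two apart is sketched below. The thread does not say that the committed patch does this, and the values of `kFsErrBit` and `kExitUnknownFsError` are placeholders chosen so that exit codes never have the flag bit set.

```cpp
#include <cstdint>

// Placeholder values for illustration only; the real FS_ERR_BIT and
// NDBD_EXIT_* constants are defined in the NDB sources.
constexpr std::uint32_t kFsErrBit = 0x8000;
constexpr std::uint32_t kExitUnknownFsError = 2000;

// True when the code was built as (FS_ERR_BIT | n) and is therefore
// not an exit code in this model.
bool is_raw_fs_error(std::uint32_t code) {
  return (code & kFsErrBit) != 0;
}

// Passes exit codes through unchanged and maps raw file-system codes
// to a generic exit code so the shutdown report stays meaningful.
std::uint32_t to_exit_code(std::uint32_t code) {
  return is_raw_fs_error(code) ? kExitUnknownFsError : code;
}
```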
[14 Feb 2007 5:51] Stewart Smith
looked at by Tomas and Jonas and they've given it the okay.
[14 Feb 2007 10:51] Stewart Smith
pushed to 5.1-ndb

previous versions don't do LCP in this way, so the error may be handled correctly or be different.
[26 Feb 2007 1:38] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented bugfix in 5.1.16 changelog.