Bug #50691 AIX implementation of readdir_r causes InnoDB errors
Submitted: 28 Jan 2010 14:47
Modified: 19 Jun 2010 17:42
Reporter: Andrew Hutchings
Status: Closed
Category: MySQL Server: InnoDB storage engine
Severity: S2 (Serious)
Version:
OS: Any
Assigned to: Jimmy Yang
CPU Architecture: Any

[28 Jan 2010 14:47] Andrew Hutchings
Description:
os_file_readdir_next_file uses:

ret = readdir_r(dir, (struct dirent*)dirent_buf, &ent);

The code then assumes that at the end of the directory ret = 0 and ent = NULL.  Unfortunately, in the AIX implementation, ret = 9 and ent = NULL at the end of a directory, so each end-of-directory is logged as a read failure.  This causes a lot of errors in the error log.

See the following for the implementation:

http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.basetechref/doc/basetr...

How to repeat:
.
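No repro steps were recorded here. As a rough sketch only, assuming a POSIX system where readdir_r() is still declared, a small helper like the one below can show which return value accompanies the NULL result at the end of a directory. The name scan_dir_final_ret, the NAME_MAX fallback, and the buffer sizing are assumptions made for illustration and are not part of InnoDB.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* assumption: exposes readdir_r() under strict -std modes */
#endif

#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#ifndef NAME_MAX
#define NAME_MAX 255 /* assumption: fallback if the platform does not define it */
#endif

/* readdir_r() is deprecated in newer glibc; silence the warning for this demo. */
#if defined(__GNUC__)
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#endif

/* Hypothetical helper: walk `path` with readdir_r() and return the number
of entries seen, or -1 if the directory cannot be opened or memory runs out.
On return, *final_ret holds the return value of the last readdir_r() call,
i.e. the call that set the result pointer to NULL or reported a failure.
*final_ret is left untouched when -1 is returned. */
int scan_dir_final_ret(const char *path, int *final_ret)
{
    DIR *dir;
    struct dirent *buf;
    struct dirent *ent = NULL;
    size_t size;
    int count = 0;
    int ret;

    assert(path != NULL && final_ret != NULL);

    dir = opendir(path);
    if (dir == NULL) {
        return -1;
    }

    /* Make the entry buffer large enough for the longest file name. */
    size = offsetof(struct dirent, d_name) + NAME_MAX + 1;
    if (size < sizeof(struct dirent)) {
        size = sizeof(struct dirent);
    }

    buf = malloc(size);
    if (buf == NULL) {
        closedir(dir);
        return -1;
    }

    for (;;) {
        ret = readdir_r(dir, buf, &ent);
        if (ret != 0 || ent == NULL) {
            break; /* failure, or end of directory */
        }
        count++;
    }

    *final_ret = ret;
    free(buf);
    closedir(dir);
    return count;
}
```

Going by the description above, *final_ret would be expected to be 0 on Linux and Solaris and 9 on AIX after scanning any readable directory.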
[29 Jan 2010 3:33] Jimmy Yang
Sent for review.

The return code of readdir_r() behaves differently on IBM AIX, which causes trouble with its usage in os_file_readdir_next_file() in os/os0file.c:
-------------------------------------------------
ret = readdir_r(dir, (struct dirent*)dirent_buf, &ent);

if (ret != 0) {
        /* report error ... */
}

if (ent == NULL) {
        /* End of directory */

        return(1);
}
-------------------------------------------------

On Linux and Solaris, the return value tells whether the call succeeded, and "ent" tells whether the end of the directory has been reached. Reaching the "end of directory" is considered a successful call.

However, on IBM AIX, the "end of directory" is reported with "ent" set to NULL AND "ret" set to 9 (error).

Here is what the IBM manual says about the return code of readdir_r:
-------------------------------------------------
int readdir_r (DirectoryPointer, Entry, Result)
0 Indicates that the subroutine was successful.
9 Indicates that the subroutine was not successful or that the end of the directory was reached.

When it reaches the end of the directory, the readdir_r subroutine returns 9 and sets the Result parameter to NULL. When it detects an invalid seekdir operation, the readdir_r subroutine returns a 9.
-------------------------------------------------

In short, the main difference is that IBM categorizes the "end of directory" along with "unsuccessful call" of the subroutine, while other platforms categorize "end of directory" as a "successful call". In our implementation, we should not report an "unsuccessful call" when "end of directory" is reached (for AIX).

However, the way AIX defines the return code leaves some scenarios unspecified, such as a return value of 0 with a NULL result parameter (which is how other platforms signal the end of the directory); presumably IBM considers that this scenario will not happen. The example in the manual shows that it relies on the result parameter to detect whether the end of the directory has been reached.

As such, for the fix, it is still reasonable to treat a NULL value in the result parameter ("ent" in our case) as "end of directory" on all platforms, regardless of the return value. On AIX, we would additionally make a NULL result an exception to the error reporting, so that no error is reported when the return value is non-zero but the result is NULL.
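
To make the intended decision logic concrete, here is a minimal sketch of it as a pure function. It does not call readdir_r() itself; the names dir_status and classify_readdir_r, and the run-time aix_semantics flag (standing in for a compile-time platform check), are hypothetical and only serve to illustrate the rule described above.

```c
#include <stddef.h>

/* Hypothetical outcome of one readdir_r() call. */
typedef enum {
    DIR_ENTRY, /* an entry was read */
    DIR_END,   /* end of directory reached */
    DIR_ERROR  /* the call failed and should be reported */
} dir_status;

/* Classify the (return value, result pointer) pair of a readdir_r() call.
aix_semantics is non-zero when the platform reports end of directory with
a non-zero return value and a NULL result, as the AIX manual describes. */
dir_status classify_readdir_r(int ret, const void *ent, int aix_semantics)
{
    /* A non-zero return is an error, except on AIX when the result is
    NULL, which is how AIX signals the end of the directory. */
    if (ret != 0 && !(aix_semantics && ent == NULL)) {
        return DIR_ERROR;
    }

    /* A NULL result means end of directory on every platform. */
    if (ent == NULL) {
        return DIR_END;
    }

    return DIR_ENTRY;
}
```

One consequence of this rule: since the AIX manual uses 9 both for "not successful" and for "end of directory", a genuine failure on AIX that also leaves the result NULL would be classified as end of directory and not reported.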
[29 Jan 2010 6:52] Jimmy Yang
The patch to be tested is pretty straightforward:

===================================================================
--- os/os0file.c        (revision 6526)
+++ os/os0file.c        (working copy)
@@ -759,7 +759,13 @@
 #ifdef HAVE_READDIR_R
        ret = readdir_r(dir, (struct dirent*)dirent_buf, &ent);

-       if (ret != 0) {
+       if (ret != 0
+#ifdef UNIV_AIX
+           /* On AIX, end of directory comes with return value
+           of 9, and we should not treated as an error */
+           && ent != NULL
+#endif
+          ) {
                fprintf(stderr,
                        "InnoDB: cannot read directory %s, error %lu\n",
                        dirname, (ulong)ret);
===================================================================
[2 Feb 2010 16:24] Andrew Hutchings
Tested and confirmed that patch fixes this for AIX.
[11 Feb 2010 10:21] Jimmy Yang
------------------------------------------------------------------------
r6669 | jyang | 2010-02-11 02:24:19 -0800 (Thu, 11 Feb 2010) | 7 lines

branches/5.1: Fix bug #50691, AIX implementation of readdir_r
causes InnoDB errors. readdir_r() returns an non-NULL value
in the case of reaching the end of a directory. It should
not be treated as an error return.
[26 Feb 2010 9:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/101563

3364 Sergey Vojtovich	2010-02-26
      Applying InnoDB snapshot, fixes BUG#50691
      
      Detailed revision comments:
      
      r6669 | jyang | 2010-02-11 12:24:19 +0200 (Thu, 11 Feb 2010) | 7 lines
      branches/5.1: Fix bug #50691, AIX implementation of readdir_r
      causes InnoDB errors. readdir_r() returns an non-NULL value
      in the case of reaching the end of a directory. It should
      not be treated as an error return.
      
      rb://238 approved by Marko
[1 Mar 2010 8:44] Bugs System
Pushed into 5.1.45 (revid:joro@sun.com-20100301083827-xnimmrjg6bh33o1o) (version source revid:svoj@sun.com-20100226123413-rrzvxtm09jy39a9w) (merge vers: 5.1.45) (pib:16)
[2 Mar 2010 14:36] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100302142746-u1gxdf5yk2bjrq3e) (version source revid:alik@sun.com-20100301095421-4cz64ibem1h2quve) (merge vers: 6.0.14-alpha) (pib:16)
[2 Mar 2010 14:41] Bugs System
Pushed into 5.5.3-m2 (revid:alik@sun.com-20100302072233-t3uqgjzdukt1pyhe) (version source revid:alexey.kopytov@sun.com-20100226131009-mch7mua4vfxs2bno) (merge vers: 5.5.3-m2) (pib:16)
[2 Mar 2010 14:46] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100302072432-k8xvfkgcggkwgi94) (version source revid:alik@sun.com-20100301094128-lohp5kgno1o5t6t6) (pib:16)
[1 Apr 2010 11:58] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/104861

3507 Sergey Vojtovich	2010-04-01
      Applying InnoDB snapshot, fixes BUG#50691.
      
      Detailed revision comments:
      
      r6724 | marko | 2010-02-17 15:52:05 +0200 (Wed, 17 Feb 2010) | 11 lines
      branches/zip: Merge revisions 6613:6669 from branches/5.1:
        ------------------------------------------------------------------------
        r6669 | jyang | 2010-02-11 12:24:19 +0200 (Thu, 11 Feb 2010) | 7 lines
      
        branches/5.1: Fix bug #50691, AIX implementation of readdir_r
        causes InnoDB errors. readdir_r() returns an non-NULL value
        in the case of reaching the end of a directory. It should
        not be treated as an error return.
      
        rb://238 approved by Marko
        ------------------------------------------------------------------------
[6 Apr 2010 7:59] Bugs System
Pushed into 5.1.46 (revid:sergey.glukhov@sun.com-20100405111026-7kz1p8qlzglqgfmu) (version source revid:svoj@sun.com-20100401151005-c6re90vdvutln15d) (merge vers: 5.1.46) (pib:16)
[5 May 2010 15:13] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[8 May 2010 17:33] Paul DuBois
Noted in 5.1.46, 5.5.4 changelogs.

The AIX implementation of readdir_r() caused InnoDB errors.
[28 May 2010 6:11] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:39] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 7:07] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 15:37] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[15 Jun 2010 8:14] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (merge vers: 5.1.47) (pib:16)
[15 Jun 2010 8:30] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (pib:16)
[17 Jun 2010 12:16] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:04] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:44] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)