Bug #16727 MySQL failing on 32bit call in a 64bit environment
Submitted: 23 Jan 2006 15:04
Modified: 28 Apr 2006 8:19
Reporter: Andrew Harrison
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 4.1.14, 5.0.18
OS: Linux (SUSE 9.1)
Assigned to:
CPU Architecture: Any

[23 Jan 2006 15:04] Andrew Harrison
Description:
On a couple of occasions, with two different versions of the MySQL clustered db, we have seen massive error logging into the system logs.

Since the first time this happened (4.1.14) we have upgraded the MySQL version to 'mysql-max-5.0.18-linux-powerpc-glibc23' and so cannot pin this to a particular version.

The errors seem to relate to a 32-bit ioctl command (cmd) that is not recognised by the operating system (SUSE 9.1, 64-bit).

The errors being reported in /var/log/messages are:
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6687): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64713761]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6778): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64714186]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6705): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64713887]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(mysqld:6151): Unknown cmd fd(12) cmd(80145830){00} arg(431fa0a0) on socket:[64620622]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6778): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64714186]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6687): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64713761]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6687): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64713761]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(mysqld:6150): Unknown cmd fd(12) cmd(80145830){00} arg(431f2080) on socket:[64620622]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6687): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64713761]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6687): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64713761]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(mysqld:6150): Unknown cmd fd(12) cmd(80145830){00} arg(431f2080) on socket:[64620622]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6728): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64714022]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6687): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64713761]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6687): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64713761]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(mysqld:6150): Unknown cmd fd(12) cmd(80145830){00} arg(431f2080) on socket:[64620622]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6687): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64713761]
Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6687): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64713761]
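
The cmd value in these messages (80145830) is a packed ioctl request number, and it can be split into its fields. The following is a minimal sketch, assuming the bit layouts from the Linux kernel ioctl headers (8 bits each for number and type, then either 14 size bits and 2 direction bits in the generic layout, or 13 size bits and 3 direction bits on PowerPC). The names `LAYOUTS` and `decode_ioctl` are hypothetical. The sketch does not identify which driver or call issued the request.

```python
# Bit-field layouts for Linux ioctl request numbers (assumed from kernel headers).
# Low 16 bits are always nr (8 bits) and type (8 bits); size and direction differ.
LAYOUTS = {
    "generic": {
        "size_bits": 14,
        "dir_bits": 2,
        "dirs": {0: "none", 1: "write", 2: "read", 3: "read|write"},
    },
    "powerpc": {
        "size_bits": 13,
        "dir_bits": 3,
        "dirs": {1: "none", 2: "read", 4: "write", 6: "read|write"},
    },
}


def decode_ioctl(cmd, layout="powerpc"):
    """Split a 32-bit ioctl request number into nr, type, size and direction."""
    spec = LAYOUTS[layout]
    nr = cmd & 0xFF
    type_char = chr((cmd >> 8) & 0xFF)
    size = (cmd >> 16) & ((1 << spec["size_bits"]) - 1)
    direction = (cmd >> (16 + spec["size_bits"])) & ((1 << spec["dir_bits"]) - 1)
    return {
        "nr": nr,
        "type": type_char,
        "size": size,
        "dir": spec["dirs"].get(direction, "unknown"),
    }


# The cmd value copied from the kernel messages in this report.
fields = decode_ioctl(0x80145830)
print(fields)
```

Decoding the value this way only narrows down the ioctl "magic" type and argument size; matching it to a specific driver still requires searching the kernel sources for that type and number.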
 

How to repeat:
So far we have not been able to reproduce this error on demand.
[23 Jan 2006 15:07] Andrew Harrison
Extract from .err file in /usr/local/mysql/data

Attachment: atpucsh1 mysql 20060120.txt (application/octet-stream, text), 858 bytes.

[23 Jan 2006 15:11] Andrew Harrison
Once this problem is encountered, the messages as above are written to the logs at a rate of hundreds per second and fill the remaining filesystem space, causing the box to become unresponsive.
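
To put a number on the message rate, the kernel lines can be tallied per second and per process. This is a rough sketch that assumes the syslog line format shown in the description; the regular expression and the `summarize` helper are hypothetical, and the sample is three lines copied from the report rather than a full log.

```python
import re
from collections import Counter

# Matches the kernel ioctl32 lines quoted in this report (format assumed from them).
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"ioctl32\((?P<proc>\w+):(?P<pid>\d+)\): Unknown cmd .*cmd\((?P<cmd>[0-9a-f]+)\)"
)


def summarize(lines):
    """Count matching messages per timestamp (one-second resolution) and per process."""
    per_second = Counter()
    per_process = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            per_second[match.group("ts")] += 1
            per_process[match.group("proc")] += 1
    return per_second, per_process


sample = [
    "Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6687): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64713761]",
    "Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6778): Unknown cmd fd(4) cmd(80145830){00} arg(ffffd530) on socket:[64714186]",
    "Jan 22 02:46:24 atpucsh2 kernel: ioctl32(mysqld:6151): Unknown cmd fd(12) cmd(80145830){00} arg(431fa0a0) on socket:[64620622]",
]

per_second, per_process = summarize(sample)
print(per_second.most_common(5))
print(per_process)
```

Run against the real /var/log/messages (for example by passing an open file object to `summarize`), the per-second counter shows when the flood starts and which processes trigger it.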

This system is a realtime live system and so this is a serious issue.  Any help with this is appreciated.
[23 Jan 2006 15:16] Andrew Harrison
We have two servers, ATPUCSH1 and ATPUCSH2.  The latest occurrences of this were on Friday 20th Jan (ATPUCSH1) and Sunday 22nd Jan (ATPUCSH2).  I have attached the appropriate extracts from the log files for these days.
[23 Jan 2006 15:17] Andrew Harrison
Extract from .err file in /usr/local/mysql/data

Attachment: atpucsh1 mysql 20060122.txt (application/octet-stream, text), 3.30 KiB.

[23 Jan 2006 15:19] Andrew Harrison
Extract from .err file in /usr/local/mysql/data

Attachment: atpucsh2 mysql 20060120.txt (application/octet-stream, text), 966 bytes.

[23 Jan 2006 15:19] Andrew Harrison
Extract from .err file in /usr/local/mysql/data

Attachment: atpucsh2 mysql 20060122.txt (application/octet-stream, text), 5.74 KiB.

[23 Jan 2006 15:20] Andrew Harrison
Extract from cluster log for node (id3) on atpucsh1

Attachment: node3.txt (application/octet-stream, text), 1.60 KiB.

[23 Jan 2006 15:45] Andrew Harrison
Extract from cluster log for node (id4) on atpucsh2

Attachment: node4.txt (application/octet-stream, text), 1.58 KiB.

[23 Jan 2006 15:46] Andrew Harrison
Extract from cluster log for node (id5) on atpucsh1

Attachment: node5.txt (application/octet-stream, text), 1.57 KiB.

[23 Jan 2006 15:46] Andrew Harrison
Extract from cluster log for node (id6) on atpucsh2

Attachment: node6.txt (application/octet-stream, text), 1.57 KiB.

[23 Jan 2006 15:46] Andrew Harrison
Extract from cluster log for node (id7) on atpucsh1

Attachment: node7.txt (application/octet-stream, text), 1.58 KiB.

[23 Jan 2006 15:47] Andrew Harrison
Extract from cluster log for node (id8) on atpucsh2

Attachment: node8.txt (application/octet-stream, text), 1.59 KiB.

[23 Jan 2006 15:47] Andrew Harrison
Extract from cluster log for node (id9) on atpucsh1

Attachment: node9.txt (application/octet-stream, text), 1.57 KiB.

[23 Jan 2006 15:47] Andrew Harrison
Extract from cluster log for node (id10) on atpucsh2

Attachment: node10.txt (application/octet-stream, text), 1.58 KiB.

[23 Jan 2006 17:31] Andrew Harrison
Having spoken to Novell, this may be a problem caused by the version of SUSE Linux under which this version of MySQL has been compiled.  Could you please confirm the version of Linux?
[23 Jan 2006 17:38] Andrew Harrison
Further research shows that the version we are currently using is available in the downloads section under:
Linux (POWER / PowerPC).

We are running SUSE 9.1 on a PowerPC box.
[24 Jan 2006 10:43] Andrew Harrison
Novell have rung back and asked which version of the Linux kernel and which flavour of Linux this version was compiled against.

Could you possibly provide this information?

Many thanks
[24 Jan 2006 13:02] Lenz Grimmer
Hi Andrew, thanks a lot for the very detailed report. To answer your question: our Linux/PPC binaries are compiled on a Debian 3.1 system, glibc 2.3.2 and kernel 2.6.8-powerpc, gcc 3.3.5.

See the file "mysqlbug" for more details on the exact compile options.
[24 Jan 2006 13:34] Andrew Harrison
Thanks for your reply...

As far as I have read elsewhere, this error is caused by a call to a 32-bit function that has not been implemented in ioctl32, the 32-bit emulation layer on a 64-bit system.  Do you know of any such call, or a workaround for this problem?
[25 Jan 2006 12:22] Jonas Oreland
I have never seen this before, but then I have never heard of anyone using a 32-bit binary on a 64-bit kernel.

Can't you use a 64-bit compiled version?
[25 Jan 2006 13:42] Andrew Harrison
There was some discussion about the supportability of using a precompiled binary on SUSE 9.1 (PowerPC64) versus obtaining the source and compiling it on our destination boxes.  It was decided at the time to use a precompiled binary, as this was the most supportable option.
[25 Jan 2006 14:00] Andrew Harrison
I'm not sure if I made it clear, but the boxes will run fine for a couple of months before these messages start appearing.  When they do start appearing we get 500,000 in a ten second period which ends up using all of the remaining disk space on the box, causing MySQL to crash in an indeterminate state.
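
As a back-of-the-envelope check on how quickly this fills a disk, the quoted rate can be combined with the length of one logged line. This is only a sketch: the sample line is copied from the description, the one-byte newline per message is an assumption, and the free-space figure (`ASSUMED_FREE_BYTES`) is a made-up placeholder, not a value from these servers.

```python
# One kernel message as quoted in the description; real lines vary slightly in length.
SAMPLE_LINE = (
    "Jan 22 02:46:24 atpucsh2 kernel: ioctl32(ndbd:6687): Unknown cmd fd(4) "
    "cmd(80145830){00} arg(ffffd530) on socket:[64713761]"
)

MESSAGES = 500_000      # messages observed, as stated in the report
WINDOW_SECONDS = 10     # over this many seconds, as stated in the report

bytes_per_line = len(SAMPLE_LINE) + 1  # assumption: one newline byte per message
bytes_per_second = MESSAGES * bytes_per_line / WINDOW_SECONDS


def seconds_to_fill(free_bytes):
    """Estimate how long the flood takes to consume the given free space."""
    return free_bytes / bytes_per_second


ASSUMED_FREE_BYTES = 10 * 1024**3  # hypothetical: 10 GiB free on the log filesystem

print(f"{bytes_per_second / 1e6:.1f} MB/s")
print(f"{seconds_to_fill(ASSUMED_FREE_BYTES) / 60:.1f} minutes to fill")
```

At this rate the log grows by several megabytes per second, so even generous free space is exhausted within minutes to hours; the estimate ignores any syslog rate limiting or log rotation.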
[23 Mar 2006 10:30] Valeriy Kravchuk
Have you upgraded anything at the OS level before this problem appeared? Can you try to install and check with a newer version, 5.0.19?
[23 Mar 2006 14:04] Andrew Harrison
We're upgrading the boxes to SLES9 SP3 this weekend following Novell's admission that it may be a problem with the kernel version in SLES9 SP1.  Hopefully this will fix the problem, but only time will tell.
If/when the problem happens again, we will consider upgrading the version of MySQL but we need to check if this is a problem with the kernel version first.
[27 Mar 2006 12:28] Valeriy Kravchuk
Please reopen this bug report if the same problem occurs after upgrading the kernel.
[27 Apr 2006 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[28 Apr 2006 8:19] Andrew Harrison
Having updated the kernel, we now believe that the problem is due to a hardware driver.  Once we have ruled out any hardware issues we will raise another bug if necessary.

Thanks for all of your time and help.