Bug #33484 | "Create table ... engine=falcon" fails with error 156 but Falcon is present | ||
---|---|---|---|
Submitted: | 22 Dec 2007 18:49 | Modified: | 4 Oct 2008 15:19 |
Reporter: | Joerg Bruehe | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S1 (Critical) |
Version: | 6.0.4-alpha | OS: | Linux (RHAS 3 / x86) |
Assigned to: | Vladislav Vaintroub | CPU Architecture: | Any |
[22 Dec 2007 18:49]
Joerg Bruehe
[22 Dec 2007 18:54]
Joerg Bruehe
I assume getting Falcon to work on Linux/x86 is P1
[22 Dec 2007 20:58]
Joerg Bruehe
Daniel tried the package on his home machine and I tried it on our central build hub: the tests pass. But when retried on the original build hosts, the tests reproducibly fail. I am reducing "impact" and revoking "showstopper", as the failure seems to depend on some (as yet unknown) host machine properties.
[22 Dec 2007 22:13]
Jeb Mershon
The same problem occurs on debx86 (a bteam machine)
[22 Dec 2007 23:06]
Daniel Fischer
Falcon tries to open falcon_master.fts with O_DIRECT first, and if that fails, with O_SYNC instead. On at least two of our Linux 2.4 machines, both fail. A simple test program that just opens a file with O_RDWR|O_SYNC also fails. Suggested fix: Third attempt at opening files, without any O_DIRECT/O_SYNC black magic.
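The suggested fix amounts to a chain of open() attempts with progressively weaker flags. A minimal C sketch of that idea follows. The function name `open_with_fallback` and its parameters are hypothetical and are not taken from the Falcon source; the sketch only illustrates the order of attempts described in the comment.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* glibc only exposes O_DIRECT with this defined */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical helper, not Falcon code: open an existing file, trying
   O_DIRECT first, then O_SYNC, then no extra flag at all.
   base_flags must not contain O_CREAT, because no mode argument is passed.
   On success, returns the descriptor and stores the extra flag that worked
   in *used_extra (when used_extra is not NULL). Returns -1 if every
   attempt fails. */
int open_with_fallback(const char *path, int base_flags, int *used_extra)
{
    int extra[3];
    int count = 0;
    int i;

    assert(path != NULL);
#ifdef O_DIRECT
    extra[count++] = O_DIRECT;   /* first choice: bypass the page cache */
#endif
    extra[count++] = O_SYNC;     /* second choice: synchronous writes */
    extra[count++] = 0;          /* last resort: plain buffered I/O */

    for (i = 0; i < count; i++) {
        int fd = open(path, base_flags | extra[i]);
        if (fd >= 0) {
            if (used_extra != NULL)
                *used_extra = extra[i];
            return fd;
        }
    }
    return -1;
}
```

A caller would pass something like `O_RDWR` as `base_flags` and inspect `*used_extra` to learn which mode was actually obtained.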
[24 Dec 2007 0:17]
Jeffrey Pugh
Can anyone clarify whether this bug is related to compiling/running Falcon on old or particular OSes, or on specific machines, or do we not know yet?
[26 Dec 2007 16:38]
Jeffrey Pugh
Lowered priority based on the older platforms this affects.
[28 Dec 2007 9:08]
Philip Stoev
I was unable to reproduce the original issue on rhas3-x86; however, I was able to reproduce it on the debx86 machine. Unfortunately, the bug is still present in the binary from production.mysql.com:/data0/mysqldev/my/mysql-6.0.4-alpha-p1-build/dist/packages/mysql-6.0.4-alpha-p1-linux-i686-glibc23.tar.gz
Strace output is:
[pid 23670] open("/users/pstoev/mysql-6.0.4-alpha-p1-linux-i686-glibc23/mysql-test/var/master-data/falcon_master.fts", O_RDWR|O_DIRECT|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[pid 23670] open("/users/pstoev/mysql-6.0.4-alpha-p1-linux-i686-glibc23/mysql-test/var/master-data/falcon_master.fts", O_RDWR|O_SYNC|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[pid 23670] open("/users/pstoev/mysql-6.0.4-alpha-p1-linux-i686-glibc23/mysql-test/var/master-data/falcon_master.fts", O_RDWR|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[pid 23670] open("falcon_master.fts", O_RDWR|O_CREAT|O_TRUNC|O_DIRECT|O_LARGEFILE, 0660) = 37
[28 Dec 2007 9:13]
Philip Stoev
Contrast with strace from the unpatched binary:
[pid 23739] open("/users/pstoev/mysql-6.0.4-alpha-linux-i686-glibc23/mysql-test/var/master-data/falcon_master.fts", O_RDWR|O_DIRECT|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[pid 23739] open("/users/pstoev/mysql-6.0.4-alpha-linux-i686-glibc23/mysql-test/var/master-data/falcon_master.fts", O_RDWR|O_SYNC|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[pid 23739] open("falcon_master.fts", O_RDWR|O_CREAT|O_TRUNC|O_DIRECT|O_LARGEFILE, 0660) = 37
Note that the last open() call succeeds in both cases.
[28 Dec 2007 9:54]
Daniel Fischer
Sorry, what I wrote above is not accurate. It looked to me as if opening the file itself failed. However, closer investigation shows that it is a write to the file that fails, causing Falcon to abort (which led me to believe that it couldn't open the file at all). Quoting strace:
[pid 1977] pwrite(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512, 0) = -1 EINVAL (Invalid argument)
This is an attempt to write 512 bytes to a file that was previously opened with O_DIRECT. However, the logical block size of the file system it resides on is 4096 bytes. Quoting man 2 open: "Under Linux 2.4 transfer sizes, and the alignment of user buffer and file offset must all be multiples of the logical block size of the file system. Under Linux 2.6 alignment to 512-byte boundaries suffices."
NDB was affected by a similar issue in bug#29612.
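The rule quoted from the man page can be written down as a small check. The following C sketch is only an illustration: the function name `direct_io_transfer_ok` is hypothetical, and it encodes nothing more than the "all three must be multiples of the block size" condition, not what any particular kernel actually enforces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if a transfer satisfies the Linux 2.4
   O_DIRECT rule quoted above, i.e. the buffer address, the transfer length,
   and the file offset are all multiples of block_size. Returns 0 otherwise. */
int direct_io_transfer_ok(const void *buf, size_t len, off_t offset,
                          size_t block_size)
{
    /* The check assumes a non-zero, power-of-two block size. */
    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

    if (offset < 0)
        return 0;
    if ((uintptr_t)buf % block_size != 0)
        return 0;
    if (len % block_size != 0)
        return 0;
    if ((uint64_t)offset % block_size != 0)
        return 0;
    return 1;
}
```

Applied to the failing call, a 512-byte transfer at offset 0 is rejected for a block size of 4096 but accepted for a block size of 512, which matches the difference between Linux 2.4 and 2.6 described in the man page.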
[28 Dec 2007 15:13]
Ann Harrison
A 512-byte write to a .fts file is very odd. Tablespace files are written in pages, for which the default size is 4096 bytes. It's possible that the alignment is wrong, but a partial page write is very strange.
[28 Dec 2007 17:38]
Kevin Lewis
Jim Starkey indicates that he is not aware of any way to find out the file system block size on Linux. So Falcon has a hardcoded value of 512 in SerialLogFile.cpp, line 146:
sectorSize = MAX(512, database->serialLogBlockSize);
The default falcon_serial_log_block_size is zero, which effectively makes the sector size 512 on all Linux systems. In tablespaces, the offset is always a multiple of the page size, which can be reduced from the default of 4096 to 2048 or 1024. We do not want Falcon to increase the file block alignment for all file systems. So I think we need to document that on Linux 2.4, the engine should be started with --falcon-serial-log-block-size=4096 and the page size should not be less than 4096.
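To make the effect of these settings concrete, here is a minimal C sketch. `serial_log_sector_size` mirrors the quoted MAX expression; `settings_ok_for_block_size` and its parameter names are hypothetical and only encode the recommendation above (sector size and page size must both be multiples of the file system block size).

```c
#include <assert.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Mirrors the quoted expression: a configured value of 0 (the default)
   yields an effective sector size of 512. */
unsigned serial_log_sector_size(unsigned configured_block_size)
{
    return MAX(512u, configured_block_size);
}

/* Hypothetical check: returns 1 if both the effective serial log sector
   size and the tablespace page size are multiples of fs_block_size. */
int settings_ok_for_block_size(unsigned fs_block_size,
                               unsigned serial_log_block_size,
                               unsigned page_size)
{
    assert(fs_block_size != 0);
    return serial_log_sector_size(serial_log_block_size) % fs_block_size == 0
        && page_size % fs_block_size == 0;
}
```

With a 4096-byte file system block, the default serial log setting (0, hence 512) does not pass this check, while a serial log block size of 4096 combined with a 4096-byte page does.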
[30 Dec 2007 19:32]
Philip Stoev
Can we use the pwrite() call on a dummy file with a 512-byte argument to determine the allowed block size?
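A probe along these lines might look like the following C sketch. It is an assumption-laden illustration rather than anything from the Falcon source: the function name `probe_direct_io_size`, the 64 KB upper bound, and the doubling strategy are all hypothetical. The buffer is aligned to the largest probe size and the offset is always 0, so only the transfer size is being tested.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* glibc only exposes O_DIRECT with this defined */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PROBE_MAX_SIZE (64 * 1024)   /* hypothetical upper bound for the probe */

/* Hypothetical probe: creates a scratch file at `path` with O_DIRECT and
   tries pwrite() with sizes 512, 1024, ... up to PROBE_MAX_SIZE.
   Returns the smallest size that was written completely, or -1 if O_DIRECT
   is unavailable, the file cannot be created, or no size succeeds.
   The scratch file is removed before returning. */
long probe_direct_io_size(const char *path)
{
    assert(path != NULL);
#ifdef O_DIRECT
    {
        void *buf = NULL;
        long found = -1;
        size_t size;
        int fd = open(path, O_RDWR | O_CREAT | O_TRUNC | O_DIRECT, 0600);

        if (fd < 0)
            return -1;
        /* Align the buffer to the largest probe size so that buffer
           alignment never causes a failure. */
        if (posix_memalign(&buf, PROBE_MAX_SIZE, PROBE_MAX_SIZE) == 0) {
            memset(buf, 0, PROBE_MAX_SIZE);
            for (size = 512; size <= PROBE_MAX_SIZE; size *= 2) {
                if (pwrite(fd, buf, size, 0) == (ssize_t)size) {
                    found = (long)size;
                    break;
                }
            }
            free(buf);
        }
        close(fd);
        unlink(path);
        return found;
    }
#else
    return -1;               /* O_DIRECT not available on this platform */
#endif
}
```

The scratch file would need to live on the same file system as the tablespace and serial log files, since the requirement is file system specific.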
[6 Feb 2008 20:56]
Kevin Lewis
Workaround is to use a newer version of Linux.
[12 Feb 2008 19:56]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/42138
ChangeSet@1.2813, 2008-02-12 14:56:40-05:00, vlad@ubuntu704desktop.localdomain +2 -0
Bug#33484 - Cannot create Falcon table due to error in pwrite() on Linux 2.4 on NFS file system. The problem was file system specific alignment/buffersize requirement when using O_DIRECT on Linux 2.4 (on NFS it is 32KB), incompatible with Falcon buffer sizes. Solution: do not use O_DIRECT on Linux < 2.6, as O_DIRECT is broken here.
[12 Feb 2008 20:00]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/42139
ChangeSet@1.2813, 2008-02-12 15:00:10-05:00, vlad@ubuntu704desktop.localdomain +1 -0
Bug#33484 - Cannot create Falcon table due to error in pwrite() on Linux 2.4 on NFS file system. The problem was file system specific alignment/buffersize requirement when using O_DIRECT on Linux 2.4 (on NFS it is 32KB), incompatible with Falcon buffer sizes. Solution: do not use O_DIRECT on Linux < 2.6, since it is too complicated. If somebody wants performance, he would need 2.6 anyway.
[12 Feb 2008 20:41]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/42140
ChangeSet@1.2813, 2008-02-12 15:40:50-05:00, vlad@ubuntu704desktop.localdomain +1 -0
Bug#33484 - Cannot create Falcon table due to error in pwrite() on Linux 2.4 on NFS file system. The problem was file system specific alignment/buffersize requirement when using O_DIRECT on Linux 2.4 (on NFS it is 32KB), incompatible with Falcon buffer sizes. Solution: do not use O_DIRECT on Linux < 2.6, since it is too complicated. If somebody wants performance, he would need 2.6 anyway.
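The committed patch itself is not reproduced in this thread. As an illustration only, a runtime check for "Linux < 2.6" could be sketched in C as follows; the function names are hypothetical, and the actual patch may well use a different mechanism.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: parses a kernel release string such as
   "2.4.21-47.ELsmp" and returns 1 if it is version 2.6 or later,
   0 if it is older or cannot be parsed. */
int release_allows_direct_io(const char *release)
{
    int major = 0;
    int minor = 0;

    assert(release != NULL);
    if (sscanf(release, "%d.%d", &major, &minor) != 2)
        return 0;            /* unparseable: stay on the safe side */
    return major > 2 || (major == 2 && minor >= 6);
}

/* Hypothetical wrapper: asks the running kernel for its release string
   via uname() and applies the check above. */
int running_kernel_allows_direct_io(void)
{
    struct utsname info;

    if (uname(&info) != 0)
        return 0;
    return release_allows_direct_io(info.release);
}
```

A caller would add O_DIRECT to the open() flags only when `running_kernel_allows_direct_io()` returns 1, and fall back to buffered or O_SYNC I/O otherwise.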
[12 Feb 2008 21:14]
Kevin Lewis
Patch approved, push after 6.0.4 is released.
[12 Mar 2008 23:02]
Bugs System
Pushed into 6.0.4-alpha
[23 Apr 2008 7:42]
Hakan Küçükyılmaz
Fix is in 6.0.5-alpha
[4 Oct 2008 15:19]
Jon Stephens
Documented in the 6.0.5 changelog as follows: CREATE TABLE ... ENGINE=Falcon failed on kernel 2.4 based Linux systems when using O_DIRECT with an NFS file system.