Bug #61179 | Unoptimized versions of *korr/*store macros in my_global.h are used on x86_64 | ||
---|---|---|---|
Submitted: | 15 May 2011 15:45 | Modified: | 5 Jun 2013 12:27 |
Reporter: | Alexey Kopytov | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: General | Severity: | S3 (Non-critical) |
Version: | 5.1, 5.5 | OS: | Any |
Assigned to: | Tor Didriksen | CPU Architecture: | Any |
[15 May 2011 15:45]
Alexey Kopytov
[17 May 2011 18:09]
MySQL Verification Team
Kaamos, my friend, I was not able to repeat the problem. I have new iMac, which has 64-bit CPU. I was checking your bug by getting pre-processor output only , for mi_key.c source file, with the following command: gcc -DHAVE_CONFIG_H -I. -I../../include -I../../include -I../../include -I../../regex -I../../sql -I. -g -DSAFE_MUTEX -DSAFEMALLOC -Wimplicit -Wreturn-type -Wswitch -Wtrigraphs -Wcomment -W -Wchar-subscripts -Wformat -Wparentheses -Wsign-compare -Wwrite-strings -Wunused-function -Wunused-label -Wunused-value -Wunused-variable -Wunused-parameter -mtune=nocona -DUNIV_MUST_NOT_INLINE -DEXTRA_DEBUG -DFORCE_INIT_OF_VARS -DSAFEMALLOC -DPEDANTIC_SAFEMALLOC -DSAFE_MUTEX -D_P1003_1B_VISIBLE -DSIGNAL_WITH_VIO_CLOSE -DSIGNALS_DONT_BREAK_READ -DIGNORE_SIGHUP_SIGQUIT -DDONT_DECLARE_CXA_PURE_VIRTUAL -MT mi_key.o -MD -MP -MF .deps/mi_key.Tpo -E mi_key.c If you look at the bottom of the mi_key.c source file, you will see one switch command. The end of that command , after pre-processing, looks like this: case HA_KEYTYPE_LONGLONG: s_value= (*((longlong *) (key))); break; case HA_KEYTYPE_ULONGLONG: value= (*((ulonglong *) (key))); break; default: (__builtin_expect(!(0), 0) ? __assert_rtn(__func__, "mi_key.c", 557, "0") : (void)0); value=0; break; which shows that it works nicely with those macros. Tell me, please, if I am wrong and where am I wrong .....
[17 May 2011 19:02]
Alexey Kopytov
Dear Sinisa, This is what I get from the same command using 4.1.2 on a 64-bit Linux machine: case HA_KEYTYPE_LONGLONG: s_value= (longlong) ((ulonglong)(((uint32) ((uchar) (key)[0])) + (((uint32) ((uchar) (key)[1])) << 8) + (((uint32) ((uchar) (key)[2])) << 16) + (((uint32) ((uchar) (key)[3])) << 24)) + (((ulonglong) (((uint32) ((uchar) (key)[4])) + (((uint32) ((uchar) (key)[5])) << 8) + (((uint32) ((uchar) (key)[6])) << 16) + (((uint32) ((uchar) (key)[7])) << 24))) << 32)); break; case HA_KEYTYPE_ULONGLONG: value= ((ulonglong)(((uint32) ((uchar) (key)[0])) + (((uint32) ((uchar) (key)[1])) << 8) + (((uint32) ((uchar) (key)[2])) << 16) + (((uint32) ((uchar) (key)[3])) << 24)) + (((ulonglong) (((uint32) ((uchar) (key)[4])) + (((uint32) ((uchar) (key)[5])) << 8) + (((uint32) ((uchar) (key)[6])) << 16) + (((uint32) ((uchar) (key)[7])) << 24))) << 32)); break; default: ((0) ? (void) (0) : (__assert_fail ("0", "mi_key.c", 557, __PRETTY_FUNCTION__), (void) (0))); value=0; break; } It is the same with gcc 4.2.1 on Mac OS X. I also disassembled that function in mysql-5.5.10-linux2.6-x86_64.tar.gz downloaded from dev.mysql.com. It's hard to read optimized code, but there are clearly lots of shifts in that code: ... 0x00000000008e0cdc <retrieve_auto_increment+380>: xchg %ax,%ax 0x00000000008e0ce0 <retrieve_auto_increment+384>: movzbl 0x5(%rsi),%ecx 0x00000000008e0ce4 <retrieve_auto_increment+388>: movzbl 0x6(%rsi),%eax 0x00000000008e0ce8 <retrieve_auto_increment+392>: movzbl 0x1(%rsi),%edx 0x00000000008e0cec <retrieve_auto_increment+396>: shl $0x10,%eax 0x00000000008e0cef <retrieve_auto_increment+399>: shl $0x8,%ecx 0x00000000008e0cf2 <retrieve_auto_increment+402>: add %eax,%ecx 0x00000000008e0cf4 <retrieve_auto_increment+404>: movzbl 0x4(%rsi),%eax 0x00000000008e0cf8 <retrieve_auto_increment+408>: shl $0x8,%edx 0x00000000008e0cfb <retrieve_auto_increment+411>: add %eax,%ecx 0x00000000008e0cfd <retrieve_auto_increment+413>: movzbl 0x7(%rsi),%eax 0x00000000008e0d01 <retrieve_auto_increment+417>: shl $0x18,%eax 0x00000000008e0d04 <retrieve_auto_increment+420>: add %eax,%ecx 0x00000000008e0d06 <retrieve_auto_increment+422>: movzbl 0x2(%rsi),%eax 0x00000000008e0d0a <retrieve_auto_increment+426>: shl $0x20,%rcx 0x00000000008e0d0e <retrieve_auto_increment+430>: shl $0x10,%eax 0x00000000008e0d11 <retrieve_auto_increment+433>: add %eax,%edx 0x00000000008e0d13 <retrieve_auto_increment+435>: movzbl (%rsi),%eax 0x00000000008e0d16 <retrieve_auto_increment+438>: add %eax,%edx 0x00000000008e0d18 <retrieve_auto_increment+440>: movzbl 0x3(%rsi),%eax 0x00000000008e0d1c <retrieve_auto_increment+444>: shl $0x18,%eax 0x00000000008e0d1f <retrieve_auto_increment+447>: add %eax,%edx 0x00000000008e0d21 <retrieve_auto_increment+449>: lea (%rcx,%rdx,1),%rax 0x00000000008e0d25 <retrieve_auto_increment+453>: jmpq 0x8e0bd0 <retrieve_auto_increment+112> ... Are you sure the compiler on your machine generates 64-bit code with those options? What is the output of "gcc -dM -E - </dev/null | grep 86" in your case?
[18 May 2011 5:03]
Jonas Oreland
I got the "unoptimized" versions...does it matter that I tried debug build... probably not... gcc (Gentoo 4.4.3-r2 p1.2) 4.4.3 Linux eel 2.6.34.5-jonas #5 SMP Mon Nov 22 14:03:35 CET 2010 x86_64 Intel(R) Core(TM) i5 CPU M 520 @ 2.40GHz GenuineIntel GNU/Linux [fk-patches/]:jonas@eel:~/src/telco-7.0/storage/myisam> file mi_key.o mi_key.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
[18 May 2011 17:13]
MySQL Verification Team
Kaamos, Jonas, If I insist on the tuning to be done for Nocona, which is strictly 64-bit CPU, I get fully optimized code, with gcc 4.0.1. I will try 4.2.1 once I buy USB disk, do backup, install Snow Leopard etc .... ;o) However, with -m64 I get non-optimized code, which is kind of strange. So, fixing is necessary, but could you tell me why do you think that this is a bug in our code and not in GCC ??? Why couldn't they (GCC) support i386 on 64-bit as well ????
[5 Jun 2013 12:27]
Tor Didriksen
Fixed in 5.6 by patch tor.didriksen@oracle.com-20120111093352-wca2tmilh4sq9sd5
[5 Jun 2013 16:45]
Paul DuBois
Noted in 5.6.13 changelog. Unoptimized versions of the xxxkorr() macros in my_global.h were used on 64-bit x86 processors.