Bug #74831 | InnoDB table flags in bitfield is non-optimal | ||
---|---|---|---|
Submitted: | 13 Nov 2014 4:02 | Modified: | 15 Jul 2016 0:47 |
Reporter: | Stewart Smith | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S5 (Performance) |
Version: | 5.7.7 | OS: | Linux |
Assigned to: | CPU Architecture: | Any | |
Tags: | innodb, performance, POWER8, PowerPC |
[13 Nov 2014 4:02]
Stewart Smith
[13 Nov 2014 4:20]
Stewart Smith
I think I figured it out. Before: 408: e9 29 00 28 ld r9,40(r9) 40c: 79 23 3f e3 rldicl. r3,r9,39,63 we have a double word load (64bit) while afterwards we have a lwx instruction instead, which means we have a 32bit load. This is likely to be the source of the performance improvement - and it's likely going to be doing half the loads all over the place.
[11 Dec 2014 2:38]
Stewart Smith
Actually.. since TF_BITS is only 7, if you used an unsigned char here you may be able to get away with a lb instruction (load byte) being generated, which could be even better!
[11 Jan 2015 22:24]
Stewart Smith
This seems to also be dependant on compiler version. GCC 4.9 seems to make better decisions than GCC 4.8 (at least on POWER, I have not looked anywhere else).
[10 Apr 2015 7:59]
Stewart Smith
Code is still like this in 5.7.7 On modern Ubuntu (e.g. Ubuntu Vivid) and the GCC it ships with, this is less of an issue. however, the checking still does chew up a bit of time, although it's not as severe as it is on big endian without this patch on slightly older gcc.
[14 Jul 2016 17:53]
MySQL Verification Team
Vast majority of our users utilize Intel or Intel-like CPUs. With 5.7.13 and gcc > 4.8.4, I fail to see the problem: 560 if (dict_table_is_comp(index->table)) { 0x0000000001a6557d <+152>: mov -0x20(%rbp),%rax 0x0000000001a65581 <+156>: mov 0x20(%rax),%rax 0x0000000001a65585 <+160>: mov %rax,%rdi 0x0000000001a65588 <+163>: callq 0x1a6293a <dict_table_is_comp(dict_table_t const*)> Just FYI, this is compiled on Linux ..... If there is a problem at all, it is a problem of GCC producing a code as you described on those two CPUs .....
[15 Jul 2016 0:47]
Stewart Smith
560 if (dict_table_is_comp(index->table)) { 0x0000000001a6557d <+152>: mov -0x20(%rbp),%rax 0x0000000001a65581 <+156>: mov 0x20(%rax),%rax 0x0000000001a65585 <+160>: mov %rax,%rdi 0x0000000001a65588 <+163>: callq 0x1a6293a <dict_table_is_comp(dict_table_t const*)> This is just doing a function call, you'd need to look at the ASM for dict_table_is_comp itself. Anyway, because this bug has languished for so long, it's likely that everybody has moved to a more recent compiler... which isn't excusing C bitfields as a good idea.
[15 Jul 2016 14:46]
MySQL Verification Team
Hi Stewart, First of all, in 5.7.13 dict_table_is_comp() is extremely simple: /********************************************************************//** Check whether the table uses the compact page format. @return TRUE if table uses the compact page format */ UNIV_INLINE ibool dict_table_is_comp( /*===============*/ const dict_table_t* table) /*!< in: table */ { ut_ad(table); #if DICT_TF_COMPACT != 1 #error "DICT_TF_COMPACT must be 1" #endif return(table->flags & DICT_TF_COMPACT); } I took a look at 5.5 and 5.6 and it is also the same. Getting the code for the inlined function is not easy, but I got it ..... Here it is: 0x0000000001a6293a <+0>: 55 push %rbp 0x0000000001a6293b <+1>: 48 89 e5 mov %rsp,%rbp 0x0000000001a6293e <+4>: 48 83 ec 10 sub $0x10,%rsp 0x0000000001a62942 <+8>: 48 89 7d f8 mov %rdi,-0x8(%rbp) 0x0000000001a62946 <+12>: 48 83 7d f8 00 cmpq $0x0,-0x8(%rbp) 0x0000000001a6294b <+17>: 0f 94 c0 sete %al 0x0000000001a6294e <+20>: 0f b6 c0 movzbl %al,%eax 0x0000000001a62951 <+23>: 48 85 c0 test %rax,%rax 0x0000000001a62954 <+26>: 74 18 je 0x1a6296e <dict_table_is_comp(dict_table_t const*)+52> 0x0000000001a62956 <+28>: ba 7e 02 00 00 mov $0x27e,%edx 0x0000000001a6295b <+33>: 48 8d 35 d6 25 7c 00 lea 0x7c25d6(%rip),%rsi # 0x2224f38 0x0000000001a62962 <+40>: 48 8d 3d c9 26 7c 00 lea 0x7c26c9(%rip),%rdi # 0x2225032 0x0000000001a62969 <+47>: e8 7e 52 12 00 callq 0x1b87bec <ut_dbg_assertion_failed(char const*, char const*, unsigned long)> 0x0000000001a6296e <+52>: 48 8b 45 f8 mov -0x8(%rbp),%rax 0x0000000001a62972 <+56>: 0f b6 40 34 movzbl 0x34(%rax),%eax 0x0000000001a62976 <+60>: 0f b6 c0 movzbl %al,%eax 0x0000000001a62979 <+63>: 83 e0 01 and $0x1,%eax 0x0000000001a6297c <+66>: c9 leaveq 0x0000000001a6297d <+67>: c3 retq You can forget all the code until 0x1a62972, because it is there only in debug version. As you can see the produced code is very, very simple on Intel. And this is done by using GCC 4.8.3 without any optimizations in the compile arguments, not even -O. And still, it is very, very simple .... Hence, it is my conclusion that this is a problem with GCC compiler for POWER8 and PowerPC. This is a pity, since both are fantastic CPUs.