Description:
In MySQL Cluster 7.6, based on MySQL Server 5.7, we recently hit a compiler bug on Solaris 11 on SPARC while fixing Bug#24526123 ADAPTIVE SEND ALGORITHM IS BROKEN.
The bug has been reported to GCC as Bug 78807 - Loop optimization trigger bus error (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78807).
Faulty code was generated with -O3 -fPIC for two consecutive loops: the first ORs one array of 32-bit words into another, and the second clears one of the arrays (see example below).
The compiler generated code as if both arrays were 8-byte aligned, while they were only 4-byte aligned, so the 8-byte store instruction 'clrx' triggered a bus error.
The bad code generation was worked around by swapping two members of the struct containing the two arrays, so that both arrays are 8-byte aligned.
Even so, we should detect this compiler bug and turn off the optimization when it is found, so that bad code generation is avoided in other, as yet undetected, code and in future code.
void
trp_client::flush_send_buffers()
{
[...]
m_flushed_nodes_mask.bitOR(m_send_nodes_mask);
m_send_nodes_cnt = 0;
m_send_nodes_mask.clear();
}
$ pstack core | c++filt
------------ lwp# 4 / thread# 4 ---------------
00000001003d34cc trp_client::flush_send_buffers() (100801970, 0, 0, 0, 0, 0) + 140
00000001003cffc0 ClusterMgr::startup() (100801970, 0, 3, 72e8, 100801d8c, 0) + 50
00000001003d00b0 ClusterMgr::threadMain() (100801970, ff000000, ff000000, ffffffff7e98efc0, ffffffff7e4b80c0, ffffffff7e991904) + 10
00000001003d0534 runClusterMgr_C (0, 10073edcc, ffffffff7e98efc0, ffffffff7df01240, ffffffff7df01240, ff000000) + 8
00000001003f7c5c ndb_thread_wrapper (0, 0, 0, 1003f7ba0, 1006db3f0, 1006b02d8) + bc
ffffffff7e8494e0 _lwp_start (0, 0, 0, 0, 0, 0)
$ dis -nF _ZN10trp_client18flush_send_buffersEv ndb_mgmd
disassembly for ndb_mgmd
...
0x1003d34b8: de 06 22 90 ld [%i0 + 0x290], %o7
0x1003d34bc: f2 06 22 94 ld [%i0 + 0x294], %i1
0x1003d34c0: f4 06 22 4c ld [%i0 + 0x24c], %i2
0x1003d34c4: f6 06 22 50 ld [%i0 + 0x250], %i3
0x1003d34c8: c0 26 20 48 clr [%i0 + 0x48]
** 0x1003d34cc: c0 76 22 4c clrx [%i0 + 0x24c] !!!! 8-byte write not on an 8-byte boundary: %i0 is trp_client*, which should be 8-byte aligned, so %i0 + 0x24c is only 4-byte aligned
0x1003d34d0: b4 17 00 1a or %i4, %i2, %i2
0x1003d34d4: b6 17 40 1b or %i5, %i3, %i3
0x1003d34d8: f8 06 22 54 ld [%i0 + 0x254], %i4
0x1003d34dc: fa 06 22 58 ld [%i0 + 0x258], %i5
0x1003d34e0: f4 26 22 78 st %i2, [%i0 + 0x278]
0x1003d34e4: f6 26 22 7c st %i3, [%i0 + 0x27c]
...
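The alignment problem and the workaround described above can be illustrated with a minimal sketch. The struct names and the helper offset_is_8_byte_aligned() are hypothetical and are not taken from the MySQL Cluster source; fixed-width types are assumed so that the offsets are the same on common ABIs. It only shows how the position of a 4-byte member decides whether the following arrays land on an 8-byte boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout resembling the failing case: a 4-byte member placed
   before the arrays leaves both arrays on a 4-byte boundary only. */
struct masks_misaligned
{
  int64_t  a;     /* offset 0 */
  int32_t  b;     /* offset 8 */
  uint32_t x[6];  /* offset 12: 12 % 8 == 4 */
  uint32_t y[6];  /* offset 36: 36 % 8 == 4 */
};

/* Same members reordered (an assumed reordering in the spirit of the
   workaround): both arrays now start on an 8-byte boundary. */
struct masks_aligned
{
  int64_t  a;     /* offset 0 */
  uint32_t x[6];  /* offset 8 */
  uint32_t y[6];  /* offset 32 */
  int32_t  b;     /* offset 56 */
};

/* Returns nonzero if a member at byte offset 'off' inside an 8-byte aligned
   object can safely be written with an 8-byte store. */
static int offset_is_8_byte_aligned(size_t off)
{
  return (off % 8) == 0;
}
```

Calling offset_is_8_byte_aligned(offsetof(struct masks_misaligned, y)) yields 0, while the same call on struct masks_aligned yields nonzero. In the disassembly above, the offset 0x24c has the same property as the misaligned case: 0x24c % 8 == 4.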
How to repeat:
Compile and run the extracted test program below, which is the one used in the filed GCC bug.
/*
$ uname -a
SunOS vimur10 5.11 11.1 sun4v sparc sun4v
$ gcc --version
gcc (GCC) 5.3.0
$ gcc -m64 -O3 -fPIC -o bug bug.c
$ ./bug
Bus Error (core dumped)
Building with at least one of the below options avoids failure:
-fvect-cost-model=cheap
-fno-tree-loop-distribute-patterns
-fno-tree-loop-vectorize
*/
inline void g(unsigned size, unsigned x[], unsigned y[])
{
unsigned i;
for (i = 0; i < size; i++)
{
x[i] |= y[i];
}
for (i = 0; i < size; i++)
{
y[i] = 0;
}
}
struct A
{
long a; // Make struct A 8 byte aligned
int b; // Make x[] not 8 byte aligned
unsigned x[6];
unsigned y[6];
};
void f(struct A* a)
{
g(6, a->x, a->y);
}
int
main()
{
struct A a;
f(&a);
return 0;
}
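The test program above leaves the arrays uninitialized and signals the bug only by crashing. A self-checking variant might look like the sketch below. The names or_then_clear(), struct B and run_check() are hypothetical, and it has not been verified that this variant still triggers the original miscompilation. It initializes both arrays, runs the same two loops, and reports whether the results are as expected.

```c
#define N 6

/* Same two loops as g() in the test case. 'static' is an assumption made
   here so that the sketch also links when the call is not inlined. */
static void or_then_clear(unsigned size, unsigned x[], unsigned y[])
{
  unsigned i;
  for (i = 0; i < size; i++)
  {
    x[i] |= y[i];
  }
  for (i = 0; i < size; i++)
  {
    y[i] = 0;
  }
}

struct B
{
  long a;        /* Make struct B 8 byte aligned (on LP64) */
  int b;         /* Make x[] not 8 byte aligned */
  unsigned x[N];
  unsigned y[N];
};

/* Returns 0 if both loops produced the expected result, 1 otherwise. */
static int run_check(void)
{
  struct B s;
  unsigned i;

  s.a = 0;
  s.b = 0;
  for (i = 0; i < N; i++)
  {
    s.x[i] = 1u << i;      /* distinct low bits */
    s.y[i] = 0x100u << i;  /* distinct higher bits */
  }

  or_then_clear(N, s.x, s.y);

  for (i = 0; i < N; i++)
  {
    if (s.x[i] != ((1u << i) | (0x100u << i)))
      return 1;            /* OR loop gave a wrong value */
    if (s.y[i] != 0)
      return 1;            /* clear loop left data behind */
  }
  return 0;
}
```

A main() that simply returns run_check() would give a build-time check an exit code to inspect: both a nonzero exit code and a bus error would then indicate a problem.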
Suggested fix:
Add checks to the CMake files for the above test case, and turn off tree loop optimizations if the bug is found.
Also consider using another compiler for Solaris on SPARC, since this is the second bug found in gcc 5.3.0 in a short time; see
Bug #24947597 SHIFT-OR COMPILER BUG WITH GCC -FEXPENSIVE-OPTIMIZATIONS