Bug #52261 64 bit atomic operations do not work on Solaris i386 gcc in debug compilation
Submitted: 21 Mar 2010 19:16 Modified: 12 Aug 2010 19:43
Reporter: Vladislav Vaintroub Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:next-mr-bugfixing OS:Solaris (intel, 32 bit, gcc)
Assigned to: Davi Arnaut
Triage: Triaged: D1 (Critical)

[21 Mar 2010 19:16] Vladislav Vaintroub
Description:
When compiling on OpenSolaris (x86) mode, debug compiled server crashes in
my_atomic_add64().

Moreover, unit test my_atomic-t test crashes in my_atomic_add64 too

How to repeat:
branch mysql-next-mr-bugfixing

export CC=gcc
export CXX=g++

mkdir build
cd build
cmake .. -DWITH_DEBUG=1
make

make test will say:
  9/ 11 Testing my_atomic                     ***Failed 

trying to run mysql-test-run crashes in bootstrapper
mysql-test-run: *** ERROR: Error executing mysqld --bootstrap
Could not install system database from /export/home/wlad/bzr/bf/xxx/mysql-test/..

pstack ./var/install.db/core shows crash in my_atomic_cas64

 fed22f62 sigacthandler (b, 0, feb1ece0) + f2
 --- called from signal handler with signal 11 (SIGSEGV) ---
 081f6527 _Z15my_atomic_cas64PVxPxx (88a9330, feb1ef10, 2, 0) + 35
 081f64d4 _Z15my_atomic_add64PVxx (88a9330, 1, 0, 81f61bf) + 2e
 08204da7 _Z13next_query_idv (90f15f4) + 17
 081f6347 _Z21handle_bootstrap_implP3THD (90f15e8, feb20a40, feb1efa8, 81f66a5)

Suggested fix:
Either fix self-written x86 assembly to work in all cases and all compilation modes, or remove x86-gcc.h entirely.

Recent compilers and OSes on x86 where MySQL atomics are implemented all have atomic functionality provided by either compiler or OS, which is stable and has been tested by lot of folks . In the case given, even if default OpenSolaris gcc 3.4.3 does not provide atomics, Solaris atomics can be used.
[9 Jun 2010 5:12] Davi Arnaut
I've stumbled upon this while working on some compiler warnings (unsigned/signedness mismatch).

The asm constructs look suspicious: missing memory clobbers, the first operand for cmpxchg is wrong (read-write operand must be on register).

The transparent union attribute is misplaced, which explains the warning. A attribute specifier may either go after the struct/union or after the closing brace.

Also, as can be verified by looking at the implementation of atomic by other project, the atomic code doesn't need to be this convoluted.. specially given so few uses of it.
[10 Jun 2010 6:56] Kristian Nielsen
my_atomic fix bugs with extra cmp references

Attachment: my_atomic_cas_fix_extra_cmp_reference (application/octet-stream, text), 2.30 KiB.

[10 Jun 2010 6:56] Kristian Nielsen
my_atomic patch for adding "memory" asm clobbers as compiler write barriers

Attachment: my_atomic_use_memory_clobber (application/octet-stream, text), 2.88 KiB.

[10 Jun 2010 6:57] Kristian Nielsen
Attached two related patches contributed under SCA, as discussed with Davi and on the internals@ mailing list.
[8 Jul 2010 16:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/113160

3095 Davi Arnaut	2010-07-08
      Bug#22320: my_atomic-t unit test fails
      Bug#52261: 64 bit atomic operations do not work on Solaris i386
                 gcc in debug compilation
      
      One of the various problems was that the source operand to
      CMPXCHG8b was marked as a input/output operand, causing GCC
      to use the EBX register as the destination register for the
      CMPXCHG8b instruction. This could lead to crashes as the EBX
      register is also implicitly used by the instruction, causing
      the value to be potentially garbaged and a protection fault
      once the value is used to access a position in memory.
      
      Another problem was the lack of proper clobbers for the atomic
      operations and, also, a discrepancy between the implementations
      for the Compare and Set operation. The specific problems are
      described and fixed by Kristian Nielsen patches:
      
      Patch: 1
      
      Fix bugs in my_atomic_cas*(val,cmp,new) that *cmp is accessed
      after CAS succeds.
      
      In the gcc builtin implementation, problem was that *cmp was
      read again after atomic CAS to check if old *val == *cmp;
      this fails if CAS is successful and another thread modifies
      *cmp in-between.
      
      In the x86-gcc implementation, problem was that *cmp was set
      also in the case of successful CAS; this means there is a
      window where it can clobber a value written by another thread
      after successful CAS.
      
      Patch 2:
      
      Add a GCC asm "memory" clobber to primitives that imply a
      memory barrier.
      
      This signifies to GCC that any potentially aliased memory
      must be flushed before the operation, and re-read after the
      operation, so that read or modification in other threads of
      such memory values will work as intended.
      
      In effect, it makes these primitives work as memory barriers
      for the compiler as well as the CPU. This is better and more
      correct than adding "volatile" to variables.
     @ include/atomic/gcc_builtins.h
        Do not read from *cmp after the operation as it might be
        already gone if the operation was successful.
     @ include/atomic/nolock.h
        Prefer system provided atomics over the broken x86 asm.
     @ include/atomic/x86-gcc.h
        Do not mark source operands as input/output operands.
        Add proper memory clobbers.
     @ include/my_atomic.h
        Add notes about my_atomic_add and my_atomic_cas behaviors.
     @ unittest/mysys/my_atomic-t.c
        Remove work around, if it fails, there is either a problem
        with the atomic operations code or the specific compiler
        version should be black-listed.
[19 Jul 2010 7:44] Konstantin Osipov
Same  patch as for bug 22320
[23 Jul 2010 21:31] Davi Arnaut
Queued to mysql-trunk-bugfixing
[27 Jul 2010 13:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/114448

3155 Davi Arnaut	2010-07-27
      Bug#52261: 64 bit atomic operations do not work on Solaris i386 ..
      
      Workaround a interface problem with the atomic macros that was
      causing warnings. The correct type is retrieved using typeof if
      compiling with GCC.
[4 Aug 2010 7:50] Bugs System
Pushed into mysql-trunk 5.5.6-m3 (revid:alik@sun.com-20100731131027-1n61gseejyxsqk5d) (version source revid:marko.makela@oracle.com-20100621094008-o9fa153s3f09merw) (merge vers: 5.1.49) (pib:18)
[4 Aug 2010 8:11] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@ibmvm-20100804080001-bny5271e65xo34ig) (version source revid:marko.makela@oracle.com-20100621094008-o9fa153s3f09merw) (merge vers: 5.1.49) (pib:18)
[4 Aug 2010 8:26] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@ibmvm-20100804081533-c1d3rbipo9e8rt1s) (version source revid:marko.makela@oracle.com-20100621094008-o9fa153s3f09merw) (merge vers: 5.1.49) (pib:18)
[4 Aug 2010 9:05] Bugs System
Pushed into mysql-next-mr (revid:alik@ibmvm-20100804081630-ntapn8bf9pko9vj3) (version source revid:marko.makela@oracle.com-20100621094008-o9fa153s3f09merw) (pib:20)
[12 Aug 2010 19:43] Paul Dubois
Noted in 5.5.6 changelog.

Problems in the atomic operations implementation could lead to server crashes.