| Bug #63451 | atomic/x86-gcc.h:make_atomic_cas_body64 potential miscompilation bug | ||
|---|---|---|---|
| Submitted: | 28 Nov 2011 3:37 | Modified: | 3 Feb 2014 10:36 |
| Reporter: | Laurynas Biveinis (OCA) | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Locking | Severity: | S3 (Non-critical) |
| Version: | 5.5 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[2 Jan 2012 20:10]
Davi Arnaut
Simply marking the argument as a input/output operand is wrong too because of the semantics of CMPXCHG8b. The major problem is that the atomic API is too complex for a reasonable implementation in assembly.
[4 Jan 2012 10:00]
Laurynas Biveinis
Davi - More out of curiosity than anything else, could you elaborate why marking CMPXCHG8B arg as input/output is wrong? In any case I think we are spending way too much time with this. -march=i686 seems to be the best solution.
[4 Jan 2012 15:35]
Davi Arnaut
There is an explanation in the commit that changed it.
[30 Jul 2013 17:28]
MySQL Verification Team
Both arguments are correct. This bug is verified and will be processed.
[5 Nov 2013 9:55]
Marc ALFF
For context, the previous change set comment that modified the code is: Bug#22320: my_atomic-t unit test fails Bug#52261: 64 bit atomic operations do not work on Solaris i386 gcc in debug compilation One of the various problems was that the source operand to CMPXCHG8b was marked as a input/output operand, causing GCC to use the EBX register as the destination register for the CMPXCHG8b instruction. This could lead to crashes as the EBX register is also implicitly used by the instruction, causing the value to be potentially garbaged and a protection fault once the value is used to access a position in memory. Another problem was the lack of proper clobbers for the atomic operations and, also, a discrepancy between the implementations for the Compare and Set operation. The specific problems are described and fixed by Kristian Nielsen patches: Patch: 1 Fix bugs in my_atomic_cas*(val,cmp,new) that *cmp is accessed after CAS succeds. In the gcc builtin implementation, problem was that *cmp was read again after atomic CAS to check if old *val == *cmp; this fails if CAS is successful and another thread modifies *cmp in-between. In the x86-gcc implementation, problem was that *cmp was set also in the case of successful CAS; this means there is a window where it can clobber a value written by another thread after successful CAS. Patch 2: Add a GCC asm "memory" clobber to primitives that imply a memory barrier. This signifies to GCC that any potentially aliased memory must be flushed before the operation, and re-read after the operation, so that read or modification in other threads of such memory values will work as intended. In effect, it makes these primitives work as memory barriers for the compiler as well as the CPU. This is better and more correct than adding "volatile" to variables.
[22 Nov 2013 14:59]
Paul DuBois
Noted in 5.6.16, 5.7.4 changelogs. The make_atomic_cas_body64 implementation on IA32 with gcc but without gcc builtins could be miscompiled due to an incorrect constraint. The patch also causes MySQL to use builtin atomics when compiled using Clang.
[3 Feb 2014 10:36]
Laurynas Biveinis
5.6$ bzr log -r 5641
------------------------------------------------------------
revno: 5641
committer: Nisha Gopalakrishnan <nisha.gopalakrishnan@oracle.com>
branch nick: mysql-5.6-17242996
timestamp: Thu 2013-11-21 10:36:45 +0530
message:
BUG#17242996: ATOMIC/X86-GCC.H:MAKE_ATOMIC_CAS_BODY64 POTENTIAL
MISCOMPILATION BUG
Analysis:
--------
The make_atomic_cas_body64 implementation on IA32 with GCC but
without GCC builtins might get miscompiled due to a wrong
constraint.
make_atomic_cas_body64 implementation uses CMPXCHG8B with a
single memory operand as input/output operand. However the
constraint '+' was not used to indicate the same due to
issue reported by Bug#11746008. Hence it could result in
using two different memory locations, thus accessing an
uninitialized memory with probable runtime crash.
Fix:
---
The assembly atomic implementation (x86-gcc.h) is not used on
Windows and Solaris. The oldest version of GCC supported by
Linux variants on any platform is GCC 4.1.2 which supports
builtin atomics. On MAC, GCC versions higher than 4.1.2 or
Clang is used which supports builtin atomics as well.
Hence this patch removes assembly atomic implementation
(x86-gcc.h) and relies on the GCC builtins. Upon using
an older unsupported versions of GCC(A warning is flagged),
we may see some performance regressions since we fallback
to rw locks.

Description: The make_atomic_cas_body64 implementation on IA32 with GCC but without GCC builtins might get miscompiled due to a wrong constraint. The code reads #define make_atomic_cas_body64 \ asm volatile ("push %%ebx;" \ "movl (%%ecx), %%ebx;" \ "movl 4(%%ecx), %%ecx;" \ LOCK_prefix "; cmpxchg8b %0;" \ "setz %2; pop %%ebx" \ : "=m" (*a), "+A" (*cmp), "=c" (ret) \ : "c" (&set), "m" (*a) \ : "memory", "esp") #endif Note that *a is specified twice in the constraint list: as the 1st output arg ("=m" (*a)) and as the 5th input arg ("m" (*a)). The code obviously expects these two to be a single memory location (i.e. single arg of CMPXCHG8B that is both input and output), but such constraints do not guarantee this. GCC is free to use two separate memory locations - one for the input value of *a and one for the output one. Which then results to CMPXCHG8B argument pointing to uninitialized memory and runtime crash. This is pointed out by the GCC manual: http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Extended-Asm.html#Extended-Asm <quote> The ordinary output operands must be write-only; GCC will assume that the values in these operands before the instruction are dead and need not be generated. </quote> I.e. there is no need for GCC to store the input value of *a to the 1st arg. And <quote> Only a number in the constraint can guarantee that one operand will be in the same place as another. The mere fact that foo is the value of both operands is not enough to guarantee that they will be in the same place in the generated assembler code. The following would not work reliably: asm ("combine %2,%0" : "=r" (foo) : "r" (foo), "g" (bar)); Various optimizations or reloading could cause operands 0 and 1 to be in different registers; GCC knows no reason not to do so. For example, the compiler might find a copy of the value of foo in one register and use it for operand 1, but generate the output operand 0 in a different register (copying it afterward to foo's own address). Of course, since the register for operand 1 is not even mentioned in the assembler code, the result will not work, but GCC can't tell that. </quote> How to repeat: The current MySQL code does not trigger a miscompilation. We have observed it in Percona Server 5.1 where we have backported the MySQL 5.5 atomic access library. Still the fix is very simple I suggest to fix this proactively without waiting for miscompilation reports. Suggested fix: Instead of two separate input and output args use a single input/output arg: "+m" (*a) Or require at least -march=i686 for compiling MySQL and drop the code altogether.