Bug #76135 data corruption on arm64
Submitted: 3 Mar 2015 17:11 Modified: 12 Aug 2015 15:21
Reporter: Daniel Frazier Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S1 (Critical)
Version:5.6.23 OS:Linux (Ubuntu 15.04)
Assigned to: CPU Architecture:Any

[3 Mar 2015 17:11] Daniel Frazier
Description:
Linaro discovered a data corruption issue in MariaDB that also affects MySQL:
  https://mariadb.atlassian.net/browse/MDEV-6615

How to repeat:
Reproducibility varies across SoCs but, at least on one SoC, it reliably fails within seconds with these sysbench commands:

sysbench --num-threads=128 --max-requests=0 --max-time=1000 --test=oltp prepare
sysbench --num-threads=128 --max-requests=0 --max-time=1000 --test=oltp run

Suggested fix:
I've ported these patches over to 5.6 and verified they fix the issue. Patch is available here:
  https://bugs.launchpad.net/ubuntu/+source/mysql-5.6/+bug/1427406
(See vivid.patch)
[4 Mar 2015 0:42] Daniel Frazier
Note that the proposed fix from MariaDB apparently introduced a regression on x86:
  https://mariadb.atlassian.net/browse/MDEV-7026

But the proposed fix for that regression brings back the arm64 problem:
  https://mariadb.atlassian.net/browse/MDEV-7658
[31 Mar 2015 19:29] Daniel Frazier
Patch 1/4

Attachment: MDEV-6450-MariaDB-crash-on-Power8-when-built-with-advance-tool-chain.patch (text/x-patch), 2.31 KiB.

[31 Mar 2015 19:29] Daniel Frazier
Patch 2/4

Attachment: MDEV-6483-Deadlock_around_rw_lock_debug_mutex_on_PPC64.patch (text/x-patch), 3.88 KiB.

[31 Mar 2015 19:30] Daniel Frazier
Patch 3/4

Attachment: MDEV-7026-Race-in-InnoDB-XtraDB-mutex-implementation.patch (text/x-patch), 12.33 KiB.

[31 Mar 2015 19:30] Daniel Frazier
Patch 4/4

Attachment: MDEV-7658-fix-AArch64-MDEV-6615-regression.patch (text/x-patch), 2.14 KiB.

[31 Mar 2015 19:33] Daniel Frazier
I've attached my port of the fixes from MariaDB, and modified them to apply to ARM64 in addition to PowerPC.

For more details, see:
 https://bugs.launchpad.net/ubuntu/+source/mysql-5.6/+bug/1427406/comments/5
[17 Jul 2015 5:02] Stewart Smith
The __sync_synchronize() is wrong, you don't require a heavyweight sync.

The correct thing to use would be __atomic_test_and_set() and __atomic_clear.
[17 Jul 2015 6:08] Sunny Bains
After discussing with Stewart, the change looks like this on Linux. If the new primitives are available then we use them on Intel x86 too. Otherwise we fall back to the old code on platforms with strong memory ordering.

# if defined(HAVE_IB_GCC_ATOMIC_TEST_AND_SET)

/**********************************************************//**
Do an atomic test-and-set.
@param[in,out]  ptr             Memory location to set to non-zero
@return the previous value */
inline
lock_word_t
os_atomic_test_and_set(volatile lock_word_t* ptr)
{
       return(__atomic_test_and_set(ptr, __ATOMIC_ACQUIRE));
}

/**********************************************************//**
Do an atomic clear.
@param[in,out]  ptr             Memory location to set to zero */
inline
void
os_atomic_clear(volatile lock_word_t* ptr)
{
        __atomic_clear(ptr, __ATOMIC_RELEASE);
}

# elif defined(IB_STRONG_MEMORY_MODEL)

/**********************************************************//**
Do an atomic test and set.
@param[in,out]  ptr             Memory location to set to non-zero
@return the previous value */
inline
lock_word_t
os_atomic_test_and_set(volatile lock_word_t* ptr)
{
        return(__sync_lock_test_and_set(ptr, 1));
}

/**********************************************************//**
Do an atomic release.

In theory __sync_lock_release should be used to release the lock.
Unfortunately, it does not work properly alone. The workaround is
that more conservative __sync_lock_test_and_set is used instead.

Performance regression was observed at some conditions for Intel
architecture. Disable release barrier on Intel architecture for now.
@param[in,out]  ptr             Memory location to write to
@return the previous value */
inline
lock_word_t
os_atomic_clear(volatile lock_word_t* ptr)
{
        return(__sync_lock_test_and_set(ptr, 0));
}

# else

#  error "Unsupported platform"

# endif /* HAVE_IB_GCC_ATOMIC_TEST_AND_SET */
[10 Aug 2015 8:53] Shaohua Wang
Posted by developer:
 
The root cause is that x86 has a stronger memory model than the ARM processors. And the GCC builtins didn't issue the correct fences when setting/unsetting the lock word. In particular during the mutex release.

The solution is rewriting atomic TAS operations: replace '__sync_' by '__atomic_' if possible.
[12 Aug 2015 15:21] Daniel Price
Fixed as of the upcoming 5.5.46, 5.6.27, 5.7.9, 5.8.0 releases, and here's the changelog entry:

A data corruption occurred on ARM64. GCC builtins did not issue the
correct fences when setting or unsetting the lock word. 

Thank you for the bug report.
[3 Dec 2015 8:17] Laurynas Biveinis
The fix for this has caused bug 79185.
[3 Dec 2015 8:44] Sunny Bains
Laurynas,  Can you please post your HW/OS/Compiler versions here.
[3 Dec 2015 9:38] Laurynas Biveinis
Several combinations:
1) 5.5 / IBM X3650M4, 2x Intel Xeon E5-2620, 64G RAM, RAID 10 (4x 240G SSD) / Ubuntu Trusty / GCC 4.8.4
2) 5.6 / N/A / CentOS 6.6 / GCC 4.8.2
3) 5.6 / don't have details handy right now, will post later if I get them
[3 Dec 2015 10:20] Sunny Bains
Laurynas, Can you also please post the asm from the mutex enter and release code that is generated.
[3 Dec 2015 10:44] Sunny Bains
gcc -O3 -S -fverbose-asm main.c

cat main.s
	.file	"main.c"
# GNU C (Debian 4.8.1-8) version 4.8.1 (x86_64-linux-gnu)
#	compiled by GNU C version 4.8.1, GMP version 5.1.2, MPFR version 3.1.1-p2, MPC version 1.0.1
# warning: GMP header version 5.1.2 differs from library version 6.0.0.
# warning: MPFR header version 3.1.1-p2 differs from library version 3.1.2-p3.
# GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
# options passed:  -imultiarch x86_64-linux-gnu main.c -mtune=generic
# -march=x86-64 -O3 -fverbose-asm
# options enabled:  -faggressive-loop-optimizations
# -fasynchronous-unwind-tables -fauto-inc-dec -fbranch-count-reg
# -fcaller-saves -fcombine-stack-adjustments -fcommon -fcompare-elim
# -fcprop-registers -fcrossjumping -fcse-follow-jumps -fdefer-pop
# -fdelete-null-pointer-checks -fdevirtualize -fdwarf2-cfi-asm
# -fearly-inlining -feliminate-unused-debug-types -fexpensive-optimizations
# -fforward-propagate -ffunction-cse -fgcse -fgcse-after-reload -fgcse-lm
# -fgnu-runtime -fguess-branch-probability -fhoist-adjacent-loads -fident
# -fif-conversion -fif-conversion2 -findirect-inlining -finline
# -finline-atomics -finline-functions -finline-functions-called-once
# -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-profile
# -fipa-pure-const -fipa-reference -fipa-sra -fira-hoist-pressure
# -fira-share-save-slots -fira-share-spill-slots -fivopts
# -fkeep-static-consts -fleading-underscore -fmath-errno -fmerge-constants
# -fmerge-debug-strings -fmove-loop-invariants -fomit-frame-pointer
# -foptimize-register-move -foptimize-sibling-calls -foptimize-strlen
# -fpartial-inlining -fpeephole -fpeephole2 -fpredictive-commoning
# -fprefetch-loop-arrays -free -freg-struct-return -fregmove
# -freorder-blocks -freorder-functions -frerun-cse-after-loop
# -fsched-critical-path-heuristic -fsched-dep-count-heuristic
# -fsched-group-heuristic -fsched-interblock -fsched-last-insn-heuristic
# -fsched-rank-heuristic -fsched-spec -fsched-spec-insn-heuristic
# -fsched-stalled-insns-dep -fschedule-insns2 -fshow-column -fshrink-wrap
# -fsigned-zeros -fsplit-ivs-in-unroller -fsplit-wide-types
# -fstrict-aliasing -fstrict-overflow -fstrict-volatile-bitfields
# -fsync-libcalls -fthread-jumps -ftoplevel-reorder -ftrapping-math
# -ftree-bit-ccp -ftree-builtin-call-dce -ftree-ccp -ftree-ch
# -ftree-coalesce-vars -ftree-copy-prop -ftree-copyrename -ftree-cselim
# -ftree-dce -ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre
# -ftree-loop-distribute-patterns -ftree-loop-if-convert -ftree-loop-im
# -ftree-loop-ivcanon -ftree-loop-optimize -ftree-parallelize-loops=
# -ftree-partial-pre -ftree-phiprop -ftree-pre -ftree-pta -ftree-reassoc
# -ftree-scev-cprop -ftree-sink -ftree-slp-vectorize -ftree-slsr -ftree-sra
# -ftree-switch-conversion -ftree-tail-merge -ftree-ter
# -ftree-vect-loop-version -ftree-vectorize -ftree-vrp -funit-at-a-time
# -funswitch-loops -funwind-tables -fvect-cost-model -fverbose-asm
# -fzero-initialized-in-bss -m128bit-long-double -m64 -m80387
# -maccumulate-outgoing-args -malign-stringops -mfancy-math-387
# -mfp-ret-in-387 -mfxsr -mglibc -mieee-fp -mlong-double-80 -mmmx -mno-sse4
# -mpush-args -mred-zone -msse -msse2 -mtls-direct-seg-refs

	.text
	.p2align 4,,15
	.globl	old
	.type	old, @function
old:
.LFB11:
	.cfi_startproc
	xorl	%eax, %eax	# tmp61
	xchgl	(%rdi), %eax	#,* t, tmp60
	ret
	.cfi_endproc
.LFE11:
	.size	old, .-old
	.p2align 4,,15
	.globl	tas
	.type	tas, @function
tas:
.LFB12:
	.cfi_startproc
	movl	$1, %eax	#, tmp60
	movq	%rdi, -8(%rsp)	# t, t
	xchgb	-8(%rsp), %al	#,, tmp59
	ret
	.cfi_endproc
.LFE12:
	.size	tas, .-tas
	.p2align 4,,15
	.globl	clear
	.type	clear, @function
clear:
.LFB13:
	.cfi_startproc
	movb	$0, -8(%rsp)	#,,
	ret
	.cfi_endproc
.LFE13:
	.size	clear, .-clear
	.section	.text.startup,"ax",@progbits
	.p2align 4,,15
	.globl	main
	.type	main, @function
main:
.LFB14:
	.cfi_startproc
	xorl	%eax, %eax	# tmp61
	movl	$65535, -40(%rsp)	#, t
	xchgl	-40(%rsp), %eax	#,, tmp60
	leaq	-40(%rsp), %rax	#, tmp62
	movq	%rax, -24(%rsp)	# tmp62, t
	movl	$1, %eax	#, tmp64
	xchgb	-24(%rsp), %al	#,, tmp63
	movb	$0, -16(%rsp)	#,,
	ret
	.cfi_endproc
.LFE14:
	.size	main, .-main
	.ident	"GCC: (Debian 4.8.1-8) 4.8.1"
	.section	.note.GNU-stack,"",@progbits

cat main.c

#include <stdio.h>

void old(int* t)
{
	__sync_lock_test_and_set(t, 0);
}

void tas(int* t)
{
	__atomic_test_and_set(&t, __ATOMIC_ACQUIRE);
}

void clear(int* t)
{
	__atomic_clear(&t, __ATOMIC_RELEASE);
}

int main()
{
	int	t = 0xffff;

	old(&t);

	tas(&t);

	clear(&t);
}

The clear() doesn't seem to have any barrier, looks like it defaults to RELAXED on x86_64.
[3 Dec 2015 11:15] Laurynas Biveinis
Ubuntu Trusty, GCC 4.8.4.

$ /usr/bin/cc  -DHAVE_CONFIG_H -DHAVE_IB_ATOMIC_PTHREAD_T_GCC=1 -DHAVE_IB_GCC_ATOMIC_BUILTINS=1 -DHAVE_IB_GCC_ATOMIC_TEST_AND_SET=1 -DLINUX_NATIVE_AIO=1 -DSIZEOF_PTHREAD_T=8 -fPIC -Wall -Wextra -Wformat-security -Wvla -Wwrite-strings -Wdeclaration-after-statement -O3 -g -static-libgcc -fno-omit-frame-pointer -fno-strict-aliasing -DDBUG_OFF -DMY_PTHREAD_FASTMUTEX=1 -I/home/laurynas/percona/obj-5.5-release/include -I/home/laurynas/percona/mysql-server/storage/innobase/include -I/home/laurynas/percona/mysql-server/storage/innobase/handler -I/home/laurynas/percona/mysql-server/include -I/home/laurynas/percona/mysql-server/sql -I/home/laurynas/percona/mysql-server/regex -I/home/laurynas/percona/mysql-server/extra/yassl/include -I/home/laurynas/percona/mysql-server/extra/yassl/taocrypt/include -I/home/laurynas/percona/mysql-server/zlib    -DUNIV_LINUX -D_GNU_SOURCE=1 -o CMakeFiles/innobase.dir/sync/sync0sync.c.I -S -fverbose-asm /home/laurynas/percona/mysql-server/storage/innobase/sync/sync0sync.c -fno-inline

Current MySQL 5.5.46 os_atomic_clear is MOV:

	.p2align 4,,15
	.type	os_atomic_clear, @function
os_atomic_clear:
.LFB71:
	.file 3 "/home/laurynas/percona/mysql-server/storage/innobase/include/os0sync.h"
	.loc 3 338 0
	.cfi_startproc
.LVL12:
	pushq	%rbp	#
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	.loc 3 339 0
	movb	$0, (%rdi)	#,,* ptr
	.loc 3 338 0
	movq	%rsp, %rbp	#,
	.cfi_def_cfa_register 6
	.loc 3 340 0
	popq	%rbp	#
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE71:
	.size	os_atomic_clear, .-os_atomic_clear

With a quick and dirty fix to prefer IB_STRONG_MEMORY_MODEL it's XCHG:

	.p2align 4,,15
	.type	os_atomic_clear, @function
os_atomic_clear:
.LFB71:
	.file 3 "/home/laurynas/percona/mysql-server/storage/innobase/include/os0sync.h"
	.loc 3 368 0
	.cfi_startproc
.LVL12:
	pushq	%rbp	#
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	.loc 3 369 0
	xorl	%eax, %eax	# tmp62
	.loc 3 368 0
	movq	%rsp, %rbp	#,
	.cfi_def_cfa_register 6
	.loc 3 369 0
	xchgb	(%rdi), %al	#,* ptr, D.14153
	.loc 3 370 0
	popq	%rbp	#
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE71:
	.size	os_atomic_clear, .-os_atomic_clear
[3 Dec 2015 23:43] Sunny Bains
diff --git a/storage/innobase/include/os0sync.h b/storage/innobase/include/os0sync.h
index 0762840..8768d43 100644
--- a/storage/innobase/include/os0sync.h
+++ b/storage/innobase/include/os0sync.h
@@ -336,7 +336,7 @@ static inline
 void
 os_atomic_clear(volatile lock_word_t* ptr)
 {
-       __atomic_clear(ptr, __ATOMIC_RELEASE);
+       __atomic_clear(ptr, __ATOMIC_SEQ_CST);
 }
 
 # elif defined(IB_STRONG_MEMORY_MODEL)

The only thing that needs to be investigated is if SEQ_CST is too strong on non Intel HW e.g., ARM64.
[22 Mar 2018 20:57] MySQL Verification Team
https://bugs.mysql.com/bug.php?id=77795 marked as duplicate of this one.