| Bug #47213 | All InnoDB (builtin and plugin) is unstable at PowerPC SMP server | ||
|---|---|---|---|
| Submitted: | 9 Sep 11:46 | Modified: | 20 Oct 10:54 |
| Reporter: | Yasufumi Kinoshita | ||
| Status: | Open | ||
| Category: | Server: InnoDB | Severity: | S3 (Non-critical) |
| Version: | 5.0, 5.1.36 + plugin 1.0.3 | OS: | Linux (PowerPC SMP) |
| Assigned to: | | Target Version: | |
| Tags: | Contribution | ||
[9 Sep 11:46]
Yasufumi Kinoshita
[9 Sep 11:47]
Yasufumi Kinoshita
Current patch example for InnoDB Plugin 1.0.3
Attachment: innodb_1.0.3_sync_fix_for_ppc_smp.patch (text/x-patch), 8.33 KiB.
[9 Sep 17:32]
Vasil Dimov
Let's check whether __sync_synchronize() is available separately from other GCC builtin functions and define a macro like HAVE__SYNC_SYNCRONIZE if it is available instead of using HAVE_GCC_ATOMIC_BUILTINS. Thank you!
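A minimal sketch of how code could depend on a dedicated macro rather than on HAVE_GCC_ATOMIC_BUILTINS. The macro name `HAVE_SYNC_SYNCHRONIZE_SKETCH` and the helpers `full_barrier_available()` and `full_memory_barrier()` are hypothetical, and the `__GNUC__` test only stands in for a real configure-time probe (which would try to compile and link a call to `__sync_synchronize()`).

```c
/*
 * Assumption: in a real build this macro would be set by a configure-time
 * probe that compiles and links a call to __sync_synchronize(). The
 * __GNUC__ check below is only a stand-in for that probe.
 */
#if defined(__GNUC__) && !defined(HAVE_SYNC_SYNCHRONIZE_SKETCH)
# define HAVE_SYNC_SYNCHRONIZE_SKETCH 1
#endif

/* Returns 1 if a full memory barrier is available in this build, else 0. */
static inline int full_barrier_available(void)
{
#ifdef HAVE_SYNC_SYNCHRONIZE_SKETCH
    return 1;
#else
    return 0;
#endif
}

/* Issues a full memory barrier when available; otherwise does nothing,
   so callers that require ordering should check full_barrier_available(). */
static inline void full_memory_barrier(void)
{
#ifdef HAVE_SYNC_SYNCHRONIZE_SKETCH
    __sync_synchronize();
#endif
}
```

Keeping the barrier check separate means a platform that has `__sync_synchronize()` but lacks some other atomic builtin could still get the barrier.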
[9 Sep 19:52]
Sergei Golubchik
Where did you read that PowerPC doesn't maintain cache coherence automatically?
[10 Sep 2:12]
Yasufumi Kinoshita
https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/F7E732FF811F783187256FDD004D3797 "The Programming Environments Manual for 64-bit Microprocessors" 5.1.1.2 Synchronize Instruction
[10 Sep 8:39]
Sergei Golubchik
I think you read it incorrectly. Check this one, "970 User Manuals": https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/DC3D43B729FDAD2C00257419006FB955 For example, it says explicitly "The 970FX automatically maintains the coherency of all data cached <...>". All other PowerPC User Manuals say basically the same. Some go into detail, explaining bus snooping and the particular protocol. Synchronization instructions are different - they are for memory coherence between CPUs. And "sync" is just a glorified memory barrier (see p.532 in the "Programming Environment"), which is important for mutexes, for example (you want to be sure that all accesses to the mutex-protected memory are executed by the CPU *after* you locked the mutex and *before* you unlocked it). But it has nothing to do with caches. On Intel the situation is the same - caches are coherent. Accesses to the shared memory may need barriers. What's different: on Intel, in certain cases one needs to use the LOCK prefix for an instruction to be really atomic, and the LOCK prefix is an implicit full memory barrier. On PPC (probably on any non-Intel), I presume, one needs to add a memory barrier explicitly.
[10 Sep 12:33]
Yasufumi Kinoshita
OK. I should use another expression: "InnoDB depends on the memory ordering of Intel CPUs." gcc's __sync_synchronize() does nothing on x86 and x86_64, but emits "sync" on ppc64. I also think, from my experience, that Intel SMP doesn't need a barrier for reading memory that is always changed by an atomic operation from another processor. (Does an atomic operation affect the memory ordering of the other processor, by locking memory or something similar?) That is what I called "maintain consistency automatically". But PowerPC SMP seems not to work that way. The atomic operation there seems to "loop until success", which looks like a more passive operation than Intel's. So it may not affect the other processor's already-fetched values.
[10 Sep 13:46]
Sergei Golubchik
The fact that gcc's __sync_synchronize() does nothing on x86 and x86_64 is a bug. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36793 You don't need a barrier if all your atomic instructions use the LOCK prefix (because the LOCK prefix *is* an implicit full memory barrier). But memory reads and writes are often atomic without any LOCK prefix, and then you need to write memory barriers explicitly.
[10 Sep 21:00]
Yasufumi Kinoshita
Sergei, I know what the LOCK prefix is... I am writing about the interaction between processors. I am not writing about the memory barrier of the atomic operation itself. (A memory barrier does not affect the other CPU; it affects only the local one.) On an Intel CPU, when CPU-1 does atomic operations using the LOCK prefix, the target memory (and the cached value in all CPUs) is locked or arbitrated actively. (So it should affect, or be affected by, CPU-2's prefetch/cache with respect to memory ordering. Does CPU-2 respect CPU-1 automatically?) At that time, if CPU-2 has to read the memory consistently, experience suggests that a barrier before the read is not really needed. (Recently, InnoDB on x86 and x86_64 SMP does not seem to hang...) The problem here is whether an explicit barrier before the "CPU-2" read is needed or not. (In other words, does the atomic operation on CPU-1 cause an implicit local memory barrier effect on the CPU-2 read, or not?) In the ppc64 case, the atomic operation seems to be a retry loop with barriers. So I think it doesn't lock or arbitrate actively, and I suspect it doesn't affect the other CPU's prefetched value (the cache may be affected by some cache coherence mechanism). So CPU-2 needs an explicit barrier to read consistently. This is only my inference from each manual's description, but it fits our experience and can explain the difference in hang probability between the architectures. What do you think about this interaction? Or did you find an explicit description of it in the manuals?
[11 Sep 10:11]
Sergei Golubchik
A memory barrier affects only one CPU, but it is important for multi-CPU interaction. For
example, the first CPU does:
  mutex=LOCKED;
  var1++;
  var2--;
while another CPU does
  if (mutex != LOCKED) {
    mutex=LOCKED;
    local=var1+var2;
    ...
If there are no memory barriers between the mutex and shared variable operations on both
CPUs, the second CPU may end up accessing var1 and/or var2 without mutex protection and
seeing inconsistent data.
LOCK indeed locks the bus (that's why it's called "LOCK" :), but I doubt that it affects
local CPU caches in any way. Nor does it work as a memory barrier on other CPUs.
Whether you need an explicit memory barrier on Intel depends on what you use. InnoDB - as
far as I can see - uses only __sync_add_and_fetch() and __sync_bool_compare_and_swap() -
both need the LOCK prefix to work correctly on SMP, so every atomic operation in InnoDB is
accompanied by a full memory barrier.
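The mutex example above can be sketched in C with the GCC builtins mentioned in this comment. This is only an illustration under assumptions: the names `mutex_word`, `sketch_lock()`, `sketch_unlock()`, `writer_step()` and `reader_sum()` are hypothetical and are not taken from InnoDB or from the attached patches.

```c
/* 0 = unlocked, 1 = locked (hypothetical lock word, not InnoDB's). */
static volatile int mutex_word = 0;

/* Shared data protected by mutex_word; invariant: var1 + var2 == 0. */
static int var1 = 0;
static int var2 = 0;

static void sketch_lock(void)
{
    /* GCC documents __sync_bool_compare_and_swap() as a full barrier, so
       the protected accesses that follow are not moved before the lock. */
    while (!__sync_bool_compare_and_swap(&mutex_word, 0, 1)) {
        /* spin until the lock word becomes 0 */
    }
}

static void sketch_unlock(void)
{
    /* Order all protected accesses before the store that frees the lock. */
    __sync_synchronize();
    mutex_word = 0;
}

/* What the first CPU does in the example. */
static void writer_step(void)
{
    sketch_lock();
    var1++;
    var2--;
    sketch_unlock();
}

/* What the other CPU does: read both variables under the lock. */
static int reader_sum(void)
{
    int local;

    sketch_lock();
    local = var1 + var2;
    sketch_unlock();
    return local;
}
```

With the barriers in place, `reader_sum()` should always observe the invariant `var1 + var2 == 0`. Without them, on a weakly ordered CPU, the reads of `var1` and `var2` could be performed outside the locked region.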
[11 Sep 13:15]
Yasufumi Kinoshita
Sergei, have you actually tested this yourself? I am writing based on my attempts to improve InnoDB's lock implementation over these years...
<other discussion in intel.com>
http://software.intel.com/en-us/forums/threading-on-intel-parallel-architectures/topic/650...
http://software.intel.com/en-us/articles/single-producer-single-consumer-queue/
- "hardware fence is implicit on x86"
- __memory_barrier() is equal to "__asm__ __volatile__ ("" : : : "memory")" (not MFENCE, LFENCE or SFENCE; it affects only compiler optimization)
<in the developer's manual>
Intel 64 and IA-32 Architectures Software Developer's Manual Volume 3A: System Programming Guide
http://www.intel.com/Assets/PDF/manual/253668.pdf
8.2.2 Memory Ordering in P6 and More Recent Processor Families
In a single-processor system ...
- Reads may be reordered with older writes to different locations but not with older writes to the same location.
...
In a multiple-processor system ...
- Individual processors use the same ordering principles as in a single-processor system.
- Writes by a single processor are observed in the same order by all processors.
...
So a write from CPU1 is not reordered with a read of the same location by CPU2 in CPU2's memory ordering. In this case, a memory barrier for reading (lfence or mfence) is not needed on x86 and x86_64.
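The distinction drawn here between a compiler-only barrier and a hardware barrier can be sketched as follows. The names `compiler_barrier()`, `hardware_barrier()`, `read_barrier()`, `publish()` and `try_consume()` are hypothetical, and the choice to use only a compiler barrier on the x86 read side is an assumption based on the ordering rules quoted from the Intel manual, not something taken from the patches.

```c
/* Compiler-only barrier: emits no instruction, only stops the compiler
   from reordering memory accesses across it. */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Full hardware barrier via the GCC builtin. */
#define hardware_barrier() __sync_synchronize()

/* Assumption: x86/x86_64 do not reorder reads with other reads, so the
   read side only needs to restrain the compiler there; other CPUs
   (for example ppc64) get a real barrier instruction. */
#if defined(__i386__) || defined(__x86_64__)
# define read_barrier() compiler_barrier()
#else
# define read_barrier() hardware_barrier()
#endif

static int payload = 0;          /* data being handed over */
static volatile int ready = 0;   /* flag: 1 once payload is valid */

/* Producer: write the data first, then raise the flag. */
static void publish(int value)
{
    payload = value;
    hardware_barrier();          /* payload store ordered before flag store */
    ready = 1;
}

/* Consumer: returns 1 and fills *out if data was published, else 0. */
static int try_consume(int *out)
{
    if (!ready) {
        return 0;
    }
    read_barrier();              /* flag read ordered before payload read */
    *out = payload;
    return 1;
}
```

On ppc64 the read side expands to a real barrier, which corresponds to the extra barrier before the "CPU-2" read discussed earlier in this thread.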
[14 Sep 17:34]
Sergei Golubchik
No, but I wrote above that InnoDB on x86 did not need explicit __sync_synchronize() or any other memory barriers. I hope that matches results of your experiments with InnoDB :)
[19 Oct 9:13]
Christopher Yeoh
Hi Yasufumi, Just wondering if you have an updated patch to what is attached to this bug report (since you mentioned it wasn't complete).
[20 Oct 10:50]
Yasufumi Kinoshita
The new, more stable patch for InnoDB Plugin 1.0.3
Attachment: innodb_1.0.3_sync_fix_for_ppc_smp_2.patch (text/x-diff), 9.13 KiB.
[20 Oct 10:51]
Yasufumi Kinoshita
The new, more stable patch for InnoDB Plugin 1.0.4
Attachment: innodb_1.0.4_sync_fix_for_ppc_smp_2.patch (text/x-diff), 8.65 KiB.
[20 Oct 10:54]
Yasufumi Kinoshita
Christopher, the new patch has been added. In the end, we have to use asm() for the "isync" instruction.
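A rough sketch of what using asm() for "isync" might look like. This is not taken from the attached patches: the names `acquire_barrier_sketch()`, `try_lock_sketch()`, `unlock_sketch()` and `demo_lock_word` are hypothetical, and placing "isync" right after a successful compare-and-swap is an assumption based on the common PowerPC lock-acquire idiom.

```c
#if defined(__powerpc__) || defined(__powerpc64__) || defined(__ppc__)
/* On PowerPC, issue "isync" so that later instructions are not executed
   speculatively ahead of the completed lock acquisition. */
# define acquire_barrier_sketch() __asm__ __volatile__("isync" : : : "memory")
#else
/* Elsewhere (assumption): only restrain the compiler. */
# define acquire_barrier_sketch() __asm__ __volatile__("" : : : "memory")
#endif

/* Hypothetical lock word used for illustration: 0 = free, 1 = held. */
static volatile int demo_lock_word = 0;

/* Try once to take the lock; returns 1 on success, 0 if already held. */
static int try_lock_sketch(volatile int *lock_word)
{
    if (__sync_bool_compare_and_swap(lock_word, 0, 1)) {
        acquire_barrier_sketch();
        return 1;
    }
    return 0;
}

/* Release the lock after ordering all protected accesses before it. */
static void unlock_sketch(volatile int *lock_word)
{
    __sync_synchronize();
    *lock_word = 0;
}
```

The architecture guard keeps the sketch compilable on non-PowerPC machines, where the "isync" mnemonic does not exist.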
