Bug #110643 Sporadic RCU unit test crashes on Arm64 due to insufficient memory barrier
Submitted: 10 Apr 2023 5:45 Modified: 10 Apr 2023 7:28
Reporter: Cai Yibo (OCA)
Status: Verified
Category: MySQL Server: Tests Severity: S2 (Serious)
Version: 8.0 OS: Any
Assigned to: CPU Architecture: ARM (aarch64)
Tags: Contribution

[10 Apr 2023 5:45] Cai Yibo
Description:
Unit test "my_rcu_lock_test.multi_threaded_run" sporadically segfaults
on an 80-core Arm64 server. The cause is a memory ordering bug in the
RCU implementation.

Current code can be simplified as below:

writer thread
-------------
oldT = rcu_global_.exchange(newT, std::memory_order_release);  // set new RCU
while (rcu_readers_.load(std::memory_order_relaxed) > 0)       // wait for all readers
  ;
delete oldT;                                                   // delete old RCU

reader thread
-------------
rcu_readers_.fetch_add(1, std::memory_order_relaxed);   // get lock
visit(rcu_global_);                                     // might visit deleted RCU !!!!!!
rcu_readers_.fetch_sub(1, std::memory_order_relaxed);   // release lock

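The problem is that std::memory_order_relaxed gives no ordering guarantee
between the two atomics on either side. In the reader, the load of
rcu_global_ may be reordered before the increment of rcu_readers_. In the
writer, the load of rcu_readers_ may be reordered before the exchange,
because release ordering only constrains earlier operations. Either way,
the writer can observe zero readers while a reader still holds the old
pointer, so the reader may visit a deleted RCU and crash.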

We should use the default sequentially consistent memory order
(std::memory_order_seq_cst) for these operations.
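
As a rough illustration of the proposed ordering, here is a minimal standalone sketch of the simplified code above with every atomic operation left at its default (sequentially consistent) order. It is not the attached patch. Payload, read_value, publish and stress are hypothetical names made up for this example; only rcu_global_ and rcu_readers_ follow the simplified code.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical payload type standing in for the RCU-protected object.
struct Payload {
  int value;
};

// Names follow the simplified code in this report.
std::atomic<Payload *> rcu_global_{nullptr};
std::atomic<long> rcu_readers_{0};

// Reader: every atomic operation uses the default memory_order_seq_cst, so
// the pointer load cannot be reordered before the reader count increment.
int read_value() {
  rcu_readers_.fetch_add(1);           // get lock
  int v = rcu_global_.load()->value;   // visit current RCU
  rcu_readers_.fetch_sub(1);           // release lock
  return v;
}

// Writer: the seq_cst exchange is ordered before the seq_cst loads of the
// reader count, so seeing zero means no reader still holds the old pointer.
void publish(int value) {
  Payload *fresh = new Payload{value};
  Payload *old = rcu_global_.exchange(fresh);                 // set new RCU
  while (rcu_readers_.load() > 0) std::this_thread::yield();  // wait
  delete old;                                                 // delete old RCU
}

// Hypothetical stress helper: readers spin while one writer publishes
// n_updates values. Returns the number of reads outside [0, n_updates].
long stress(int n_readers, int n_updates) {
  rcu_global_.store(new Payload{0});
  std::atomic<long> bad{0};
  std::atomic<bool> stop{false};
  std::vector<std::thread> readers;
  for (int i = 0; i < n_readers; ++i) {
    readers.emplace_back([&] {
      while (!stop.load()) {
        int v = read_value();
        if (v < 0 || v > n_updates) bad.fetch_add(1);
        std::this_thread::yield();  // give the writer a chance to see zero
      }
    });
  }
  for (int i = 1; i <= n_updates; ++i) publish(i);
  stop.store(true);
  for (auto &t : readers) t.join();
  assert(rcu_readers_.load() == 0);      // all readers released the lock
  delete rcu_global_.exchange(nullptr);  // free the last published payload
  return bad.load();
}
```

Calling stress(4, 1000) from a test body is one way to exercise this. A clean run does not prove the ordering is correct, but a build with ThreadSanitizer or AddressSanitizer should flag the use-after-free if the orders are weakened back to relaxed.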

How to repeat:
Run "my_rcu_lock_test.multi_threaded_run" repeatedly on a multi-core Arm64 machine.
[10 Apr 2023 5:46] Cai Yibo
patch

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 0001-Fix-RCU-memory-order-bug-on-Arm.patch (text/x-patch), 2.66 KiB.

[10 Apr 2023 7:28] MySQL Verification Team
Hello Cai Yibo,

Thank you for the report and contribution.

regards,
Umesh