Bug #103587 INNODB_ROWS_UPDATED / n_rows_updated can become imprecise under heavy load
Submitted: 5 May 2021 9:43
Modified: 18 Mar 19:20
Reporter: Laurents Meyer (OCA)
Status: Closed
Impact on me: None
Category: MySQL Server: Documentation
Severity: S3 (Non-critical)
Version: any
OS: Any
Assigned to:
CPU Architecture: Any

[5 May 2021 9:43] Laurents Meyer
Description:
The INNODB_ROWS_UPDATED status variable is used by the TechEmpower Framework Benchmarks for MySQL to verify the expected UPDATE count after a benchmark run.

The framework tests different scenarios from little to high loads (high concurrency).

When running 10240 updates, divided into 512 concurrent calls, MySQL regularly reports only between 10237 and 10239 updates, even though all 10240 updates were successfully executed.

It seems that the srv_stats.n_rows_updated.inc() calls increment the stats in a concurrency-unsafe way:

  /** If you can't use a good index id. Increment by 1. */
  void inc() UNIV_NOTHROW { add(1); }

  /** If you can't use a good index id.
  @param n is the amount to increment */
  void add(Type n) UNIV_NOTHROW {
    size_t i = m_policy.offset(m_policy.get_rnd_index());

    ut_ad(i < UT_ARR_SIZE(m_counter));

    m_counter[i] += n;
  }

The current workaround is to heuristically ignore the difference if it is within an expected margin, which defeats the purpose of verifying the accuracy/health of the benchmark before running it.

How to repeat:
Run 10240 updates, divided into 512 (truly) concurrent connections, against any MySQL InnoDB table and check the expected INNODB_ROWS_UPDATED (and INNODB_ROWS_READ) count against the actual one. They should differ in about 4 out of 5 runs.

Suggested fix:
I would propose using interlocked (atomic) increment operations for inc() and interlocked add operations for add(), to get maximum performance and counting accuracy at the same time.
[5 May 2021 9:44] Laurents Meyer
The initial sentence should have been:

The INNODB_ROWS_UPDATED status variable [...]
[7 May 2021 22:30] Sunny Bains
Atomics are expensive. n_rows_updated is not meant to be 100% accurate.

SELECT ROW_COUNT(); is the way to get the total number of rows updated. See https://dev.mysql.com/doc/refman/8.0/en/information-functions.html#function_row-count
[8 May 2021 1:25] Laurents Meyer
Atomic operations are implemented to be as performant as possible and are the most inexpensive way to atomically change the value of a variable.

The x86 standard implements CPU instructions for these operations (see https://en.wikipedia.org/wiki/Compare-and-swap#Implementations for a quick overview).

Since UPDATE operations on InnoDB are **very** expensive (even in comparison to other database systems like PostgreSQL, which are very expensive themselves), it does not matter in practice whether the operation takes 3 nanoseconds for an interlocked increment instead of 1 nanosecond for a non-interlocked increment. (In addition, the current implementation in MySQL isn't even optimized and is already more expensive than it needs to be.)

Therefore, this bug should still be fixed, because it has no practical performance downsides and only operational upsides.

---

Using ROW_COUNT() is not applicable in the TechEmpower Framework Benchmark scenarios, because issuing additional statements would falsify the benchmark itself. Instead, the status/instrumentation variables are read before and after the benchmark to ensure that the benchmark performed correctly.

Therefore, an accurate status variable count is necessary here, and since the current counts are inaccurate, this bug needs to be fixed.

---

So to summarize: the risk is low, the gain is high, the performance impact is negligible, and the implementation is very simple.
[10 May 2021 9:26] MySQL Verification Team
I agree with Sunny that n_rows_updated is not expected to be 100% accurate, but our documentation does not reflect this, so I'm verifying the bug so that our documentation team can add this information to:

https://dev.mysql.com/doc/refman/8.0/en/server-status-variables.html#statvar_Innodb_rows_u...

so that it is clear that this is not expected to always be 100% accurate and that ROW_COUNT() ( https://dev.mysql.com/doc/refman/8.0/en/information-functions.html#function_row-count ) should be used for 100% accurate data.

Thanks for the report
Bogdan
[10 May 2021 9:37] Laurents Meyer
What is the reason for not fixing it?

Implementing an interlocked increment would be faster than the current add() implementation and would result in a reliable counter.

It does not break existing applications.

So I don't see *any* reason why it could be preferable to keep the current (faulty) behavior.

(It also makes the counter pretty useless, if you cannot rely on its value.)
[10 May 2021 19:05] Sunny Bains
I'm aware of the Intel documentation, but you underestimate the impact of cache synchronization and other bus-related overhead of atomics. It's not as if we don't know what atomics do (or how to use them). Multiple threads trying to increment a variable atomically without proper barriers in place (which in this case would be std::memory_order_relaxed) have a significant overhead when you are dealing with more threads than CPUs. We use atomics where we have no other choice and/or where it makes sense compared to more heavy-duty latching.
[10 May 2021 19:08] Sunny Bains
"increment a variable atomically without proper barriers " should have been "increment a variable atomically with proper barriers "
[10 May 2021 19:11] Sunny Bains
For the record - before this change in 5.6, the overhead of counting n_rows_read/updated etc. was costing us 5x in performance. We didn't make this change simply to make users' lives hard.

We can revisit the issue to check if there is an easy (and performant) way to make these rarely used stats more accurate, but blindly advocating for atomics is not the way to go.
[21 May 2021 7:22] Laurents Meyer
I can confirm that the cost of an atomic increment is about 5 times that of a simple increment instruction.

Here is a locally run benchmark (using a proven benchmark library) that measures the time of 100 simple increments against 100 interlocked increments:

{code}
|               Method |     Mean |   Error |  StdDev |
|--------------------- |---------:|--------:|--------:|
|      SimpleIncrement | 116.7 ns | 0.12 ns | 0.11 ns |
| InterlockedIncrement | 525.9 ns | 0.68 ns | 0.57 ns |
{code}

So a simple increment instruction takes about 1.167 ns on my local machine (single CPU, 8 cores), while an interlocked increment takes about 5 times that amount with 5.259 ns.

Assuming around 1,000 UPDATE statements within one second (which already requires beefy hardware for MySQL, according to the TechEmpower Framework Benchmark), the 5.259 ns would take up about 0.0005259 % of the time spent for a request.

In total, it would add 5 additional microseconds for 1,000 UPDATE statements per second (1.000005 seconds instead of 1 second).
[21 May 2021 7:32] Laurents Meyer
Personally, I don't think 5 ns instead of 1 ns matters for UPDATE statements, since it is clearly disk performance that is the bottleneck.

Also, at the very least, my locally run benchmark in no way proves that the 4 additional nanoseconds even matter in the end, because in a high-concurrency situation, operations are likely to wait longer than that for the file system anyway (assuming a storage engine with ACID compliance).

If you find that they do actually accumulate in reality as in the benchmark, and decide that 5 us is too much, then I propose an option to opt in to concurrency-safe stats (or make stats concurrency-safe in general and make the option about enabling stats in general).
[18 Mar 19:20] Philip Olson
Posted by developer:
 
Thank you Laurents for the detailed report; it's been documented as requested by the developers.