Bug #120414 InnoDB performance regression because of validate_rec_offset
Submitted: 7 May 15:19 Modified: 11 May 12:42
Reporter: Pawel Olchawa Email Updates:
Status: Open Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:8.0.32 - 9.7 OS:Any
Assigned to: CPU Architecture:Any

[7 May 15:19] Pawel Olchawa
Description:
InnoDB used to have "validate" functions that are executed in debug builds to perform various checks.

However, in mysql-8.0.32, a validation mechanism was added, which is also executed in the release build: validate_rec_offsets().

It's been added to a very low-level and quite frequently used function (on accesses to record fields).

This is an extra call to a function defined in rem0wrec.cc, and it will not be inlined in rec_offs_nth_extern() unless we built with LTO enabled.

Apart from that, these are extra instructions and branching that we don't need in the release (as we also skip other similar validation mechanisms).

The performance regression is visible in the workload, which executes a select query joining many rows.

The difference is also visible in the perf report - in the subgraph rooted at row_search_mvcc.

Side note
=========

The same tests reveal that there are also other significant performance regressions for this workload:
- between 8.0.46 and 8.4.9
- between 8.4.9 and 9.7.0

My quick guess would be that changes related to INSTANT DDL might be related to that (row versions).

How to repeat:
Test description
================

0. Use a bare metal server and ensure that you have configured it properly for stable testing results (stddev < 1%).

1. Use the default MySQL configuration and load test_db from: https://github.com/datacharmer/test_db
(note: you may use an older client if needed)

2. Run the MySQL client for the employees database and paste these lines:

analyze table employees;
analyze table salaries;
set profiling = 1;
set profiling_history_size = 100;
pager cut -f1 > /dev/null;

3. Then paste all these selects from a text file, which has 110+ identical lines:

SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000;
SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000;
SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000;
[...]
SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000;

4. Then paste:

set profiling = 0;
nopager;
select avg(duration), std(duration) from information_schema.profiling where state='executing';

(other than 'executing' states have tiny values - don't matter, but feel free to include them)

Test results
============

Notes:
- 150 queries have been executed one by one for each test run (using a single connection)
- stats are from the last 100 queries (set profiling_history_size = 100)
- built without LTO, without PGO - using g++-13.3.0 @ubuntu 24.04, CMAKE_BUILD_TYPE=Release
- default MySQL server configuration

+-----------------+-------------------------+-------------------------+--------------------+
| MySQL Version   | Original                | After fix               | Gain  (+- 2σ %)    |
+-----------------+-------------------------+-------------------------+--------------------+
| 8.0.31          | 0.633s (std: 0.0027)    | nothing to fix          | NA                 |
| 8.0.46          | 0.665s (std: 0.0019)    | 0.636s (std: 0.0024)    | -4.4% (+- 0.8%)    |
| 8.4.9           | 0.719s (std: 0.0016)    | 0.697s (std: 0.0027)    | -3.0% (+- 0.8%)    |
| 9.7.0           | 0.743s (std: 0.0007)    | 0.709s (std: 0.0019)    | -4.6% (+- 0.6%)    |
+-----------------+-------------------------+-------------------------+--------------------+

As you can see, there are also other regressions:
- between 8.0.46 and 8.4.9    ~  +8.1%  (after fix:  +9.6%)
- between 8.4.9 and 9.7.0     ~  +3.3%  (after fix:  +1.7%)
- between 8.0.46 and 9.7.0    ~ +11.7%  (after fix: +11.5%)

Suggested fix:
After applying this fix, both results are better, and the perf report changes accordingly:

diff --git a/storage/innobase/include/rem0wrec.h b/storage/innobase/include/rem0wrec.h
index 5cb3dabb5a1..1d56fa62db6 100644
--- a/storage/innobase/include/rem0wrec.h
+++ b/storage/innobase/include/rem0wrec.h
@@ -134,6 +134,7 @@ data field in an old-style record.
 ulint rec_get_nth_field_offs_old(const dict_index_t *index, const rec_t *rec,
                                  ulint n, ulint *len);
 
+#ifdef UNIV_DEBUG
 /** Validates offset and field number.
 @param[in]      index   record descriptor
 @param[in]      offsets array returned by rec_get_offsets()
@@ -141,6 +142,7 @@ ulint rec_get_nth_field_offs_old(const dict_index_t *index, const rec_t *rec,
 @param[in]      L       Line number of calling satement*/
 void validate_rec_offset(const dict_index_t *index, const ulint *offsets,
                          ulint n, ut::Location L);
+#endif /* UNIV_DEBUG */
 
 /** Returns nonzero if the extern bit is set in nth field of rec.
 @param[in]      index           record descriptor
@@ -153,7 +155,7 @@ void validate_rec_offset(const dict_index_t *index, const ulint *offsets,
     n = index->get_field_off_pos(n);
   }
 
-  validate_rec_offset(index, offsets, n, UT_LOCATION_HERE);
+  ut_d(validate_rec_offset(index, offsets, n, UT_LOCATION_HERE));
   /* Returns nonzero if the extern bit is set in nth field of rec. */
   return rec_offs_base(offsets)[1 + n] & REC_OFFS_EXTERNAL;
 }
diff --git a/storage/innobase/rem/rem0wrec.cc b/storage/innobase/rem/rem0wrec.cc
index e627e2ab27c..983fb51b9bb 100644
--- a/storage/innobase/rem/rem0wrec.cc
+++ b/storage/innobase/rem/rem0wrec.cc
@@ -33,6 +33,7 @@ this program; if not, write to the Free Software Foundation, Inc.,
 #include "rem0wrec.h"
 #include "rem0lrec.h"
 
+#ifdef UNIV_DEBUG
 #ifndef UNIV_NO_ERR_MSGS
 /** Dumps metadata of table.
 @param[in]      table   InnoDB table object*/
@@ -74,6 +75,7 @@ void validate_rec_offset(const dict_index_t *index, const ulint *offsets,
 #endif /* !UNIV_NO_ERR_MSGS */
   }
 }
+#endif /* UNIV_DEBUG */
 
 const byte *rec_get_nth_field_old(const dict_index_t *index, const rec_t *rec,
                                   ulint n, ulint *len) {
@@ -129,7 +131,7 @@ ulint rec_offs_nth_sql_null(const dict_index_t *index, const ulint *offsets,
     n = index->get_field_off_pos(n);
   }
 
-  validate_rec_offset(index, offsets, n, UT_LOCATION_HERE);
+  ut_d(validate_rec_offset(index, offsets, n, UT_LOCATION_HERE));
   return (rec_offs_nth_sql_null_low(offsets, n));
 }
 
@@ -139,7 +141,7 @@ ulint rec_offs_nth_default(const dict_index_t *index, const ulint *offsets,
     n = index->get_field_off_pos(n);
   }
 
-  validate_rec_offset(index, offsets, n, UT_LOCATION_HERE);
+  ut_d(validate_rec_offset(index, offsets, n, UT_LOCATION_HERE));
   return (rec_offs_nth_default_low(offsets, n));
 }
 
@@ -149,7 +151,7 @@ ulint rec_offs_nth_size(const dict_index_t *index, const ulint *offsets,
     n = index->get_field_off_pos(n);
   }
 
-  validate_rec_offset(index, offsets, n, UT_LOCATION_HERE);
+  ut_d(validate_rec_offset(index, offsets, n, UT_LOCATION_HERE));
   return (rec_offs_nth_size_low(offsets, n));
 }
[11 May 12:42] Pawel Olchawa
I've checked for MySQL 9.7 built with LTO enabled.

+----------------------+---------------+-----------------------+
|MySQL                 |  Avg time     | Std dev               |
+----------------------+---------------+-----------------------+
| Original 9.7 LTO=ON  |  0.6670073100 | 0.0010818610695926185 |
| Patched  9.7 LTO=ON  |  0.6425033000 | 0.0024609245274896150 |
+----------------------+---------------+-----------------------+

Patch improvement for 9.7 when LTO is enabled: -3.7% (+- 0.8%)