Bug #116795 get_oldest_view costs too much and blocks other operations
Submitted: 27 Nov 2024 5:02 Modified: 27 Nov 2024 6:15
Reporter: Li Xiangjie (OCA)
Status: Closed
Category:MySQL Server Severity:S3 (Non-critical)
Version:8.0.32 OS:Any
Assigned to: CPU Architecture:Any
Tags: purge, read view

[27 Nov 2024 5:02] Li Xiangjie
Description:
   clone_oldest_view holds trx_sys->mutex while it performs the get_oldest_view operation, which traverses the view linked list to find the oldest view that has not been closed.

   Therefore, when the view linked list is long and contains many closed read views, this traversal becomes time-consuming.

   trx_sys->mutex is the mutex of the transaction system. Write operations in the database must acquire it, so a long traversal under this mutex can block other operations on the database.
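
   To illustrate the cost described above, here is a minimal toy model in Python. It is not InnoDB code: ToyView, ToyViewList and their methods are hypothetical names made up for this sketch, and a threading.Lock merely stands in for trx_sys->mutex. It only shows that, when closed views are left in the list, finding the oldest open view means walking past every closed entry while the lock is held.

```python
import threading
from collections import deque


class ToyView:
    """Hypothetical stand-in for a read view."""

    def __init__(self, view_id):
        self.view_id = view_id
        self.closed = False


class ToyViewList:
    """Toy model of a view list guarded by one global lock (not InnoDB code)."""

    def __init__(self):
        self.mutex = threading.Lock()  # plays the role of trx_sys->mutex
        self.views = deque()           # newest on the left, oldest on the right

    def open_view(self, view_id):
        view = ToyView(view_id)
        with self.mutex:
            self.views.appendleft(view)
        return view

    def get_oldest_view(self):
        """Return (oldest open view or None, number of entries visited)."""
        visited = 0
        with self.mutex:  # every other user of the lock waits during the scan
            for view in reversed(self.views):  # start from the oldest entry
                visited += 1
                if not view.closed:
                    return view, visited
        return None, visited


views = ToyViewList()
handles = [views.open_view(i) for i in range(36000)]

# Mark every view except the newest one as closed, but leave it in the list.
for handle in handles[:-1]:
    handle.closed = True

oldest, visited = views.get_oldest_view()
print(oldest.view_id, visited)  # 35999 36000
```

   In this model the scan visits all 36,000 entries before it reaches the only open view. With no closed entries at the old end of the list it would stop after the first one.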

   In our scenario, TPS on a replica with 36,000 read-only connections dropped from 20,000/s to 6,000/s, and the CPU cost of clone_oldest_view rose to 30% or more.

   Related functions: MVCC::clone_oldest_view, MVCC::view_close

How to repeat:
1. Create 36,000 connections to the database, each of which runs a simple SELECT. Here is an example in Python:

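#!/usr/bin/env python
# -*- coding: utf-8 -*-
import MySQLdb as mdb
import time

# Placeholder connection settings; replace them with real values.
config = {
    'host': 'db-host',
    'port': 3306,  # db-port: must be an integer (3306 is the MySQL default)
    'user': 'db-user',
    'passwd': 'db-password',
    'db': 'db-name',
}

# Open 36,000 connections. Each one runs a single autocommit SELECT
# and then stays idle.
connections = []
for i in range(36000):
    try:
        conn = mdb.connect(**config)
        conn.autocommit(True)
        connections.append(conn)
        cursor = conn.cursor()
        cursor.execute("select k from sbtest1 where id=1")
    except Exception as e:
        print(e)

# Keep the connections open while the sysbench load in step 2 runs.
time.sleep(600000)

for connection in connections:
    connection.close()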

2. Use sysbench to generate load, for example:

   sysbench oltp_update_index.lua --mysql-host=host --mysql-port=port --mysql-user=user --mysql-password=password --mysql-db=db --tables=100 --db-driver=mysql --table_size=1000 --report-interval=5 --threads=256 --time=120 run
   

3. Observe the performance of the database and its CPU cost.

   In our scenario, TPS on a replica with 36,000 read-only connections dropped from 20,000/s to 6,000/s, and the CPU cost of clone_oldest_view rose to 30% or more.
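
   For reference, the relative size of that regression can be worked out as below. This is only arithmetic on the two throughput figures reported here (20,000 and 6,000 transactions per second); throughput_drop is a hypothetical helper name, not part of any tool.

```python
def throughput_drop(before_tps, after_tps):
    """Fraction of throughput lost when going from before_tps to after_tps."""
    return (before_tps - after_tps) / float(before_tps)


drop = throughput_drop(20000, 6000)
print("{:.0%}".format(drop))  # 70%
```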

Suggested fix:
   Provide an optional switch to disable the linked list removal operation for read views, so that performance is preserved in scenarios like this one.
[27 Nov 2024 6:15] Li Xiangjie
Duplicate of another bug report.
[27 Nov 2024 11:07] MySQL Verification Team
Hi Mr. Li,

Thank you for your bug report.

Would you be so kind as to let us know the number of the original bug report?

Many thanks in advance.