Bug #106411 Regression on DDL statements with big stage/sql/checking permissions [Reopend]
Submitted: 9 Feb 2022 3:02 Modified: 13 Nov 2023 8:26
Reporter: Matias Sánchez Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server: DDL Severity:S5 (Performance)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: performance degradation

[9 Feb 2022 3:02] Matias Sánchez
Description:
Hi team!
I apologize in advance as this new bug report actually continues the comments on https://bugs.mysql.com/bug.php?id=106399 . That case had been closed as duplicate and it's not able to reopen nor comment.

Referencing the notes from verification team on https://bugs.mysql.com/bug.php?id=106399, we've been reviewing all the related bugs and we're not completely sure it's a duplicate bug report, as all links reference tasks regression times where the DML inserts instead is the main issue; but on that part  we're ok as it's a fact that multi-threaded DMLs performs actually better on 8.0.

We were actually talking specifically on the DDL portion isolated, as for example, creating 1K tables, truncating them or dropping sequentially, on 8.0 there is almost >80% of time that goes to stage/sql/checking permissions, when all this time is the main one that mark the big difference against 5.7. Is this stage showing some lack of tuning from our side or user/grants/roles schema after 8.0 expects this type of extra slowness? Maybe some disabling feature param to test against this extra stage?

After the points mentioned on the related bug report, we tried to adapt our processes to run multi-threaded DDL; but on the issolated DDL scenarios this approach actually got yet more regression times as this leads to dict locks, even when actually taking care of multithread different objects all the time.
On this multithread scenario the DDL holds stage:  "stage/sql/After create" and the waits are high on "wait/synch/sxlock/innodb/dict_operation_lock"  and "wait/synch/mutex/innodb/dict_sys_mutex". This extra waits actually lead with a global worse scenario if ddl gets multi-threaded.

We've been trying tuning this multi-threaded ddl by increasing innodb_sync_array_size and table_open_cache/_instances but couldn't be able to reduce this mutex hit. 
Any extra tip on this scenario as well to save this times?

We appreciate any additional comments on this use case.

Best regards! 

Matias

How to repeat:
In case needed , here is an update on the reproducible script sent before on related bug , but on DDL testcase multithreaded:

1. Prepare and empty schema

mysql -uroot -e "DROP DATABASE IF EXISTS ttest; CREATE DATABASE ttest"

 mysql -uroot ttest -e "
  DROP TABLE IF EXISTS fkParent; 
  CREATE TABLE fkParent (
      id int(10) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
      textField varchar(10)
  );"

for n in $(seq 3000); do
    [ $(($n % 100)) -eq 0 ] && echo $n
    mysql -uroot ttest -e " 
    CREATE TABLE fkChild$n (
      id int(10) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY, 
      fkField$n int(10) unsigned,
      CONSTRAINT FK_$n FOREIGN KEY (fkField$n) REFERENCES fkParent(id)
    );"
done

2. Truncate tables

### parallel truncate 
mysql -A -BN information_schema -e"
select concat('TRUNCATE ttest.', table_name,';')
from tables
where table_schema = 'ttest';" > truncate_test.sql
### split on 8 similar parts
split -n l/8 truncate_test.sql split_
sed -i '1s/^/SET FOREIGN_KEY_CHECKS=0;\n/' split_*
sed -i '1s/^/SET UNIQUE_CHECKS=0;\n/' split_*

for file in split_*
do
    {
        START_TIME=$SECONDS 
        mysql -uroot ttest < ${file}
        ELAPSED_TIME=$((($SECONDS-$START_TIME)))
        echo "Finished adhoc run.. elapsed time: ${ELAPSED_TIME} seconds" > output.${file}
    } &
done

### after all jobs get done
sort output.split_a* | tail -n1
[9 Feb 2022 13:41] MySQL Verification Team
Hi Mr. Sanchez,

Thank you for your bug report.

We have run your test case and we have also noted that 8.0 is slower in multi-threaded DDL then 5.7.

Hence, we have decided to verify this bug as a performance issue.
[9 Feb 2022 17:34] Matias Sánchez
Thanks a lot team for verifying testcase.

Just adding some extra notes, i've been testing also changes on params like:
innodb_file_per_table
innodb_flush_log_at_trx_commit
innodb_buffer_pool_instances
table_definition_cache
table_open_cache_instances
innodb_log_checksums
innodb_flush_method
innodb_sync_array_size

But none have had any positive impact so far as multithreaded ddl nor sequential ddl global times.
Thanks a lot in advance for any news.

Best regards
[10 Feb 2022 13:32] Dmitry Lenev
Posted by developer:
 
Hello!

I would like to point out that looking for time spent on different stages as shown in P_S in TRUNCATE TABLE can be quite confusing. The problem is that different stages of TRUNCATE TABLE execution are not always mapped to/instrumented as stages which are shown in P_S/SHOW PROCESSLIST. As result you can get wrong impression that TRUNCATE TABLE spends significant time "Opening tables" (in 5.7) or "Checking permissions" while in reality it executes truncate operation in SE or does some other activity, which happens afterwards of opening tables/checking permissions but doesn't have stage instrumentation.
Also note that in 8.0 we have changed approach to how we do TRUNCATE TABLE for InnoDB tables. In 5.7 we opened table and then called InnoDB SE truncate API to do this (the latter dropped and then re-created all table indexes and thus the whole table internally). That is probably why in 5.7 you see significant time spent on "Opening tables" stage (we do not have stage for calling truncate in SE). And in 8.0 we skip opening of table and simply call InnoDB SE API calls to drop and re-create table atomically instead (and since this still don't have separate stage for these calls, this activity reflected as totally confusing "Checking permissions").

This doesn't explain why exactly there is such a big difference between 5.7 and 8.0 of course, but at least explains why the main amount of time is shown as spent on different stages.
[11 Feb 2022 1:26] Matias Sánchez
Thanks a lot team for the feedback! 

We've been actually feeling like this lack of specific instrumented stages for this use case was a fact as sequential test shows "checking permissions" status but on a multithread scenario this stage is kinda low but replaced at mutexes locks, which indicates that a check permission feature activity is not the main time itself, so it gets confusing.

Thanks a lot for your deep research and all the help!

Best regards
[11 Feb 2022 13:21] MySQL Verification Team
You are truly welcome !!!!
[13 Nov 2023 8:26] Tarang Ranpara
Posted by developer:
 
We were able to reproduce the test-case mentioned in the bug page, and here are our observations:

MySQL 8.0's mysql_create_table() function is several times slower than in MySQL 5.7 due to several new operations, including:
1. Opening and querying the data dictionary tables to check for the existence of the table being created. In MySQL 5.7, a simple filesystem check is performed instead.
2. Opening and writing to the data dictionary tables to store metadata for the new table.
3. Checking foreign key constraints in the SQL layer, starting in MySQL 8.0.
4. Creating and deleting DDL logs in InnoDB to support transactional DDL.
5. Storing metadata in the data dictionary as multiple rows in multiple tables, rather than in a single physical file (.frm) as in MySQL 5.7.

The performance regression is primarily due to the new way that metadata is stored in MySQL 8.0, but the advantages of the new approach outweigh the performance penalty. It would not be possible to match the MySQL 8.0 performance to 5.7.

References:
-----------
Advantages of DD: https://dev.mysql.com/doc/refman/8.0/en/data-dictionary.html#:~:text=Benefits%20of%20the%2....
[13 Nov 2023 11:56] MySQL Verification Team
Thank you Mr. Ranpara.

This is indeed an expected regression, because of the benefits that data dictionary provides.