Bug #106390 CREATE TABLE performance in MySQL 8 is considerably slower than 5.7
Submitted: 7 Feb 10:08 Modified: 8 Feb 23:49
Reporter: Jacob Thomason Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server Severity:S5 (Performance)
Version:8.0.28 OS:Debian
Assigned to: CPU Architecture:Any

[7 Feb 10:08] Jacob Thomason
Description:
We recently upgraded our database to version 8, excited to make use of CHECK constraints for some improved data integrity.  Everything went pretty smooth overall until we got to running our tests.

We saw about a 6x hit on our integration tests on version 8.  Going from about 4m on 5.7 to ~20m on MySQL 8.  I ended up spending the 3 days exploring a number of options to squeeze more performance out of MySQL 8.  Unfortunately, I was unable to do so.

For every single integration test, we import an sql dump file, basically resetting our test db to a baseline, with our default fixtures.  This works quite well and it's simple.  This results in the time difference between 5.7 and version 8.

I spent some time refactoring our logic for this to utilize the new `util.dumpSchemas` and `util.loadDump` functions through the mysql shell.  These are pretty cool and while it did seem like the performance was an improvement, there were a number of issues.

Firstly, you don't have any way to use these tools to replace a database in place.  You must `DROP` a database and then load an entire schema.  There isn't any other way to get what's needed.  I suspect this results in some additional cycles with the DROP.

Secondly, when running this process in parallel against 4 databases on the server, queries return empty values, resulting in null variable assignments and failing tests.  I was unable, despite my efforts, to find a setting causing this issue.  I tested it with a default config on version 8.  I also tested it with version 5.7.  Both versions resulted in this issue.

We've had to revert back to 5.7 for our testing.

How to repeat:
This can be replicated by running an import of a database dump in comparison between the two versions.
[7 Feb 14:09] MySQL Verification Team
Hi Mr. Thomason,

Thank you for your bug report.

However, general performance drop between 5.7 and 8.0 is very well known and documented. Your report is a duplicate of many of already verified reports, such as:

https://bugs.mysql.com/bug.php?id=90209

https://bugs.mysql.com/bug.php?id=85255 (you can't access this one)

https://bugs.mysql.com/bug.php?id=95684 (this is also a private bug)

https://bugs.mysql.com/bug.php?id=92979

and many , many others.

We do wonder, however, why have you tried that path in your degradation. You should have started from 5.7.37 database. Next, you should have made a backup of all your data. Then , you should have simply upgraded to 8.0 over same data, by following detailed description in our 8.0 Reference Manual, which you can find on http://dev.mysql.com.

That upgrade is very fast.

Duplicate ........
[7 Feb 21:25] Jacob Thomason
Thanks for the reply.  The performance issue is known - yes.  However, in the situation I've presented, it's not satisfactory and makes MySQL 8 virtually impossible to use for integration testing.

What options are there to use MySQL 8 in an integration testing environment without massive performance implications?
[7 Feb 21:37] Frederic Descamps
Hi Jacob, 
We already discussed in Slack about your problem.
As I told you maybe the test even if it was decent in 5.7 might be not following the best practices for what you wanted to check.

Could you explain what you are trying to achieve/demonstrate and what is your scenario ?

Currently there are way too much assumptions and guesses in our side.
[7 Feb 21:56] Jacob Thomason
In an integration testing environment, or end-to-end testing, our application tests the database, in addition to other services.  However, for the sake of concerns here, it's the database that's the issue.  The majority of time spent on tests is with the database, always has been and always will be - naturally.

Our tests aren't too complicated, but the most simplistic process is as follows:

- Firstly, we get a schema dump with fixture data.  This is done by creating a database with tables, loading fixture data and then using mysqldump to output a dump SQL file.
- For each test, this dump file will be inported into a specific database as part of the integration test's boot process.
- After the dump has been imported the test runs

That's really it, it's quite simple.  It wasn't as simple to get everything setup smoothly and automated, but the overall design is simple.

The SQL file we use contains about 200 tables with statements like the following:

```sql
DROP TABLE IF EXISTS `foo`;
CREATE TABLE `foo` (
    `col1` INT(10) unsigned NOT NULL AUTO_INCREMENT,
    `col2` ...
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

LOCK TABLES `foo` WRITE;
INSERT INTO `foo` VALUES (1, 'bar', ...);
UNLOCK TABLES;
```

All of this comes from the `mysqldump` script.
[7 Feb 22:45] Jacob Thomason
To illustrate the difference in performance between 5.7 and 8, please see the following 2 schema imports.  These imports are done through a sibling Docker container on the same Docker network:

MySQL 5.7.37

# time mysql -P 3306 -h mysql-test -u root api-test-1 < /srv/test/data/db-scaffold.sql

real	0m0.824s
user	0m0.039s
sys	0m0.158s

--

MySQL 8.0.28

# time mysql -P 3306 -h mysql-test -u root api-test-1 < /srv/test/data/db-scaffold.sql

real	0m3.008s
user	0m0.030s
sys	0m0.103s

MySQL 8.0.28 with REDO_LOG disabled

# time mysql -P 3306 -h mysql-test -u root api-test-1 < /srv/test/data/db-scaffold.sql

real	0m2.785s
user	0m0.033s
sys	0m0.188s

--

As you can see, there is, roughly, a 4x hit with MySQL 8 on these imports - alone.
[8 Feb 8:25] Frederic Descamps
Hi Jacob,

As I explained by chat, table creation can cost more as MySQL 8.0 now is atomic in that process. On fast storage and well tuned system, you barely notice that but indeed it's slower are by default this is more safe too.

However, I think your integration test should then take that in consideration and only measure what the application (queries) do and how that performs. In production, most of users don't recreate their data each time. And in a integration test as you explained to me, test if the application still works, the data creation might not be part of the result (but of course this is my own opinion). I would then suggest you to split the process of creating the data (certainly is all your 1000 tests are always recreating it), and check compare the result after that.
Please also pay attention that single threaded workload is generally slower than the previous version and the goal is to keep improving in multi-threaded workloads.

Cheers,
[8 Feb 13:40] MySQL Verification Team
Some small additions to Mr. Descamps comments.

First, you can use multi-threaded restore of data in 8.0, by using mysqlpump for dumping data.

Second, performance depends on tuning as well and 8.0 has far more tuning parameters than 5.7.

It is all fully explained in our Reference Manual, found on the http://dev.mysql.com.