Bug #99399 mysqlpump hang at starting dumpprocess
Submitted: 29 Apr 2020 15:01 Modified: 14 Aug 2020 13:56
Reporter: Tobias Stock Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: mysqlpump Command-line Client Severity:S5 (Performance)
Version:5.7.28 OS:Windows
Assigned to: MySQL Verification Team CPU Architecture:Any

[29 Apr 2020 15:01] Tobias Stock
Description:
After starting the dumpprocess mysqlpump needs about 1 minute to start with work regarding mysqldump with starts immediately. We testeed two scenarios:

1. Only one database in MySQL with 175 KB. 
a) MySQL Pump mydbname_01 175 KB = 3195 ms
b) MySQL Dump mydbname_01 175 KB = 1800 ms

2. About 30 databases in MySQL starting from 10 KB to 2.5 GB.
MySQL Pump mydbname_01 175 KB = 63310 ms
MySQL Dump mydbname_01 175 KB = 1800 ms

In the second scenario in our case mysqlpump took more than a minute to start. After starting the dumpprocess the database was dumped in less than 2 seconds.

How to repeat:
Take a coupple of databases an start the dumpprocess with mysqlpump as following:

mysql.exe -h %host% -P %port% -u%mysqluser% -p%mysqlpassword% -s -N -e "SHOW DATABASES" | for /F "usebackq" %%D in (`findstr /V "mysql information_schema performance_schema"`) do mysqlpump --databases %%D -h %host% -P %port% -u%mysqluser% -p%mysqlpassword% --default-character-set=utf8mb4 --skip-defer-table-indexes --add-drop-database --skip-definer --single-transaction > %workdir%\%%D-%datum%.sql

We need to use this process because if one database needs to be restored we could not restore the whole pumpfile. On the other hand we need to use mysqlpump instead of mysqldump because mysqldump does not support --skip-defer-table-indexes and --skip-definer.
[4 May 2020 14:48] MySQL Verification Team
Hi,

I'm having trouble reproducing this.

If you do just:

mysqlpump --databases %%D -h %host% -P %port% -u%mysqluser% -p%mysqlpassword% --default-character-set=utf8mb4 --skip-defer-table-indexes --add-drop-database --skip-definer --single-transaction --exclude-databases=mysql,information_schema,performance_schema   --result-file=%workdir%\%%D-%datum%.sql

is it still slow to start?
[5 Jun 2020 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[16 Jun 2020 13:59] Tobias Stock
If I do the pump process with your command it starts immediately. But I get a single dumpfile with all databases of the mysql-server. Is it possible to restore a single database out of this complete file without restoring also all other databases?

Example - we have 10 databases on the mysql-server.
db1
db2
db3
...
db10

Lets say somethink went wrong in db1 and I need to restore only db1. How can I do this with pump?
[16 Jun 2020 14:53] MySQL Verification Team
Hi,

I asked so I can try to determine where the problem is as I can't reproduce it locally.

If normally it runs "immediately" the issue is with

mysql.exe -h %host% -P %port% -u%mysqluser% -p%mysqlpassword% -s -N -e "SHOW DATABASES" | for /F "usebackq" %%D in (`findstr /V "mysql information_schema performance_schema"`) do

part. Sinc I doubt "for /F..." will be slow the problem is most probably

mysql.exe -h %host% -P %port% -u%mysqluser% -p%mysqlpassword% -s -N -e "SHOW DATABASES"

Can you measure how long it takes to

mysql.exe -h %host% -P %port% -u%mysqluser% -p%mysqlpassword% -s -N -e "SHOW DATABASES"

Also, please measure on the same host

mysql.exe -h %host% -P %port% -u%mysqluser% -p%mysqlpassword% -s -N -e "SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA"

The latter should be much faster and should be solution for your problem.

The "SHOW.." will in future be replaced with query from I_S that should be much faster.

Thanks
Bogdan
[17 Jun 2020 11:36] Tobias Stock
Hi,

if I do mysql.exe -h %host% -P %port% -u%mysqluser% -p%mysqlpassword% -s -N -e "SHOW DATABASES" it takes maximum 2 seconds to list our databases.

The second command: mysql.exe -h %host% -P %port% -u%mysqluser% -p%mysqlpassword% -s -N -e "SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA" only take 1 second to finish.

But I have seen that it took about 90 seconds to save the *.sql file. Meens I start the pumpprocess with:

mysqlpump --databases %db% -h %host% -P %port% -u%mysqluser% -p%mysqlpassword% --default-character-set=utf8mb4 --skip-defer-table-indexes --add-drop-database --skip-definer --single-transaction > %workdir%\%db%-%datum%.sql

pump saves the tables like this:
Dump progress: 1/4 tables, 0/1 rows
Dump progress: 48/165 tables, 58571/2488108 rows
Dump progress: 99/165 tables, 102239/2488108 rows
Dump progress: 148/165 tables, 389351/2488108 rows
Dump progress: 164/165 tables, 692878/2488108 rows
Dump progress: 164/165 tables, 947128/2488108 rows
Dump progress: 164/165 tables, 1183878/2488108 rows
Dump progress: 164/165 tables, 1521378/2488108 rows
Dump progress: 164/165 tables, 1768378/2488108 rows
Dump progress: 164/165 tables, 2065378/2488108 rows
Dump progress: 164/165 tables, 2425128/2488108 rows

At the point "164/165 tables, 2425128/2488108 rows" the commandline interrupts about 90 seconds till this line is shown:
Dump completed in 110732 milliseconds

The *.sql file is 0KB while the 90 seconds.

When finishing this process the next database is saved with pump.
[17 Jun 2020 11:43] MySQL Verification Team
Hi,

Well you are doing

mysql -> windows -> mysql

Both mysql operations you tested and are fast, what's left is that something is wrong with your windows. Antivirs or whaver else is creating the delay. When I do the same thing on my windows test box I see no such delay.

all best
Bogdan
[18 Jun 2020 11:40] Tobias Stock
Were you able to reproduce the issue?

The command is 
"mysqlpump --databases %db% -h %host% -P %port% -u%mysqluser% -p%mysqlpassword% --default-character-set=utf8mb4 --skip-defer-table-indexes --add-drop-database --skip-definer --single-transaction > %workdir%\%db%-%datum%.sql"

Please keep in mind, the seemingly hanging pump process only happens when there are multiple databases in the DB-instance. They might also have to be somewhat large for the problem to occur. Our data directory including all the databases in this instance is 12GB. 

When we take one of our databases and put them in a empty instance, the export runs fast. Pump then is still slightly slower than dump but not by much.
[18 Jun 2020 12:27] MySQL Verification Team
Hi,

As I said, no, I am not able to reproduce the problem. I created a test database on windows housing 150 databases with 200 tables each filled with dummy data. I have no delay executing the dump. That is why I assume there's something in your Windows messing things up (most commonly this type of issues are 'caused by the antivirus)

kind regards
Bogdan
[14 Aug 2020 7:55] Tobias Stock
Hello,
for eliminating issues caused by Windows, virus scanner domain policies or third party software of any kind, we set up a new system with only Windows and MySQL 5.7.31.
Then we copied the databases to the new system. We encounter the same problem as on the original system. A mysqldump of a 100 kb database takes about 2 seconds. With mysqlpump the duration is about 50 Seconds. After turning on logging, we saw the following:
mysqldump produces about 1150 SQL queries. Mysqlpump leads to 90000 queries.
There are 73 databases in the system with 1- 1114 tables, most have 100-300 tables. Although we defined a backup for a single database using the --databases parameter, mysqlpump starts queries to each table in every database. Examples:
 SHOW TABLE STATUS, SHOW CREATE TABLE, SHOW COLUMNS. So most of the time is spent in unnecessary queries before doing the actual dump. Is there a possibility to prevent this behaviour?
[14 Aug 2020 12:23] MySQL Verification Team
> Is there a possibility to prevent this behaviour?

no.
[14 Aug 2020 13:56] Tobias Stock
Hello,
the tremendous number of unnecessary SQL queries compared to mysqldump causes the performance decrease in our situation. Can the performance be improved in future releases? 
Adding some logic at this spot would increase the performance for large instances significantly. 

Thank you in advance, Tobias