Bug #115650 Include mysql-error.log NOTE on DD upgrade progress/stages during 57->8.0 upgrad
Submitted: 19 Jul 2024 13:00 Modified: 19 Jul 2024 14:03
Reporter: Marc Reilly Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Data Dictionary Severity:S4 (Feature request)
Version:8.0.38 OS:Any
Assigned to: CPU Architecture:Any
Tags: data dictionary, frm, SDI, upgrade, Upgrade mysql 8

[19 Jul 2024 13:00] Marc Reilly
Description:
When upgrading from mysql 5.7 to 8.0, Data dictionary migration is done. At a high level, this involves migrating legacy table metadata(FRM/PAR/TRG files) to new atomic data dictionary(DD). Currently there does not seem to be any information in the error log about this stage of the upgrade, or how it is progressing. When upgrading databases with many database objects this can take some time to complete, it would be very useful to have information which periodically prints NOTE messages of progress when log_error_verbosity is set to a higher level, such as log_error_verbosity=3. 

This will allow users track upgrade progress and understand if upgrade is stuck or progressing. 

Currently this is all that is printed for a 10k table sysbench installation with `log_error_verbosity=3`
Here you can see a message at 2024-07-19T12:26:13.500051Z indicating DD upgrade/migration has started. The next message is 3 minutes later at 2024-07-19T12:29:30.634250Z indicating DD migration/upgrade has completed. For larger amounts of objects this can take even longer, with little feedback to users on upgrade progress. 

2024-07-19T12:26:13.500051Z 1 [Note] [MY-010337] [Server] Created Data Dictionary for upgrade
2024-07-19T12:26:14.053183Z 0 [Note] [MY-011332] [Server] Plugin mysqlx reported: 'IPv6 is available'
2024-07-19T12:26:14.053407Z 0 [Note] [MY-011323] [Server] Plugin mysqlx reported: 'X Plugin ready for connections. bind-address: '::' port: 18039'
2024-07-19T12:26:14.053429Z 0 [Note] [MY-011323] [Server] Plugin mysqlx reported: 'X Plugin ready for connections. socket: '/tmp/mysqlx-18039.sock''
2024-07-19T12:29:30.634250Z 2 [System] [MY-011003] [Server] Finished populating Data Dictionary tables with data.

2024-07-19T12:29:30.717853Z 2 [Note] [MY-011008] [Server] Finished migrating TABLE statistics data.
2024-07-19T12:29:30.914549Z 2 [Note] [MY-011008] [Server] Finished migrating TABLE statistics data.
2024-07-19T12:29:33.010568Z 2 [Note] [MY-010006] [Server] Using data dictionary with version '80023'.

2024-07-19T12:29:33.457547Z 5 [System] [MY-013381] [Server] Server upgrade from '50700' to '80038' started.
2024-07-19T12:29:33.458425Z 5 [Note] [MY-013386] [Server] Running queries to upgrade MySQL server.
2024-07-19T12:29:37.763835Z 5 [Note] [MY-013387] [Server] Upgrading system table data.
2024-07-19T12:29:38.031783Z 5 [Note] [MY-013385] [Server] Upgrading the sys schema.
2024-07-19T12:29:38.602991Z 5 [Note] [MY-013400] [Server] Upgrade of help tables started.
2024-07-19T12:29:38.732456Z 5 [Note] [MY-013400] [Server] Upgrade of help tables completed.
2024-07-19T12:29:38.732572Z 5 [Note] [MY-013394] [Server] Checking 'mysql' schema.
2024-07-19T12:29:38.817740Z 5 [Note] [MY-013394] [Server] Checking 'sys' schema.
2024-07-19T12:29:38.834352Z 5 [Note] [MY-013394] [Server] Checking 'test' schema.
2024-07-19T12:30:24.393970Z 5 [System] [MY-013381] [Server] Server upgrade from '50700' to '80038' completed.

You can see above there is a little bit more info for the Server upgrade stage, but no info for DD migration stage. 

With default log_error_verbosity, it is even less obvious which is prob to be expected based on log levels:
2024-07-19T12:45:30.898715Z 1 [System] [MY-011012] [Server] Starting upgrade of data directory.
2024-07-19T12:45:30.898750Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2024-07-19T12:45:32.081313Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2024-07-19T12:48:50.225602Z 2 [System] [MY-011003] [Server] Finished populating Data Dictionary tables with data.
2024-07-19T12:48:53.039338Z 5 [System] [MY-013381] [Server] Server upgrade from '50700' to '80038' started.
2024-07-19T12:49:38.046199Z 5 [System] [MY-013381] [Server] Server upgrade from '50700' to '80038' completed.

https://dev.mysql.com/doc/refman/8.0/en/upgrading-what-is-upgraded.html
https://dev.mysql.com/doc/refman/8.0/en/upgrade-troubleshooting.html

How to repeat:
Load 10k tables, more tables/objects = longer upgrade time. 
./sandboxes/msb_5_7_44/sysbench --table-size=1 --create_secondary=0 --tables=10000 --threads=40 --mysql-db=test /usr/local/share/sysbench/oltp_write_only.lua  prepare

Upgrade from 5.7.44 to 8.0.38, where 8.0.38 has log_error_verbosity=3 and observe logs.

Suggested fix:
When log_error_verbosity=3 print progress indicators periodically during DD upgrade/migration. e.g. every 100/1000 tables or n seconds print objects proccessed or n/totalObjects as a progress indicator. 

Example:
2024-07-19T12:27:30.634250Z 2 [Note] [MY-010337] [Server] Created Data Dictionary for upgrade
2024-07-19T12:28:00.634250Z 2 [Note] [MY-xxxxxx] [Server] Populating Data Dictionary tables with data. 2500/10000 processed
2024-07-19T12:28:30.634250Z 2 [Note] [MY-xxxxxx] [Server] Populating Data Dictionary tables with data. 5000/10000 processed
2024-07-19T12:29:00.634250Z 2 [Note] [MY-xxxxxx] [Server] Populating Data Dictionary tables with data. 8000/10000 processed
2024-07-19T12:29:20.634250Z 2 [System] [MY-011003] [Server] Finished populating Data Dictionary tables with data.

May be worth considering some more "high-level" info when log_error_verbosity is at the default level too. 

Thanks!
Marc
[19 Jul 2024 13:06] MySQL Verification Team
Hi Mr. Reilly,

Thank you for your bug report.

However, at this moment we can not consider such a feature request.

Namely, 8.0.38 has a bug with many tables. This bug is also shared with 8.4.1 and 9.0.0.

There are already dozens of bug reports related to both bugs and the verbosity.

Please try 8.0.39 when it comes out and let us know if things have improved or not .......

Duplicate.
[19 Jul 2024 13:16] Marc Reilly
Hi,
This is unrelated to the 10k table issue 8.0.38/8.4.1/9.0.0, nor is it an availability issue. Its a feature req to improve observability during upgrade, and the same behavior can be observed on all 8.0 versions. I only used 10k tables as an example here, choose 5000 and there is still a lack of observability. 

Also, I don't believe this will affect 8.4.1/9.0.0 as the data dictionary migration is not much of an issue there, since dd is already migrated from legacy metadata such as FRM/TRG. 
There is also no upgrade path from 5.7 to 8.4/9.0[1], so all legacy metadata(FRM/TRG) should already be migrated in the 8.0 release. i.e. 8.4 upgrades, will all be coming from 8.0 which are already running on the new atomic data dictionary. 

If there is another bug report/feature request that this is a duplicate of can you link it?

Thanks!
Marc

[1] 
From MySQL 5.7 to an LTS or Innovation release	MySQL 5.7 to 8.4	
A bugfix or LTS series cannot be skipped, so in this example first upgrade MySQL 5.7 to MySQL 8.0, and then upgrade MySQL 8.0 to MySQL 8.4.
https://dev.mysql.com/doc/refman/8.4/en/upgrade-paths.html
[19 Jul 2024 13:32] MySQL Verification Team
Hi,

Since we are concentrating here on the new feature, could you share with us your idea on what would that feature look like. 

Do note that we can only use stdout as the output. Also, do note that during upgrade  process from 5.7,  there is no info on how many objects are there involved. Counting them, prior to upgrading, would just slow down the entire upgrade, which would cause a surge in bug reports. These bugs would then represent regression bug reports and would have to be fixed.

Hence, with those restrictions, please present to us the exact design of the feature that you would like to see.
[19 Jul 2024 14:03] Marc Reilly
Hey,
Sure! Even though it would be ideal, it's understandable that total object count may slow things down, but it would be nice to have some sort of messages/notes in the log that the dictionary tables are populating and if possible its progressing. I believe similar messages were added in 8.0 for checking the progress of innodb file path discovery during a regular startup. I understand its different for DD upgrade, just an example from other areas of the 8.0 code. 
e.g. with error_log_verbosity=3
2023-02-28T21:17:09.760368Z 1 [Note] [MY-012203] [InnoDB] Directories to scan './'
2023-02-28T21:17:09.760399Z 1 [Note] [MY-012204] [InnoDB] Scanning './'
2023-02-28T21:17:11.630209Z 1 [Note] [MY-012207] [InnoDB] Using 17 threads to scan 1000002 tablespace files
2023-02-28T21:17:21.630822Z 0 [Note] [MY-012200] [InnoDB] Thread# 14 - Checked 4274/62500 files
2023-02-28T21:17:21.631076Z 0 [Note] [MY-012200] [InnoDB] Thread# 13 - Checked 4287/62500 files
2023-02-28T21:17:21.631075Z 0 [Note] [MY-012200] [InnoDB] Thread# 1 - Checked 4287/62500 files
.
.
.
2023-02-28T21:19:55.852919Z 0 [Note] [MY-012201] [InnoDB] Checked 62500 files
2023-02-28T21:19:55.855481Z 0 [Note] [MY-012201] [InnoDB] Checked 62500 files
2023-02-28T21:19:55.856292Z 1 [Note] [MY-012208] [InnoDB] Completed space ID check of 1000004 files.

The "Created Data Dictionary for upgrade" and "Finished populating Data Dictionary tables with data." messages currently printed at start and finish, aren't very informative to users and give no information on progress.

Using the example in my initial report above:

When log_error_verbosity=3 print progress indicators periodically during DD upgrade/migration. e.g. every 100/1000 tables or n seconds print objects processed or n/totalObjects as a progress indicator. 

Example, printing num objects processed so far every 30 seconds. It can be documented how customers can estimate total object prior to upgrading. Using the progress indicators they can get an estimate on objects processed per sec/min and when its estimated to complete:
2024-07-19T12:27:30.634250Z 2 [Note] [MY-010337] [Server] Created Data Dictionary for upgrade
2024-07-19T12:28:00.634250Z 2 [Note] [MY-xxxxxx] [Server] Populating Data Dictionary tables with data. 2500 processed
2024-07-19T12:28:30.634250Z 2 [Note] [MY-xxxxxx] [Server] Populating Data Dictionary tables with data. 5000 processed
2024-07-19T12:29:00.634250Z 2 [Note] [MY-xxxxxx] [Server] Populating Data Dictionary tables with data. 8000 processed
2024-07-19T12:29:20.634250Z 2 [System] [MY-011003] [Server] Finished populating Data Dictionary tables with data.

Or even the following, just something to let users know the upgrade is progressing and not hung:
2024-07-19T12:27:30.634250Z 2 [Note] [MY-010337] [Server] Created Data Dictionary for upgrade
2024-07-19T12:28:00.634250Z 2 [Note] [MY-xxxxxx] [Server] Populating Data Dictionary tables with data.
2024-07-19T12:28:30.634250Z 2 [Note] [MY-xxxxxx] [Server] Populating Data Dictionary tables with data.
2024-07-19T12:29:00.634250Z 2 [Note] [MY-xxxxxx] [Server] Populating Data Dictionary tables with data.
2024-07-19T12:29:20.634250Z 2 [System] [MY-011003] [Server] Finished populating Data Dictionary tables with data.

Hope this helps. 

Thanks,
Marc
[19 Jul 2024 14:49] MySQL Verification Team
HI Mr. Reilly,

We are afraid that your definition of the feature is not feasible.

First of all, since multiple threads are involved, then writing to stdout and / or stderr is not possible without involving multiple mutexes. That would complicate a feature and would require many men-hours..

Next, knowing in advance a number of objects is also not possible, since these do not involve just schema and tables.

It is also not possible to provide any percentages of the completion, since it depends on dozens of factors. More important, the upgrade process does not know how many more objects have remained and how much time would each of them require.

Hence, can you come with a feasible solution ????/