Bug #116620 Suboptimal Startup Info Logging with Many Tables (1M+).
Submitted: 11 Nov 21:07 Modified: 18 Nov 6:47
Reporter: Jean-François Gagné Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Logging Severity:S3 (Non-critical)
Version:8.0.39, 8.0.40, 8.4.3, 9.1.0 OS:Any
Assigned to: CPU Architecture:Any

[11 Nov 21:07] Jean-François Gagné
Description:
Hi,

when starting MySQL with many tables and INFORMATION (3) log_error_verbosity [1], the content of the error log is suboptimal (to say the least).  I identified the problems below (in addition to what is reported in Bug#116619: Startup without Signs of Progress with Many Tables), which are expanded in greater detail in How to repeat.

[1]: https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_log_error_verb...

1. MY-012205, MY-012206 & MY-012207: "data files", '.ibd' and tablespace files;
2. MY-012207 & MY-012200: "Using n threads ..." and "Thread# i ...";
3. MY-012200 is spammy;
4. MY-012201 & MY-012207: "Checked x files" does not add-up;
5. MY-012208 & MY-012207: numbers do not match;
6. MY-012207 & MY-012208: "... to scan ..." and "Completed space ID check of ..." is confusing;
7. MY-012357: no completion, going straight to progress;
8. MY-012357: not really reading;
9. MY-012350: "4threads ..." missing space.

None of these problems are major, hence filing this as S3 / Non-critical.  Some of these might even be so minor that they would not deserve a bug report, but as fixing them touches the same log messages and the same code, it is probably worth fixing all at the same time (and maybe at the same time as Bug#116619).

I reported this for 8.0.39 because the logs I have in How to repeat are for this version (sorry, opening this bug is pending on my side for some time), but I think 8.0.40, 8.4.3 and 9.1.0 are also affected.

Because fixing this impacts logging format, this could be considered a "breaking change" and it might be only acceptable to fix this in MySQL 9.

I might eventually submit a patch for this.

Many thanks for looking into this,

Jean-François Gagné

How to repeat:
See Bug#115988 for a script to create many tables.

Start MySQL with INFORMATION (3) log_error_verbosity and many tables (1M+), the output should be similar to below (truncated for readability, full log attached).

2024-08-26T02:35:59.954165Z mysqld_safe Logging to '/home/jgagne/sandboxes/msb_mysql_8_0_39/data/msandbox.err'.
[...]
2024-08-26T02:36:04.166470Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
[...]
2024-08-26T02:36:04.292857Z 1 [Note] [MY-012203] [InnoDB] Directories to scan './'
2024-08-26T02:36:04.306208Z 1 [Note] [MY-012204] [InnoDB] Scanning './'
2024-08-26T02:36:14.306525Z 1 [Note] [MY-012205] [InnoDB] Files found so far: 418594 data files and 2 undo files
2024-08-26T02:36:24.306648Z 1 [Note] [MY-012205] [InnoDB] Files found so far: 858081 data files and 2 undo files
2024-08-26T02:36:27.504504Z 1 [Note] [MY-012206] [InnoDB] Found 1000002 '.ibd' and 2 undo files
2024-08-26T02:36:27.504569Z 1 [Note] [MY-012207] [InnoDB] Using 5 threads to scan 1000002 tablespace files
2024-08-26T02:36:37.505041Z 0 [Note] [MY-012200] [InnoDB] Thread# 3 - Checked 8279/250000 files
2024-08-26T02:36:37.505351Z 0 [Note] [MY-012200] [InnoDB] Thread# 0 - Checked 8274/250000 files
2024-08-26T02:36:37.505812Z 0 [Note] [MY-012200] [InnoDB] Thread# 2 - Checked 8273/250000 files
2024-08-26T02:36:37.506261Z 0 [Note] [MY-012200] [InnoDB] Thread# 1 - Checked 8280/250000 files
2024-08-26T02:36:47.505979Z 0 [Note] [MY-012200] [InnoDB] Thread# 3 - Checked 15657/250000 files
[...]
2024-08-26T02:41:57.536858Z 0 [Note] [MY-012200] [InnoDB] Thread# 1 - Checked 244948/250000 files
2024-08-26T02:42:04.302732Z 0 [Note] [MY-012201] [InnoDB] Checked 250000 files
2024-08-26T02:42:04.328662Z 0 [Note] [MY-012201] [InnoDB] Checked 250000 files
2024-08-26T02:42:04.350075Z 0 [Note] [MY-012201] [InnoDB] Checked 250000 files
2024-08-26T02:42:04.650573Z 0 [Note] [MY-012201] [InnoDB] Checked 250000 files
2024-08-26T02:42:04.758139Z 1 [Note] [MY-012208] [InnoDB] Completed space ID check of 1000004 files.
[...]
2024-08-26T02:42:05.565297Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
[...]
2024-08-26T02:42:06.718098Z 1 [Note] [MY-012357] [InnoDB] Reading DD tablespace files
2024-08-26T02:42:36.390311Z 0 [Note] [MY-012350] [InnoDB] 4threads have validated 330925 out of 1000006 tablespaces so far.
2024-08-26T02:42:46.390349Z 0 [Note] [MY-012350] [InnoDB] 4threads have validated 610870 out of 1000006 tablespaces so far.
2024-08-26T02:42:56.390389Z 0 [Note] [MY-012350] [InnoDB] 4threads have validated 828885 out of 1000006 tablespaces so far.
2024-08-26T02:43:05.290580Z 1 [Note] [MY-012356] [InnoDB] Scanned 1000006 tablespaces. Validated 1000006.
[...]
2024-08-26T02:43:09.535269Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
[...]
2024-08-26T02:43:09.702744Z 0 [System] [MY-010931] [Server] /home/jgagne/opt/mysql/mysql_8.0.39/bin/mysqld: ready for connections. Version: '8.0.39'  socket: '/tmp/mysql_sandbox8039.sock'  port: 8039  MySQL Community Server - GPL.

------------------------------------------------------------------------------
1. MY-012205, MY-012206 & MY-012207: "data files", '.ibd' and tablespace files

In the 3 log messages MY-012205, MY-012206 and MY-012207 (see below for relevant log lines), the same "concept" is referred to using 3 different names: data files, '.ibd', and tablespace files.  Using different names brings confusion.  I suggest using the same name to make it clear that these 3 messages are about the same thing ("data files" is the best name for me, and it has the advantage of being consistent with "undo files" of MY-012205).

2024-08-26T02:36:14.306525Z 1 [Note] [MY-012205] [InnoDB] Files found so far: 418594 data files and 2 undo files
2024-08-26T02:36:27.504504Z 1 [Note] [MY-012206] [InnoDB] Found 1000002 '.ibd' and 2 undo files
2024-08-26T02:36:27.504569Z 1 [Note] [MY-012207] [InnoDB] Using 5 threads to scan 1000002 tablespace files

-------------------------------------------------------------------
2. MY-012207 & MY-012200: "Using n threads ..." and "Thread# i ..."

In the log example, log message MY-012207 mentions "Using 5 threads", but the MY-012200 lines only mention Thread# 0 to 3 (see below for relevant log lines).  If we trust MY-012207 (we should not), it looks like a thread is missing, which is confusing.  I traced this to using "n_threads + 1" when logging MY-012207 [1].  This +1 makes sense for 0, but not for other values.  Note that this work can probably be merged with work for #1 above as it impacts the same log message (MY-012207).

[1]: https://github.com/mysql/mysql-server/blob/mysql-9.1.0/storage/innobase/fil/fil0fil.cc#L11...

2024-08-26T02:36:27.504569Z 1 [Note] [MY-012207] [InnoDB] Using 5 threads to scan 1000002 tablespace files
2024-08-26T02:36:37.505041Z 0 [Note] [MY-012200] [InnoDB] Thread# 3 - Checked 8279/250000 files
2024-08-26T02:36:37.505351Z 0 [Note] [MY-012200] [InnoDB] Thread# 0 - Checked 8274/250000 files
2024-08-26T02:36:37.505812Z 0 [Note] [MY-012200] [InnoDB] Thread# 2 - Checked 8273/250000 files
2024-08-26T02:36:37.506261Z 0 [Note] [MY-012200] [InnoDB] Thread# 1 - Checked 8280/250000 files

----------------------
3. MY-012200 is spammy

The log message MY-012200 is printed __**for each thread**__ every 10 seconds (see below for relevant log lines).  With many threads, this leads to n lines every 10 seconds which is not needed (logging for each thread is closer to debug logging than INFORMATION / progress logging).  I suggest logging a single line every 10 seconds with overall progress, not just the progress of the current thread (this is what is done for MY-012350).

2024-08-26T02:36:37.505041Z 0 [Note] [MY-012200] [InnoDB] Thread# 3 - Checked 8279/250000 files
2024-08-26T02:36:37.505351Z 0 [Note] [MY-012200] [InnoDB] Thread# 0 - Checked 8274/250000 files
2024-08-26T02:36:37.505812Z 0 [Note] [MY-012200] [InnoDB] Thread# 2 - Checked 8273/250000 files
2024-08-26T02:36:37.506261Z 0 [Note] [MY-012200] [InnoDB] Thread# 1 - Checked 8280/250000 files
[...]
2024-08-26T02:42:36.390311Z 0 [Note] [MY-012350] [InnoDB] 4threads have validated 330925 out of 1000006 tablespaces so far.

-----------------------------------------------------------
4. MY-012201 & MY-012207: "Checked x files" does not add-up

In log message MY-012201, the number of checked files for this thread is logged (4 lines with 250000 in this example).  But this does not match log message MY-012207 (1000002), which is confusing (see below for relevant log lines).  I suggest removing MY-012201, especially that this line is closer to debug logging than useful INFORMATION / progress logging (more about completion in #5).

2024-08-26T02:36:27.504569Z 1 [Note] [MY-012207] [InnoDB] Using 5 threads to scan 1000002 tablespace files
[...]
2024-08-26T02:42:04.302732Z 0 [Note] [MY-012201] [InnoDB] Checked 250000 files
2024-08-26T02:42:04.328662Z 0 [Note] [MY-012201] [InnoDB] Checked 250000 files
2024-08-26T02:42:04.350075Z 0 [Note] [MY-012201] [InnoDB] Checked 250000 files
2024-08-26T02:42:04.650573Z 0 [Note] [MY-012201] [InnoDB] Checked 250000 files

----------------------------------------------
5. MY-012208 & MY-012207: numbers do not match

In log message MY-012208, we have the number of files that have been "space ID checked" (1000004 in this example).  But this number does not match log message MY-012207 (1000002), which is confusing (see below for relevant log lines).  After checking the code, I understand that 1000004 matches the sum of data files and undo files from log line MY-012206, but the operator should be able to understand the logs without having to check the code.  I suggest changing MY-012208 to include numbers for data files and undo files (consistent information with MY-012206), and adding a MY-012207 for undo files.

2024-08-26T02:36:27.504504Z 1 [Note] [MY-012206] [InnoDB] Found 1000002 '.ibd' and 2 undo files
2024-08-26T02:36:27.504569Z 1 [Note] [MY-012207] [InnoDB] Using 5 threads to scan 1000002 tablespace files
[...]
2024-08-26T02:42:04.758139Z 1 [Note] [MY-012208] [InnoDB] Completed space ID check of 1000004 files.

----------------------------------------------------------------------------------------------
6. MY-012207 & MY-012208: "... to scan ..." and "Completed space ID check of ..." is confusing

In log message MY-012207, it is mentioned that tablespace files are "scanned", and in MY-012208 that "space ID check" is completed (see below for relevant log lines).  So 2 names are used to refer to the same thing, which is confusing.  I suggest using "space ID check" in both messages.  Note that this work can probably be merged with work for #1 and #2 above as it impacts the same log message (MY-012207).

2024-08-26T02:36:27.504569Z 1 [Note] [MY-012207] [InnoDB] Using 5 threads to scan 1000002 tablespace files
[...]
2024-08-26T02:42:04.758139Z 1 [Note] [MY-012208] [InnoDB] Completed space ID check of 1000004 files.

-------------------------------------------------------
7. MY-012357: no completion, going straight to progress

Log message MY-012357 indicates the start of InnoDB Tablespace Validation (not to be confused with Tablespace Path Validation, name my own, I will link in a comment to a blog post I am about to publish which define these).  The first part of this validation collects the list of tablespaces.  When this collection is completed, we go straight to progress logging (every 10 seconds) __**without**__ a log line indicating the completion of the collection (see below for relevant log lines).  I suggest adding a line indicating that tablespace collection is completed with the number of tablespaces found (like for MY-012206, and this would make it clear how much time is spent collecting the tablespace list).

2024-08-26T02:36:27.504504Z 1 [Note] [MY-012206] [InnoDB] Found 1000002 '.ibd' and 2 undo files
[...]
2024-08-26T02:42:06.718098Z 1 [Note] [MY-012357] [InnoDB] Reading DD tablespace files
2024-08-26T02:42:36.390311Z 0 [Note] [MY-012350] [InnoDB] 4threads have validated 330925 out of 1000006 tablespaces so far.
2024-08-26T02:42:46.390349Z 0 [Note] [MY-012350] [InnoDB] 4threads have validated 610870 out of 1000006 tablespaces so far.

--------------------------------
8. MY-012357: not really reading

Log message MY-012357 indicates "Reading DD tablespace files", but files are not actually "read" (see below for relevant log line).  A list of tablespaces is collected and file existence is checked later, but no tablespace file reading is actually done.  I would suggest making this clearer in the log line with "Gathering the list of tablespace files".  Note that this work can probably be merged with work for #7 above as it impacts the same log message (MY-012357).

2024-08-26T02:42:06.718098Z 1 [Note] [MY-012357] [InnoDB] Reading DD tablespace files

------------------------------------------
9. MY-012350: "4threads ..." missing space

Log message MY-012350 is missing a space in "4threads" (see below for relevant log line).  I would not fix this by adding a space.  I would suggest removing all reference to threads in this message.  With implementing #7 and #8 above, we can add a line indicating "starting to validate" and this line can include the number of threads (like for how it is done for log message MY-012207).

2024-08-26T02:42:36.390311Z 0 [Note] [MY-012350] [InnoDB] 4threads have validated 330925 out of 1000006 tablespaces so far.
[11 Nov 21:09] Jean-François Gagné
Full log file.

Attachment: msandbox.err.8.0.39.1000000.m6i_gp3_3kiops_125mbps.2.info.txt (text/plain), 22.83 KiB.

[11 Nov 21:27] Jean-François Gagné
Post where I define Tablespace Path Validation and other phases of a long MySQL startup with many tables.

- Title: Long and Silent / Stressful MySQL Startup with Many Tables
- Link: https://jfg-mysql.blogspot.com/2024/11/long-and-silent-stressful-mysql-startup.html#phases
[18 Nov 6:47] MySQL Verification Team
Hello Jean-François,

Thank you for the report and feedback.

regards,
Umesh