Bug #110177 clear invalid comment in table file and index when upgrade from 57 to 80
Submitted: 23 Feb 2023 6:28 Modified: 14 Aug 2023 16:10
Reporter: Huaxiong Song (OCA)
Status: Closed
Category:MySQL Server: Documentation Severity:S4 (Feature request)
Version:8.0.32 OS:Any
Assigned to: Jon Stephens CPU Architecture:Any
Tags: Contribution

[23 Feb 2023 6:28] Huaxiong Song
Description:
Background
==========
In MySQL, upgrading from 5.7 to 8.0 fails if the comment of a table, field, or index contains an invalid string. In most cases it is acceptable to clear the invalid string in the comment, so the upgrade process should be able to continue regardless.

User Interface
==============
upgrade_clear_invalid_comment
This variable controls whether invalid character strings in the comments of tables, indexes, and fields are cleared during an upgrade from 5.7 to 8.0. If true, the upgrade will not fail because of the invalid characters.
- Scope: GLOBAL, READ_ONLY
- Dynamic: NO
- Type: Bool
- Default: FALSE

Implementation
==============
When an invalid string is encountered in a comment during an upgrade from 5.7 to 8.0 and the variable "upgrade_clear_invalid_comment" is ON, the upgrade process continues and the invalid string is cleared automatically.
Messages about the invalid string and the clearing action are recorded in the error log.
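
The behavior proposed above can be sketched roughly as follows. This is a minimal Python illustration only, not the submitted patch: the names `upgrade_comment`, `is_valid_utf8mb3`, and `InvalidCommentError` are hypothetical, and "invalid" is assumed here to mean "not decodable as UTF-8, or containing a character outside the BMP".

```python
class InvalidCommentError(Exception):
    """Hypothetical error: comment is invalid and clearing is disabled."""


def is_valid_utf8mb3(raw: bytes) -> bool:
    """Assumed validity rule: decodes as UTF-8 and stays within the BMP."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return all(ord(ch) <= 0xFFFF for ch in text)


def upgrade_comment(name: str, raw: bytes, clear_invalid: bool,
                    error_log: list) -> bytes:
    """Return the comment to keep for object `name` during the upgrade.

    `clear_invalid` stands in for the proposed variable
    upgrade_clear_invalid_comment; `error_log` stands in for the error log.
    """
    if is_valid_utf8mb3(raw):
        return raw
    if not clear_invalid:
        # Behavior without the variable: the upgrade fails.
        raise InvalidCommentError(
            f"invalid utf8mb3 character string in comment of {name}")
    # Proposed behavior: record the event, then clear the comment.
    error_log.append(f"cleared invalid comment of {name}: {raw!r}")
    return b""
```

With `clear_invalid=False` the sketch raises, which corresponds to the failing upgrade; with `clear_invalid=True` the comment is dropped and one log entry is written.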

How to repeat:
It's a feature request.
[23 Feb 2023 6:31] Huaxiong Song
code, test case and test data

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: data_57_clear_invalid_comment_string.zip (application/zip, text), 1.55 MiB.

[23 Feb 2023 6:31] Huaxiong Song
addition

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bugfix_upgrade_57_to_80_clear_invalid_comment.diff (application/octet-stream, text), 21.51 KiB.

[23 Feb 2023 6:43] MySQL Verification Team
Hello Huaxiong Song,

Thank you for the feature request and for supplying the contribution.

regards,
Umesh
[10 Aug 2023 11:37] Mrinal Pandey
Problem:
Upgrade from 5.7 to 8.0 fails with the message “invalid utf8mb3 character string” due to an invalid comment on a table, field, or index.

Analysis:
Character sets considered and tested (using 'set names <charset>') are latin1, utf8mb3, and utf8mb4, with the character ‘🐬’.
The character is acceptable with latin1 and utf8mb4 encoding but should not be acceptable with utf8mb3.
In version 5.7 of the server, the character is accepted in all of these character sets without an error (a bug in 5.7).
In version 8.0 of the server, the same character is strictly rejected for the utf8mb3 character set.
The upgrade from version 5.7 to version 8.0 fails owing to the fact that the default character set on 8.0 is utf8mb3.
Because of poor checks, 5.7 accepts the character in utf8mb3 encoding as well; the checks were refined in 8.0 to strictly reject the character, so the upgrade also fails in this case.
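
To illustrate why the character is a problem for utf8mb3 specifically, here is a small Python sketch. It assumes the test character is the dolphin emoji, U+1F42C; the helper name `fits_utf8mb3` is made up for this example. utf8mb3 stores at most 3 bytes per character, which covers only the BMP, while this character needs 4 bytes in UTF-8.

```python
def fits_utf8mb3(text: str) -> bool:
    """True if every character needs at most 3 bytes in UTF-8 (BMP only)."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


dolphin = "\U0001F42C"  # assumed test character (dolphin emoji)

print(dolphin.encode("utf-8").hex())  # f09f90ac, i.e. 4 bytes
print(fits_utf8mb3(dolphin))          # False
print(fits_utf8mb3("caf\u00e9"))      # True, all characters are in the BMP
```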

Conclusion:
The observed behavior is as expected. Removing the entire comment string sounds risky, as the comment string might contain information that is important and relevant to the user. The workaround for the user is to modify the comment fields so that they do not use non-BMP characters and then try the upgrade again.
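
As a rough aid for the workaround, the following Python sketch flags comments that contain non-BMP characters and shows one way to replace just those characters rather than dropping the whole comment. It assumes the comment texts have already been read from the 5.7 server into a dict keyed by object name; how they are fetched and written back is not shown, and the names `find_non_bmp` and `replace_non_bmp` are hypothetical.

```python
import re

# Matches any code point outside the Basic Multilingual Plane.
NON_BMP = re.compile(r"[\U00010000-\U0010FFFF]")


def find_non_bmp(comments: dict) -> dict:
    """Map each object name to the non-BMP characters found in its comment."""
    return {
        name: NON_BMP.findall(text)
        for name, text in comments.items()
        if NON_BMP.search(text)
    }


def replace_non_bmp(text: str, replacement: str = "?") -> str:
    """Replace only the non-BMP characters, keeping the rest of the comment."""
    return NON_BMP.sub(replacement, text)


# Example input; the object names are made up for illustration.
comments = {"db1.t1": "plain comment", "db1.t2": "sea \U0001F42C life"}
offenders = find_non_bmp(comments)
fixed = {name: replace_non_bmp(comments[name]) for name in offenders}
```

Only `db1.t2` is reported here, and its comment becomes `sea ? life`, so the surrounding text that may matter to the user is preserved.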
[14 Aug 2023 16:02] Jon Stephens
Re-opening as a Docs bug and assigning to myself.
[14 Aug 2023 16:10] Jon Stephens
Per MPandey's comments above, I've updated https://dev.mysql.com/doc/refman/8.0/en/upgrading-from-previous-series.html#upgrade-config... to include this information.

Fixed in the MySQL 8.0 Manual, in mysqldoc rev 76466.

Closed.