Bug #110891 Documentation bug: CHAR vs VARCHAR
Submitted: 3 May 2023 10:50 Modified: 5 May 2023 11:57
Reporter: Ilya Kantor
Status: Verified
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: char, storage, varchar

[3 May 2023 10:50] Ilya Kantor
Description:
There are at least two relevant documentation pages:
- https://dev.mysql.com/doc/refman/8.0/en/storage-requirements.html
- https://dev.mysql.com/doc/refman/8.0/en/innodb-row-format.html

Both assume that VARCHAR uses 1 byte for its length prefix if values need at most 255 bytes, and 2 bytes otherwise.

They also assume that CHAR takes more storage with the utf8mb4 character set than with latin1.
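To make the expectation concrete, here is a minimal sketch of the arithmetic as this report reads the two pages: VARCHAR costs the data length plus a 1- or 2-byte prefix, and CHAR(M) costs M times the maximum bytes per character. The function names are hypothetical, and Python's latin-1 codec is only an approximation of MySQL's latin1 (they agree for ASCII data such as "EUR").

```python
# Maximum bytes per character for the two character sets discussed here.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8mb4": 4}
# Python codecs used as stand-ins (assumption: good enough for ASCII test data).
PY_CODEC = {"latin1": "latin-1", "utf8mb4": "utf-8"}


def documented_varchar_bytes(value, max_chars, charset):
    """Data length plus 1 prefix byte if the column fits in 255 bytes, else 2."""
    data_len = len(value.encode(PY_CODEC[charset]))
    prefix = 1 if max_chars * MAX_BYTES_PER_CHAR[charset] <= 255 else 2
    return data_len + prefix


def documented_char_bytes(max_chars, charset):
    """M x w bytes, where w is the widest character in the character set."""
    return max_chars * MAX_BYTES_PER_CHAR[charset]


if __name__ == "__main__":
    print("VARCHAR(3) utf8mb4:", documented_varchar_bytes("EUR", 3, "utf8mb4"))
    print("CHAR(3) utf8mb4:   ", documented_char_bytes(3, "utf8mb4"))
    print("CHAR(3) latin1:    ", documented_char_bytes(3, "latin1"))
```

Under this reading, "EUR" should need 4 bytes in VARCHAR(3), 12 bytes in CHAR(3) with utf8mb4, and 3 bytes in CHAR(3) with latin1.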

I tested this by running hexdump on the .ibd files of tables with three column types:
1) VARCHAR(3)
2) CHAR(3)
3) CHAR(3) CHARACTER SET latin1

Then I filled them with the same data (the string "EUR").
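The inspection step can be sketched roughly as follows. This is not the exact procedure used for the report (that was plain hexdump); it is a small helper, with a hypothetical name, that finds every occurrence of the test string in a tablespace file and prints the bytes around it, so that the three tables can be compared side by side. The .ibd paths are passed on the command line and depend on the local data directory.

```python
import sys


def hexdump_around(data, needle, before=8, after=4):
    """Return one hex line per occurrence of needle, with surrounding bytes."""
    lines = []
    start = data.find(needle)
    while start != -1:
        lo = max(0, start - before)
        hi = min(len(data), start + len(needle) + after)
        # The offset of the match itself, followed by the context bytes.
        lines.append(f"{start:08x}: {data[lo:hi].hex(' ')}")
        start = data.find(needle, start + 1)
    return lines


if __name__ == "__main__":
    # Usage (paths are examples only): python dump.py /path/to/table.ibd ...
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            content = f.read()
        print(path)
        for line in hexdump_around(content, b"EUR"):
            print("  " + line)
```

The bytes printed before each match include the record header, which is where a length byte for a variable-length field would be expected to show up.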

The findings were:
1) VARCHAR(3) takes exactly the same storage space as CHAR(3); expected: 1 byte more.
2) CHAR(3) in latin1 takes 1 byte less than the others; expected: a quarter of the space of CHAR(3) in utf8mb4, due to the encoding.
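One reading that would be consistent with both findings is sketched below. It is a toy model, not taken from the InnoDB source: it assumes that the COMPACT family of row formats stores CHAR(N) in a variable-length character set like a variable-length field (at least N bytes of data plus one length byte in the record header), while CHAR(N) in a single-byte character set stays fixed-length with no length byte. The function name is hypothetical, and the model only covers columns of at most 255 bytes.

```python
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8mb4": 4}
# Python codecs used as stand-ins (assumption: fine for ASCII test data).
PY_CODEC = {"latin1": "latin-1", "utf8mb4": "utf-8"}


def modeled_innodb_bytes(col_type, max_chars, charset, value):
    """Bytes used per value (length byte + data) under the assumed model."""
    width = MAX_BYTES_PER_CHAR[charset]
    if max_chars * width > 255:
        raise ValueError("toy model only covers columns of at most 255 bytes")
    encoded = value.encode(PY_CODEC[charset])
    if col_type == "VARCHAR":
        # One length byte plus the actual data.
        return 1 + len(encoded)
    if col_type == "CHAR" and width == 1:
        # Fixed-length: always max_chars bytes, no length byte.
        return max_chars
    if col_type == "CHAR":
        # Assumed variable-length handling: trailing spaces trimmed,
        # but never fewer than max_chars bytes, plus one length byte.
        trimmed = encoded.rstrip(b" ")
        return 1 + max(max_chars, len(trimmed))
    raise ValueError("unsupported column type: " + col_type)


if __name__ == "__main__":
    for col_type, charset in [("VARCHAR", "utf8mb4"),
                              ("CHAR", "utf8mb4"),
                              ("CHAR", "latin1")]:
        used = modeled_innodb_bytes(col_type, 3, charset, "EUR")
        print(f"{col_type}(3) {charset}: {used} bytes")
```

Under these assumptions, VARCHAR(3) and CHAR(3) in utf8mb4 both come out at 4 bytes for "EUR", and CHAR(3) in latin1 at 3 bytes, which matches the two findings above. Whether this is what InnoDB actually does is exactly what the documentation should state.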

I consulted an expert who worked at MySQL. They said the docs were written for MyISAM: the information is correct for MyISAM, but not for InnoDB.

How to repeat:
Surely, your specialists know InnoDB storage and can immediately see how to correctly fix the docs.

It would be very kind of you to give a brief explanation of what's going on in your reply.
[5 May 2023 11:57] MySQL Verification Team
Hello Ilya Kantor,

Thank you for the report and feedback.

regards,
Umesh