| Bug #33649 | Is multi-byte encoding applied twice, leading to squared size ? | ||
|---|---|---|---|
| Submitted: | 3 Jan 2008 12:35 | Modified: | 13 Nov 2008 3:19 |
| Reporter: | Joerg Bruehe | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Charsets | Severity: | S2 (Serious) |
| Version: | 6.0.4-alpha | OS: | Any |
| Assigned to: | Sergei Glukhov | CPU Architecture: | Any |
[8 Oct 2008 11:09]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

  http://lists.mysql.com/commits/55723

2859 Sergey Glukhov 2008-10-08
  Bug#33649 Is multi-byte encoding applied twice, leading to squared size ?
  Some columns are declared in a wrong way, which results in this double length multiplication.
  The correct character length should be 64, and the correct octet length should be 256.
  The fix is to use NAME_CHAR_LEN instead of NAME_LEN
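The double multiplication named in the commit message can be sketched in a few lines of Python. This is an illustration, not the server code. It assumes that NAME_CHAR_LEN is the identifier limit in characters (64, per the commit message), that NAME_LEN is the same limit in bytes (NAME_CHAR_LEN times the maximum bytes per character), and that a column's octet length is its declared length times the maximum bytes per character. The helpers `name_len` and `reported_lengths` are hypothetical names used only for this sketch.

```python
# Illustrative model only: constant names follow the commit message,
# the helper functions and their logic are assumptions.
NAME_CHAR_LEN = 64  # identifier limit in characters (from the commit message)


def name_len(mbmaxlen: int) -> int:
    """Assumed meaning of NAME_LEN: the identifier limit in bytes."""
    return NAME_CHAR_LEN * mbmaxlen


def reported_lengths(declared_length: int, mbmaxlen: int) -> tuple:
    """Return (CHARACTER_MAXIMUM_LENGTH, CHARACTER_OCTET_LENGTH) for a column
    whose declared length is interpreted as a number of characters."""
    return declared_length, declared_length * mbmaxlen


MBMAXLEN = 4  # utf8 at 4 bytes per character, as in the 6.0.4 build

# Wrong declaration: a byte count (NAME_LEN) is used where characters are expected.
wrong = reported_lengths(name_len(MBMAXLEN), MBMAXLEN)
# Fixed declaration: the character count (NAME_CHAR_LEN) is used.
fixed = reported_lengths(NAME_CHAR_LEN, MBMAXLEN)

print(wrong)  # (256, 1024)
print(fixed)  # (64, 256)
```

Under these assumptions the wrong declaration reproduces the 256 / 1024 pair from the failing test output, and the fixed declaration yields the 64 / 256 pair that the commit message calls correct.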
[8 Oct 2008 11:24]
Alexander Barkov
The patch http://lists.mysql.com/commits/55723 is ok to push.
[9 Oct 2008 10:18]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

  http://lists.mysql.com/commits/55903

2862 Sergey Glukhov 2008-10-09
  Bug#33649 Is multi-byte encoding applied twice, leading to squared size ?
  Some columns are declared in a wrong way, which results in this double length multiplication.
  The correct character length should be 64, and the correct octet length should be 256.
  The fix is to use NAME_CHAR_LEN instead of NAME_LEN
[10 Nov 2008 10:52]
Bugs System
Pushed into 6.0.8-alpha (revid:sergey.glukhov@sun.com-20081009101746-5cnojb55yibo2wpp) (version source revid:sergey.glukhov@sun.com-20081009101746-5cnojb55yibo2wpp) (pib:5)
[13 Nov 2008 3:19]
Paul DuBois
Noted in 6.0.9 changelog. The ROUTINES.DATA_TYPE, REFERENTIAL_CONSTRAINTS.SPECIFIC_SCHEMA, REFERENTIAL_CONSTRAINTS.SPECIFIC_NAME, REFERENTIAL_CONSTRAINTS.PARAMETER_NAME, REFERENTIAL_CONSTRAINTS.DATA_TYPE columns were declared longer than the maximum allowed identifier length.

Description:
In the "*__datadict" tests in the 6.0.4 build (innodb__datadict, memory__datadict, myisam__datadict), I found a change (relative to the "result" file) which I attribute to "utf8" now being a 4-byte-per-char encoding (it was 3 previously). However, it seems this change (3 -> 4) may have been applied twice, and I propose this be checked.

In "memory__datadict", the diff starts like this:

***************
*** 15395,15413
  AND table_name = 'parameters'
  ORDER BY ordinal_position;
  TABLE_CATALOG TABLE_SCHEMA TABLE_NAME COLUMN_NAME ORDINAL_POSITION COLUMN_DEFAULT IS_NULLABLE DATA_TYPE CHARACTER_MAXIMUM_LENGTH CHARACTER_OCTET_LENGTH NUMERIC_PRECISION NUMERIC_SCALE CHARACTER_SET_NAME COLLATION_NAME COLUMN_TYPE COLUMN_KEY EXTRA PRIVILEGES COLUMN_COMMENT STORAGE FORMAT

For better readability, I have eliminated columns which seem to be irrelevant for this question, so the condensed diff is shown below. (Columns "TABLE_CATALOG", "TABLE_SCHEMA", "TABLE_NAME" were displayed as "NULL", "information_schema", "parameters"; columns "NUMERIC_PRECISION" and "NUMERIC_SCALE" have been removed, also "COLUMN_KEY", "EXTRA PRIVILEGES", "COLUMN_COMMENT", "STORAGE FORMAT".)

***************
*** 15395,15413
  AND table_name = 'parameters'
  ORDER BY ordinal_position;
  COLUMN_NAME ORDINAL_POSITION COLUMN_DEFAULT IS_NULLABLE DATA_TYPE CHARACTER_MAXIMUM_LENGTH CHARACTER_OCTET_LENGTH CHARACTER_SET_NAME COLLATION_NAME COLUMN_TYPE
! SPECIFIC_CATALOG 1 NULL YES varchar 4096 12288 utf8 utf8_general_ci varchar(4096)
! SPECIFIC_SCHEMA 2 NO varchar 192 576 utf8 utf8_general_ci varchar(192)
! SPECIFIC_NAME 3 NO varchar 192 576 utf8 utf8_general_ci varchar(192)
  ORDINAL_POSITION 4 0 NO int NULL NULL NULL NULL int(21)
! PARAMETER_MODE 5 NULL YES varchar 5 15 utf8 utf8_general_ci varchar(5)
! PARAMETER_NAME 6 NULL YES varchar 192 576 utf8 utf8_general_ci varchar(192)
! DATA_TYPE 7 NO varchar 192 576 utf8 utf8_general_ci varchar(192)
  CHARACTER_MAXIMUM_LENGTH 8 NULL YES int NULL NULL NULL NULL int(21)
  CHARACTER_OCTET_LENGTH 9 NULL YES int NULL NULL NULL NULL int(21)
  NUMERIC_PRECISION 10 NULL YES int NULL NULL NULL NULL int(21)
--- 15668,15686
  AND table_name = 'parameters'
  ORDER BY ordinal_position;
  COLUMN_NAME ORDINAL_POSITION COLUMN_DEFAULT IS_NULLABLE DATA_TYPE CHARACTER_MAXIMUM_LENGTH CHARACTER_OCTET_LENGTH CHARACTER_SET_NAME COLLATION_NAME COLUMN_TYPE
! SPECIFIC_CATALOG 1 NULL YES varchar 4096 2048 utf8 utf8_general_ci varchar(4096)
! SPECIFIC_SCHEMA 2 NO varchar 256 1024 utf8 utf8_general_ci varchar(256)
! SPECIFIC_NAME 3 NO varchar 256 1024 utf8 utf8_general_ci varchar(256)
  ORDINAL_POSITION 4 0 NO int NULL NULL NULL NULL int(21)
! PARAMETER_MODE 5 NULL YES varchar 5 20 utf8 utf8_general_ci varchar(5)
! PARAMETER_NAME 6 NULL YES varchar 256 1024 utf8 utf8_general_ci varchar(256)
! DATA_TYPE 7 NO varchar 256 1024 utf8 utf8_general_ci varchar(256)
  CHARACTER_MAXIMUM_LENGTH 8 NULL YES int NULL NULL NULL NULL int(21)
  CHARACTER_OCTET_LENGTH 9 NULL YES int NULL NULL NULL NULL int(21)
  NUMERIC_PRECISION 10 NULL YES int NULL NULL NULL NULL int(21)

Note that columns "SPECIFIC_SCHEMA", "SPECIFIC_NAME", "PARAMETER_NAME", and "DATA_TYPE" have had
1) their "CHARACTER_MAXIMUM_LENGTH" increased by 4/3 (192 = 64 * 3 -> 256 = 64 * 4)
*and* simultaneously
2) their "CHARACTER_OCTET_LENGTH" changed from 3 * "CHARACTER_MAXIMUM_LENGTH" to 4 * "CHARACTER_MAXIMUM_LENGTH".

I assume item 2) is necessary for their utf8 encoding, but item 1) seems to indicate that the column length already allows for multi-byte encoding.

How to repeat:
Found by looking at the test failure.

Suggested fix:
Check whether both size increases are really needed.
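The arithmetic behind the note can be checked with a short Python sketch. The (character length, octet length) pairs are copied from the SPECIFIC_SCHEMA rows of the condensed diff. The 64-character base comes from the reporter's own factoring (192 = 64 * 3, 256 = 64 * 4). The variable names are hypothetical.

```python
# Values copied from the condensed diff for SPECIFIC_SCHEMA
# (old result file vs. new test output). Names are illustrative.
NAME_CHARS = 64  # base length that 192 and 256 are multiples of

observed = {
    "utf8 = 3 bytes/char": (192, 576),
    "utf8 = 4 bytes/char": (256, 1024),
}

factors = {}
for label, (char_len, octet_len) in observed.items():
    # Express both lengths as multiples of the 64-character base.
    factors[label] = (char_len // NAME_CHARS, octet_len // NAME_CHARS)
    print(label, factors[label])
```

On these numbers the octet length is 64 times the square of the bytes-per-character value in both cases (9 = 3 * 3 and 16 = 4 * 4). That is consistent with the "squared size" in the bug title, and it suggests the multiplier was already applied twice in the old result file.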