Bug #29182 MyISAMCHK reports wrong character set
Submitted: 18 Jun 2007 19:15 Modified: 30 Mar 2008 9:43
Reporter: Peter Zaitsev (Basic Quality Contributor)
Status: Closed
Category: MySQL Server: MyISAM storage engine Severity: S3 (Non-critical)
Version: 5.1.18 OS: Any
Assigned to: Ingo Strüwing CPU Architecture: Any
Tags: qc

[18 Jun 2007 19:15] Peter Zaitsev
Description:
SHOW CREATE TABLE shows:

CREATE TABLE `charjoin_myisam_unpacked` (
  `i` varchar(10) NOT NULL,
  `c` char(10) DEFAULT NULL,
  `j` varchar(10) NOT NULL,
  KEY `i` (`i`),
  KEY `j` (`j`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 PACK_KEYS=0

At the same time, myisamchk shows:

[root@WEB02 test]# /usr/local/mysql/bin/myisamchk -dvv charjoin_myisam.MYI

MyISAM file:         charjoin_myisam.MYI
Record format:       Packed
Character set:       latin1_swedish_ci (8)
File-version:        1
Creation time:       2007-06-18 19:02:39
Recover time:        2007-06-18 19:02:41
Status:              checked,analyzed
Data records:               262144  Deleted blocks:                 0
Datafile parts:             262144  Deleted data:                   0
Datafile pointer (bytes):        6  Keyfile pointer (bytes):        6
Datafile length:           7335528  Keyfile length:           2632704
Max datafile length: 281474976710654  Max keyfile length: 288230376151710719
Recordlength:                   94

table description:
Key Start Len Index   Type                     Rec/key         Root  Blocksize
1   2     30  multip. varchar prefix                 3      1534976       1024
2   63    30  multip. varchar prefix                26      2631680       1024

Field Start Length Nullpos Nullbit Type
1     1     1
2     2     31                     varchar
3     33    30     1       1       no endspace
4     63    31                     varchar

Note that myisamchk reports the latin1 character set, not utf8 as SHOW CREATE TABLE shows.

How to repeat:
See above
[18 Jun 2007 23:33] MySQL Verification Team
Thank you for the bug report.
[23 Jan 2008 18:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/41163

ChangeSet@1.2657, 2008-01-23 19:18:18+01:00, istruewing@stella.local +3 -0
  Bug#29182 - MyISAMCHK reports wrong character set
  
  myisamchk always showed Character set: latin1_swedish_ci (8),
  regardless of what DEFAULT CHARSET the table had.
  
  When the server created a MyISAM table, it did not copy the
  character set number into the MyISAM create info structure.
  
  Added assignment of charset number to MI_CREATE_INFO.
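
The cause described in this commit message can be illustrated with a minimal, self-contained sketch. It is not the actual patch: the struct and function names below are hypothetical stand-ins, and the fallback to latin1_swedish_ci when the number is left unset is an assumption made to reproduce the symptom in the report. Only the collation numbers 8 (latin1_swedish_ci) and 33 (utf8_general_ci in 5.1) are real values.

```c
#include <assert.h>
#include <stddef.h>

/* Collation ids as used by MySQL 5.1. */
enum {
    CS_LATIN1_SWEDISH_CI = 8,   /* the value myisamchk printed in the report */
    CS_UTF8_GENERAL_CI   = 33
};

/* Hypothetical stand-in for the server-side character set descriptor. */
struct sketch_charset {
    unsigned number;
};

/* Hypothetical stand-in for MI_CREATE_INFO, reduced to one field. */
struct sketch_create_info {
    unsigned language;   /* 0 means "not set by the caller" in this sketch */
};

/* Build the create info the way the server hands it to the engine.
   with_fix == 0 models the old behavior (charset number never copied),
   with_fix != 0 models the added assignment. */
void fill_create_info(struct sketch_create_info *ci,
                      const struct sketch_charset *table_cs,
                      int with_fix)
{
    assert(ci != NULL && table_cs != NULL);
    ci->language = 0;
    if (with_fix)
        ci->language = table_cs->number;
}

/* Assumed engine-side behavior: an unset number falls back to the default. */
unsigned charset_written_to_header(const struct sketch_create_info *ci)
{
    assert(ci != NULL);
    return ci->language ? ci->language : CS_LATIN1_SWEDISH_CI;
}
```

With a utf8 table and `with_fix` set to 0, `charset_written_to_header` yields 8, which matches what myisamchk displayed; with the assignment in place it yields 33.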
[23 Jan 2008 18:31] Ingo Strüwing
The patch can also be applied to 5.0. Please advise.
[24 Jan 2008 11:28] Sergey Vojtovich
Ok to push.
[24 Jan 2008 17:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/41236

ChangeSet@1.2657, 2008-01-24 18:56:42+01:00, istruewing@stella.local +3 -0
  Bug#29182 - MyISAMCHK reports wrong character set
  
  myisamchk always showed Character set: latin1_swedish_ci (8),
  regardless of what DEFAULT CHARSET the table had.
  
  When the server created a MyISAM table, it did not copy the
  character set number into the MyISAM create info structure.
  
  Added assignment of charset number to MI_CREATE_INFO.
[25 Jan 2008 12:11] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/41253

ChangeSet@1.2785, 2008-01-25 13:11:29+01:00, istruewing@stella.local +1 -0
  Bug#29182 - MyISAMCHK reports wrong character set
  Post-merge fix.
  In 6.0 UTF-8 has a different character set number.
  In 6.0 UTF-8 characters count as four bytes instead of three.
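
The effect of the byte count on the lengths myisamchk prints can be worked out with a short sketch. The function names are hypothetical, and the one-byte length prefix for VARCHAR data of up to 255 bytes is the standard MySQL storage rule rather than something stated in this report. With three bytes per character the results match the report's output for the varchar(10) columns (key Len 30, field Length 31); the four-byte results are only what the same arithmetic predicts and are not taken from any output in this thread.

```c
#include <assert.h>

/* Bytes reserved for n_chars characters at a given maximum
   number of bytes per character. */
unsigned key_bytes(unsigned n_chars, unsigned max_bytes_per_char)
{
    assert(max_bytes_per_char >= 1);
    return n_chars * max_bytes_per_char;
}

/* VARCHAR field size: the data bytes plus a length prefix of
   1 byte (data up to 255 bytes) or 2 bytes (longer data). */
unsigned varchar_field_bytes(unsigned n_chars, unsigned max_bytes_per_char)
{
    unsigned data = key_bytes(n_chars, max_bytes_per_char);
    return data + (data <= 255 ? 1u : 2u);
}
```

For a varchar(10) column, `key_bytes(10, 3)` is 30 and `varchar_field_bytes(10, 3)` is 31, while the four-byte case gives 40 and 41. This is presumably why a test result recorded on 5.1 needed adjusting after the merge into 6.0.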
[26 Jan 2008 17:08] Ingo Strüwing
Queued to 6.0-engines, 5.1-engines
[27 Mar 2008 11:17] Bugs System
Pushed into 5.1.24-rc
[27 Mar 2008 17:48] Bugs System
Pushed into 6.0.5-alpha
[30 Mar 2008 9:43] Jon Stephens
Documented in the 5.1.23-ndb-6.3.11, 5.1.24, and 6.0.5 changelogs as follows:

        myisamchk always reported the character set for a table as
        latin1_swedish_ci (8) regardless of the table's actual character
        set.