Bug #49663 migration toolkit won't use user defined charset and collation for table migrati
Submitted: 14 Dec 2009 10:11 Modified: 16 Dec 2009 11:18
Reporter: aiso haikens Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Migration Toolkit Severity:S2 (Serious)
Version:1.1.17 OS:Windows (XP pro sp 3)
Assigned to: CPU Architecture:Any

[14 Dec 2009 10:11] aiso haikens
Description:
I am migrating a database from mssql server to mysql.

In the tab 'object mapping' I choose the User Defined options for both the schema and tables to set the charset and collation myself. Since the mssql uses latin1 collation I want the mysql db also set to charset=latin1 and collation=latin1_swedish_ci. 
But the result is that all fields of my mysql tables are incorrectly set to utf-8! (although the database itself is set to latin1.

So basically I use these 'User defined' options for the schema:
charset=latin1, collation=latin1_swedish_ci
And for the 'User defined' options for the tables:
addAutoincrement=yes, charset=latin1, collation=latin1_swedish_ci, engine=INNODB

The sql output for the database creation is (which is correct!):
CREATE DATABASE IF NOT EXISTS `manifesta_umanif44`
  CHARACTER SET latin1 COLLATE latin1_swedish_ci

But the fields of tables is set incorrectly to utf-8 like this:
CREATE TABLE `manifesta_umanif44`.`Artists` (
  `ID` INT(10) NOT NULL AUTO_INCREMENT,
  `Table_ID` VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL,

How to repeat:
1. migrate a db from mssql to mysql
2. set 'User defined' option for tables to addAutoincrement=yes, charset=latin1, collation=latin1_swedish_ci, engine=INNODB
3. sql output will incorrectly show utf-8 charset for all fields of all tables.
[15 Dec 2009 9:45] Susanne Ebrecht
Many thanks for writing a bug report.

The described problem is an expected behaviour and not a bug.

If you change table default charset/collation then only new columns will get this default charset the existing columns stay in charset that they have had before.

If you want to change charset/collation for the columns then you need to change this explicit column by column.
[15 Dec 2009 10:27] aiso haikens
Ok, based on your comment I conclude the following (which I totally do not comprehend):
- it is absolutely useless to set the charset&collation on either database or table since the migration tool takes the standard utf-8 on each column while migrating data!

I am sure this is not the case, so either this conclusion is flawed or the behaviour of the tool doesn't make sense.

I am aware that it has a use in the way that it will serve as a default for new fields that are added by myself by hand, but we are talking about migrating data in some charset and collation by the tool itself not by hand.

And it would make sense if the original mssql db had utf-8 on each column, but I just checked and it is not the case: each column's collation of my mssql db is set to SQL_Latin1_General_CP1_CI_AS.

So basically, I have the same problem as before: why doesn't the migration tool use the collation on each column from the original mssql db which is set to SQL_Latin1_General_CP1_CI_AS? (or does the tool maybe convert the characters from latin1 to utf-8 while migrating???).
Or if this is not possible: where can I set charset and collation on each column to preserve the text while migrating? Since I do not see where to set this I think this is still a bug.
[16 Dec 2009 11:18] Susanne Ebrecht
The conversion will happen by automatism.