Bug #37028 problem with international chars in tables
Submitted: 28 May 2008 5:41 Modified: 17 Jun 2008 14:19
Reporter: c. hoehne Email Updates:
Status: Can't repeat Impact on me:
Category:MySQL Migration Toolkit Severity:S1 (Critical)
Version:1.1.12 OS:Any
Assigned to: CPU Architecture:Any

[28 May 2008 5:41] c. hoehne
When migrating from MS Access to MySQL a problem occurs.
Creation of tables and relationships succeeds.
But when it comes to migrate the contents of a table only fields with ascii chars in it are migrated successfull.
Fields with international charachters are truncated.
testütest (testütest)
will be migrated as only test

If the field is a primary key or a  field with a unique restriction there is a chance that the complete table is not migrated because there are duplicate keys (due to truncation)

The migration process falsely reports no problem

How to repeat:
Create a small access db with one table and one field make this field the primary key.
enter some values in the table do not forget to mix international keys in it.

Try to migrate
[28 May 2008 7:24] c. hoehne
Have uploaded bug-data-37028.zip to ftp://ftp.mysql.com/pub/mysql/upload/
The zip file contains a ms access Database for testing purpose.

I have discoverd that this bug does not occur with very small tables (e.g. i have tried to create the same test database with 10 rows per table. But this db migrates well)
[28 May 2008 12:55] MySQL Verification Team
Thank you for the bug report. Could you please start the server with the option: --max_allowed_packet=16M i.e:

c:\dbs>5.0\bin\mysqld --standalone --console --max_allowed_packet=16M
080528  9:41:07  InnoDB: Started; log sequence number 0 45344
080528  9:41:07 [Note] 5.0\bin\mysqld: ready for connections.
Version: '5.0.62-nt'  socket: ''  port: 3306  Source distribution

and try again. Thanks in advance.
[29 May 2008 5:53] c. hoehne
Thanks. I tried your suggestion.
After starting mysql i checked the system variables. The variable was correctly set.
But even with my test database, i get the same behaviour. Fields are truncated
But i discovered that i have overlooked this in the 'migration complete' report:

          Duplicate entry 'DEU; Location: Büchen; ' for key 1
          0 row(s) transfered.

          Duplicate entry 'DEU; Location: Büchen; -GBIF' for key 1
          0 row(s) transfered.
[30 May 2008 15:17] MySQL Verification Team
Thank you for the feedback. Please point a row which was truncated on your test and then I will check here. Thanks in advance.
[1 Jun 2008 9:02] c. hoehne
Please search in the ms access database in table LOCATION for primary key rows starting with "DEU; Location: B". You will discover that some rows are the same when  they are compared upto the first german umlaut. These are the rows which get truncated.

As previously described, if i put these rows in a very small table(upto approx. 10 rows) they get migrated well.
Very dubious.
[13 Jun 2008 11:49] c. hoehne
A shorter access DB with the problem

Attachment: testbug37028short.zip (application/zip, text), 245.01 KiB.

[13 Jun 2008 12:04] c. hoehne
I have attached a Zip file which contains a ms access DB with only one Table.
The table LOCATION does only consists of  203 Rows. The interesting field of the table is Location.
When this db is migrated this error is displayed at the end of the process:
          Duplicate entry 'DEU; Location: Büchen; ' for key 1
          0 row(s) transfered.

Because of the fact that the field LocationID is a primary key there are no duplicates in the table, for sure.
Row 73 contains the primary key 'DEU; Location: Bäck;'
but this is truncated to 'DEU; Location: B;'
When the System want to enter row 164 with key 'DEU; Location: Büchen;' this also gets truncated and hence the error message.

We have to migrate some large Access DBs with infos about our problem domain (vcrop wild relatives) So this bug really blocks the migration process. We are quite sure that this is not limited to german umlauts. It happens with all characthers not in the ascii charset.
What we need is astatement if this bug can be reproduced and if so, how long does it take to fix it.

[16 Jun 2008 20:57] MySQL Verification Team
Thank you for the feedback. I can't repeat below the report of the migration process:

-- MySQL Migration Toolkit Report                                             --
--                                                                            --
-- Title:   Summary Of The Migration Process                                  --
-- Date:    2008-06-16 17:54                                                  --

1. Schema Migration

  Number of migrated schemata: 1

  Schema Name:            testbug37028short
  - Tables:               1
  - Views:                0
  - Routines:             0
  - Routine Groups:       0
  - Synonyms:             0
  - Structured Types:     0
  - Sequences:            0


  - Tables
  - Views
  - Routines
  - Routine Groups
  - Synonyms
  - Structured Types
  - Sequences

2. Data Bulk Transfer

          203 row(s) transfered.

End of report.
[17 Jun 2008 14:19] c. hoehne
Have you done any configuration in MS Access?
The process depends apparently on the size of the database. Have you tested it with the larger db in the upload area? 
The attached one is only a subset from the larger one.

You can not reproduce, so we can not migrate our access databases with the migration tool.
Luckily we have an self programmed VBA function which does the job of migration , except for the schema creation. So our workaround is now:
1. Create schema with MySql Migrationtoolkit, but do not transfer any data.
2. Transfer the data with our own tool.