Bug #67180 utf16 conversion error in db_mssql_grt.py
Submitted: 10 Oct 2012 20:27 Modified: 18 Jan 2017 22:22
Reporter: Mark Sizer Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Workbench: Migration Severity:S3 (Non-critical)
Version:5.2.44 CE OS:Windows (7)
Assigned to: CPU Architecture:Any
Tags: db_mssql_grt.py, error, migration, python, sql server, utf16

[10 Oct 2012 20:27] Mark Sizer
Description:
The migration tool fails on the "Reverse engineer selected schemata" step with a python error decoding UTF16. It is not clear _why_ it is trying to do anything in UTF16 to begin with. To the best of my knowledge, everything involved (MySQL and SQL Server side) is Latin1 encoded.

This is very similar to 66030, but a few lines later.

I tried various hacks on the Python code (including commenting out the "add_output_converter" lines) with no luck.

ODBC driver is "SQL Server". I have tried both SQL Server 2005 on Windows 2003 Server and SQL Server 2008 on Windows 2008. All source servers and databases are Latin1 collations.

The Log
- Reverse engineering catalog information
- Reverse engineering User Data Types...
- Preparing...
- Gathered stats for dbo
Reverse engineering 21 tables from dbo

Traceback (most recent call last):

  File "C:\Program Files (x86)\MySQL\MySQL Workbench CE 5.2.44\modules\db_mssql_grt.py", line 384, in reverseEngineer

    reverseEngineerTables(connection, schema)

  File "C:\Program Files (x86)\MySQL\MySQL Workbench CE 5.2.44\modules\db_mssql_grt.py", line 545, in reverseEngineerTables

    table_names = [(row[0], row[1]) for row in execute_query(connection, query % schema.name) ]

  File "C:\Program Files (x86)\MySQL\MySQL Workbench CE 5.2.44\modules\db_mssql_grt.py", line 130, in <lambda>

    con.add_output_converter(-150, lambda value: value if value is None else value.decode("utf-16"))

  File "C:\Program Files (x86)\MySQL\MySQL Workbench CE 5.2.44\Python\Lib\encodings\utf_16.py", line 16, in decode

    return codecs.utf_16_decode(input, errors, True)

UnicodeDecodeError: 'utf16' codec can't decode byte 0x2e in position 38: truncated data

Traceback (most recent call last):

  File "C:\Program Files (x86)\MySQL\MySQL Workbench CE 5.2.44\workbench\wizard_progress_page_widget.py", line 191, in thread_work

    self.func()

  File "C:\Program Files (x86)\MySQL\MySQL Workbench CE 5.2.44\modules\migration_schema_selection.py", line 160, in task_reveng

    self.main.plan.migrationSource.reverseEngineer()

  File "C:\Program Files (x86)\MySQL\MySQL Workbench CE 5.2.44\modules\migration.py", line 332, in reverseEngineer

    self.state.sourceCatalog = self._rev_eng_module.reverseEngineer(self.connection, self.selectedCatalogName, self.selectedSchemataNames, self.state.applicationData)

SystemError: UnicodeDecodeError("'utf16' codec can't decode byte 0x2e in position 38: truncated data"): error calling Python module function DbMssqlRE.reverseEngineer

ERROR: Reverse engineer selected schemata: UnicodeDecodeError("'utf16' codec can't decode byte 0x2e in position 38: truncated data"): error calling Python module function DbMssqlRE.reverseEngineer

Failed

How to repeat:
It happens with all SQL Server databases I have tried (4 of them). 

Just "Next" through the migration tool. I usually picked "Only one schema" from the "Schemata Selection" page, but I tried the others, too.
[25 Oct 2012 16:41] Armando Lopez Valencia
Hello Mark.
Can you please provide a copy of a MSSQL DB where you can reproduce this issue?
Along with some information of your environment such as: MySQL Server Version.
Thanks.
[25 Oct 2012 20:33] Mark Sizer
For the source SQL Server DB, give me a couple of days to get a sanitized version together.

As for the MySQL information:
MySQL Workbench 5.2 CE
MySQL Server 5.5

It all came from the mysql-installer-community-5.5.28.2.msi installer.
[1 Nov 2012 15:49] Mark Sizer
bug-67180-sql2005db.BAK.zip (with README.txt) uploaded to ftp.oracle.com/support/incoming. It's about 3MB, compressed.
[7 Nov 2012 15:19] Sergio Andres De La Cruz Rodriguez
Hi Mark:

Thanks for your bug report. As of the encoding, it's not how you encode your data in your source RDBMS but rather how we receive it from the ODBC driver. This encoding is defined in the ODBC specification as UCS2 which is superseded by UTF16 (more here: http://en.wikipedia.org/wiki/UTF-16).

From what I see in the traceback you posted, it seems that at least one of your tables contains non-ASCII characters. (Can you confirm this?) We don't support that for the moment, but we'll be working on having this ready ASAP.

I'll be downloading the sample DB you submitted to try to reproduce this issue.
[7 Nov 2012 15:43] Mark Sizer
I don't know if it survived the "sanitation" sweep I did on the uploaded database, but, yes, there is some non-Latin data in the database.

I have no idea if it will work here, but there is one row in the pss.SymbolSearchQueryParameter table that contains "迷你铜" as the SymbolSearchParameterValue.
[7 Nov 2012 18:40] Sergio Andres De La Cruz Rodriguez
Hmm... There's no problem AFAIK with non ASCII chars in the table data. That should be supported. What we don't support now is non ASCII chars in database object names (schema, table and column names, stored procedures/functions names, etc.). Do you have those too?
[7 Nov 2012 18:46] Mark Sizer
There _should_ not be any non-ASCII data in any schema-related names. Everything is checked into CVS, which gets very angry with non-ASCII.

It is vaguely possible that something was saved as ANSI - Windows does that some times - but it is not likely.

If there is anything, it should be in the uploaded backup. I just deleted data to make it smaller. I didn't drop or change any schema information.
[4 Jul 2013 13:29] Sergio Andres De La Cruz Rodriguez
Hi Mark:

I've tried migrating several MS SQL Server databases containing international characters without any issue using MySQL Workbench 6.0.2 beta. We have had several changes in our code since this bug was reported.

Can you confirm if you're still getting this in the most recent WB beta?
[31 Jul 2013 20:49] Mark Sizer
Exact same problem with 6.04 beta.
- Preparing...
- Gathered stats for dbo
- Gathered stats for pss
Reverse engineering 20 tables from dbo

Traceback (most recent call last):

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\modules\db_mssql_grt.py", line 384, in reverseEngineer

    reverseEngineerTables(connection, schema)

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\modules\db_mssql_grt.py", line 545, in reverseEngineerTables

    table_names = [(row[0], row[1]) for row in execute_query(connection, query % schema.name) ]

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\modules\db_mssql_grt.py", line 130, in <lambda>

    con.add_output_converter(-150, lambda value: value if value is None else value.decode('utf-16'))

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\Python\Lib\encodings\utf_16.py", line 16, in decode

    return codecs.utf_16_decode(input, errors, True)

UnicodeDecodeError: 'utf16' codec can't decode byte 0x2e in position 38: truncated data

Traceback (most recent call last):

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\workbench\wizard_progress_page_widget.py", line 192, in thread_work

    self.func()

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\modules\migration_schema_selection.py", line 160, in task_reveng

    self.main.plan.migrationSource.reverseEngineer()

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\modules\migration.py", line 335, in reverseEngineer

    self.state.sourceCatalog = self._rev_eng_module.reverseEngineer(self.connection, self.selectedCatalogName, self.selectedSchemataNames, self.state.applicationData)

SystemError: UnicodeDecodeError("'utf16' codec can't decode byte 0x2e in position 38: truncated data"): error calling Python module function DbMssqlRE.reverseEngineer

ERROR: Reverse engineer selected schemata: UnicodeDecodeError("'utf16' codec can't decode byte 0x2e in position 38: truncated data"): error calling Python module function DbMssqlRE.reverseEngineer

Failed

I notice it uses local ODBC drivers. I'm running Windows 7 Pro (up-to-date on Updates, as far as I know). The ODBC libraries are 6.1.7600 or 6.1.7601 versions.
[7 Aug 2013 0:30] Alfredo Kojima
Can you attach the output of the following MSSQL query?

SELECT t.name, p.value
FROM sys.tables t LEFT JOIN sys.extended_properties p ON p.major_id = t.object_id AND p.minor_id = 0 AND p.name = 'MS_Description' AND p.class_desc = 'OBJECT_OR_COLUMN'
WHERE schema_id = SCHEMA_ID('%s')

Replace %s with the name of the schema(s) you're trying to migrate. That is the query that is failing in your migration attempt.
[13 Aug 2013 14:14] Mark Sizer
I think I found it. Among other things, this is in the query results:
name	                              value
NewsUsageStatsAggregationTypeCdLu     For statistics with arguments, what the argument is. For example, a date/time might be a 'Begin' or an 'End'.

The single-quotes/apostrophes are most likely the problem. There is also this:

Statistic     A reported statistic "header" metadata.

Good luck (and I HATE this site; every time I respond I must reset my password because the rules make it impossible to remember - just saying),
-Mark
[26 Dec 2013 23:16] Frederic Viennot
I have the same issue. I did run the query mentioned earlier which gives an empty result.

- Retrieving table dbo.uniUnit...

Traceback (most recent call last):

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\modules\db_mssql_grt.py", line 384, in reverseEngineer

    reverseEngineerTables(connection, schema)

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\modules\db_mssql_grt.py", line 556, in reverseEngineerTables

    reverseEngineerTableColumns(connection, table)

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\modules\db_mssql_grt.py", line 616, in reverseEngineerTableColumns

    for row in rows:

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\modules\db_mssql_grt.py", line 130, in <lambda>

    con.add_output_converter(-150, lambda value: value if value is None else value.decode('utf-16'))

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\Python\Lib\encodings\utf_16.py", line 16, in decode

    return codecs.utf_16_decode(input, errors, True)

UnicodeDecodeError: 'utf16' codec can't decode byte 0x29 in position 86: truncated data

Traceback (most recent call last):

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\workbench\wizard_progress_page_widget.py", line 192, in thread_work

    self.func()

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\modules\migration_schema_selection.py", line 160, in task_reveng

    self.main.plan.migrationSource.reverseEngineer()

  File "C:\Program Files (x86)\MySQL\MySQL Workbench 6.0 CE\modules\migration.py", line 353, in reverseEngineer

    self.state.sourceCatalog = self._rev_eng_module.reverseEngineer(self.connection, self.selectedCatalogName, self.selectedSchemataNames, self.state.applicationData)

SystemError: UnicodeDecodeError("'utf16' codec can't decode byte 0x29 in position 86: truncated data"): error calling Python module function DbMssqlRE.reverseEngineer

ERROR: Reverse engineer selected schemata: UnicodeDecodeError("'utf16' codec can't decode byte 0x29 in position 86: truncated data"): error calling Python module function DbMssqlRE.reverseEngineer

Failed
[13 Jan 2014 21:21] Juan Luis Robles Vicente
Realizando una instalación mediante http://dev.mysql.com/downloads/installer/
Me da el mismo error: 
Reverse engineering 205 tables from dbo

Traceback (most recent call last):

  File "C:\Archivos de programa\MySQL\MySQL Workbench CE 6.0.8\modules\db_mssql_grt.py", line 384, in reverseEngineer

    reverseEngineerTables(connection, schema)

  File "C:\Archivos de programa\MySQL\MySQL Workbench CE 6.0.8\modules\db_mssql_grt.py", line 545, in reverseEngineerTables

    table_names = [(row[0], row[1]) for row in execute_query(connection, query % schema.name) ]

  File "C:\Archivos de programa\MySQL\MySQL Workbench CE 6.0.8\modules\db_mssql_grt.py", line 130, in <lambda>

    con.add_output_converter(-150, lambda value: value if value is None else value.decode('utf-16'))

  File "C:\Archivos de programa\MySQL\MySQL Workbench CE 6.0.8\Python\Lib\encodings\utf_16.py", line 16, in decode

    return codecs.utf_16_decode(input, errors, True)

UnicodeDecodeError: 'utf16' codec can't decode byte 0x73 in position 24: truncated data

Traceback (most recent call last):

  File "C:\Archivos de programa\MySQL\MySQL Workbench CE 6.0.8\workbench\wizard_progress_page_widget.py", line 192, in thread_work

    self.func()

  File "C:\Archivos de programa\MySQL\MySQL Workbench CE 6.0.8\modules\migration_schema_selection.py", line 160, in task_reveng

    self.main.plan.migrationSource.reverseEngineer()

  File "C:\Archivos de programa\MySQL\MySQL Workbench CE 6.0.8\modules\migration.py", line 353, in reverseEngineer

    self.state.sourceCatalog = self._rev_eng_module.reverseEngineer(self.connection, self.selectedCatalogName, self.selectedSchemataNames, self.state.applicationData)

SystemError: UnicodeDecodeError("'utf16' codec can't decode byte 0x73 in position 24: truncated data"): error calling Python module function DbMssqlRE.reverseEngineer

ERROR: Reverse engineer selected schemata: UnicodeDecodeError("'utf16' codec can't decode byte 0x73 in position 24: truncated data"): error calling Python module function DbMssqlRE.reverseEngineer

Failed
[27 Jan 2014 14:38] Milosz Bodzek
I'm trying reproduce this bug, but i've trouble to get access archived files. Can anyone upload a copy of a MSSQL DB where i can reproduce this issue?
If is bigger than 3mb, you can send mi directly on e-mail.
[12 Jun 2014 9:18] Steven MOORE
Just a comment that there is a very common open source project called SQL# (www.sqlsharp.com) that adds functions and stored procs to MSSQL.  The utility adds a schema called SQL#.  The migration fails because of the # sign.  When I uninstalled SQL#, the migration worked. 

Uninstall SQL# with this command:  EXEC SQL#.SQLsharp_Uninstall
[26 Aug 2015 12:58] MySQL Verification Team
Please try version 6.3.4. Thanks.
[27 Sep 2015 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[6 Dec 2016 7:40] toby tt
When i migrate data from mssql2014 to mysql【5.7.16】.get following error.  can anybody help me?
Traceback (most recent call last):
  File "E:\app\dev\MySQL Workbench 6.3.8 CE (winx64)\workbench\wizard_progress_page_widget.py", line 192, in thread_work
    self.func()
  File "E:\app\dev\MySQL Workbench 6.3.8 CE (winx64)\modules\migration_schema_selection.py", line 175, in task_reveng
    self.main.plan.migrationSource.reverseEngineer()
  File "E:\app\dev\MySQL Workbench 6.3.8 CE (winx64)\modules\migration.py", line 369, in reverseEngineer
    self.state.sourceCatalog = self._rev_eng_module.reverseEngineer(self.connection, self.selectedCatalogName, self.selectedSchemataNames, self.state.applicationData)
SystemError: UnicodeDecodeError("'utf16' codec can't decode bytes in position 6-7: illegal UTF-16 surrogate"): error calling Python module function DbMssqlRE.reverseEngineer
ERROR: Reverse engineer selected schemas: UnicodeDecodeError("'utf16' codec can't decode bytes in position 6-7: illegal UTF-16 surrogate"): error calling Python module function DbMssqlRE.reverseEngineer
Failed
[18 Dec 2016 22:22] MySQL Verification Team
Please try version 6.3.8. Thanks.
[19 Jan 2017 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".