MySQL Bugs: #64447: Value editor (Edit table data) cannot save Unicode data, no warning/feedback

Bug #64447	Value editor (Edit table data) cannot save Unicode data, no warning/feedback
Submitted:	24 Feb 2012 18:21	Modified:	27 Jul 2012 8:05
Reporter:	Craig Fowler
Status:	Closed
Category:	MySQL Workbench: SQL Editor	Severity:	S3 (Non-critical)
Version:	5.2.38	OS:	Linux (Debian using 11.04 .deb installer)
Assigned to:		CPU Architecture:	Any
Tags:	edit-table-data, Unicode

Description:
A somewhat obscure issue, but: when using the SQL editor to "Edit table data", if you enter any Unicode data into the Value editor and then hit Apply, the data is not saved correctly.

This seems to cause some kind of encoding error in the Table data editor that (in turn) causes the row to be skipped when "Apply" is used to save the edits.

I can't reproduce this when not using the value editor, though; typing/pasting Unicode directly into the edit table data UI works just fine.

Apologies in advance for the wall of text that will be the "how to reproduce" steps :)

How to repeat:
Execute the following SQL in an empty schema:

-- --- Begin SQL script

CREATE TABLE character_test (
character_test_id INT,
latin_text TEXT CHARACTER SET latin1,
utf8_text TEXT CHARACTER SET utf8,
PRIMARY KEY(character_test_id)
);

INSERT INTO character_test (character_test_id, latin_text, utf8_text)
VALUES (1, '', '');

SELECT * FROM character_test;

-- --- End SQL script

This will bring up an editable version of that table, containing a row with PK 1 and empty TEXT columns.

Right-click the latin_text column and choose "Open Value in Editor". (This column shouldn't be capable of storing UTF-8 characters anyway, but read on: it happens on the utf8_text column as well.) Type or paste in a Unicode character. On Linux I used the compose-key sequence [<compose> o,c] to get the © character. I have also been able to repeat it with the few other characters outside the ASCII range that I could think of, such as ½ and å.

As an aside: curiously, I cannot repeat it with the ™ character. As I type/paste it, it gets replaced with the character sequence "(TM)". The same is true of the € character, which gets replaced with "EUR". It is as if the text-editor component in use has some kind of configured list of replacements that are made on the fly.
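To illustrate the kind of on-the-fly replacement described above, here is a minimal Python sketch. It is not Workbench code: the `REPLACEMENTS` table and the `substitute` helper are hypothetical, and the table holds only the two substitutions observed in this report. It shows why such characters would sidestep the bug, since the replaced text is plain ASCII, while a character like © passes through unchanged.

```python
# Hypothetical replacement table: only the two substitutions seen in this report.
REPLACEMENTS = {"™": "(TM)", "€": "EUR"}


def substitute(text: str) -> str:
    """Replace each listed character with its ASCII stand-in."""
    return text.translate(str.maketrans(REPLACEMENTS))


replaced = substitute("Acme™ costs 5€")
untouched = substitute("© Acme")

# The replaced text contains only ASCII characters; the © is left alone.
print(replaced, all(ord(ch) < 128 for ch in replaced))
print(untouched, all(ord(ch) < 128 for ch in untouched))
```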

As the text is pasted/typed into the value editor, everything appears to be fine: there are no visible rendering issues and no sign that anything is wrong. After clicking "Apply" in the value editor and closing it, the first sign that something is amiss MIGHT be visible in the main edit table data UI, but only if the Unicode character(s) are near the beginning of the entered text. In that case, any affected Unicode characters are rendered there as a cross-within-a-rectangle character. However, if you have pasted/typed a lot of content, it is quite possible that any rendering errors are "below the fold", past the point at which the table editor UI truncates the data and shows "...".

If you then click "Apply" on the edit table data UI you will notice that no SQL is generated for updating the table. This encoding issue stops the edit-table-data system from generating an INSERT or UPDATE statement for any affected rows, although it does not (at this stage) show any kind of error message.

There is an issue here: If you are editing/inserting only a few rows then perhaps you might not notice that not all of the edits you made are being carried over in the INSERT/UPDATE statement.

Also, if (instead of clicking Apply on the Edit table data UI) you re-open the value editor against that column of data, which now contains one or more invalid Unicode characters, you will see that the value editor refuses to render anything and shows the warning feedback "Data could not be converted to UTF-8 text: Invalid byte sequence in conversion input". This is the only scenario in which I could get any kind of definitive warning/feedback that something was wrong. It is only possible to see this by closing and re-opening the value editor after inserting one or more Unicode characters that cause this error.
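One plausible reading of the "Invalid byte sequence" message is that the value editor hands back text in a single-byte encoding such as Latin-1 while the rest of the table data editor expects UTF-8. This is an assumption: the report does not identify the cause. The minimal Python sketch below is not Workbench code, and the `is_valid_utf8` helper is hypothetical. It only shows that the characters named above, once encoded as Latin-1, are not valid UTF-8 byte sequences.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The characters from this report that trigger the problem.
for ch in ("©", "½", "å"):
    as_utf8 = ch.encode("utf-8")      # two bytes, e.g. b'\xc2\xa9' for ©
    as_latin1 = ch.encode("latin-1")  # one byte, e.g. b'\xa9' for ©
    print(ch, is_valid_utf8(as_utf8), is_valid_utf8(as_latin1))
```

Under this assumption, the single Latin-1 byte is what a UTF-8 decoder would reject, which would match both the error text and the cross-within-a-rectangle rendering in the table view.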

Finally, I at first thought this was because the character set of the column I was editing was not Unicode (and thus it was going to have problems saving Unicode data anyway). If you repeat these steps editing the utf8_text column, you will find that it behaves the same way. So it has nothing to do with the character encoding of the target column that holds the data.

Thanks for the report Craig.
Reproduced as described by the reporter.
WB 5.2.38 rev 8753
Ubuntu 11.10x64

Same issue in Windows 7. The table view shows Unicode characters correctly; open the value editor and Unicode characters are displayed as "?". Typing or pasting Unicode characters in the editor results in "?" in the editor.

Fixed as of Workbench 5.2.41, and here's the changelog entry:

The Edit table data SQL editor option would not properly display or save Unicode characters.