Bug #62453 Workbench deletes 4byte utf8 symbols without notification and inserts the string
Submitted: 16 Sep 2011 20:33 Modified: 17 Sep 2011 9:49
Reporter: denixx baykin
Status: Not a Bug
Category:MySQL Workbench Severity:S3 (Non-critical)
Version:5.2.34 OS:Windows (Win XP x32)
Assigned to: CPU Architecture:Any
Tags: UTF-8, utf8, utf8mb4

[16 Sep 2011 20:33] denixx baykin
Description:
Workbench silently strips 4-byte UTF-8 characters and inserts the string without them.
JDBC reports an error for the same insert.

How to repeat:
I'll attach a file with instructions.
[16 Sep 2011 20:37] denixx baykin
Here are the instructions. Run the statements one at a time.

Attachment: Workbench utf8 4byte.txt (text/plain), 781 bytes.

[16 Sep 2011 20:40] denixx baykin
I used mysql-5.5.15-win32.
[17 Sep 2011 7:34] Peter Laursen
You have utf8 as the default charset for the database and create the table without a charset specification, so the table/column will not handle 4-byte utf8 characters.
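
To see which character set a column actually inherited, and to declare a 4-byte-capable one explicitly, something like the following could be used. It is only a sketch: it assumes a server where utf8mb4 exists (MySQL 5.5.3 or later) and reuses the `test`.`testutf8mb4` table from the reproduction in this thread.

```sql
-- Show the table definition, including the charset inherited from the database default.
SHOW CREATE TABLE `test`.`testutf8mb4`;

-- Per-column view of the same information.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 'testutf8mb4';

-- Convert the existing table so its character columns can store 4-byte characters.
ALTER TABLE `test`.`testutf8mb4` CONVERT TO CHARACTER SET utf8mb4;
```

If the column still reports utf8 after the first two statements, 4-byte characters cannot be stored in it regardless of the client used.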

But SQL_MODE has an influence here:

CREATE  TABLE `test`.`testutf8mb4` (
  `testValue` VARCHAR(50) NOT NULL ,
  PRIMARY KEY (`testValue`) );

USE `test`;

SET SQL_MODE = '';

INSERT INTO testutf8mb4 VALUES ('Сарнаут✔
[17 Sep 2011 7:44] denixx baykin
@Peter Laursen
Please attach a text file with the instructions. This MySQL bug tracker cannot store 4-byte characters in messages, but it can in attached files.
[17 Sep 2011 7:44] Peter Laursen
Haha! This bug system runs on MySQL 5.1 and we cannot write 4-byte utf8 characters here! They are not saved, and the complete post is truncated exactly as in denixx's example!

The point is SQL_MODE:

SET SQL_MODE = ''; INSERT .. << truncates
SET SQL_MODE = 'strict_all_tables'; INSERT .. << returns an error.
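
The two cases can be compared in a single session along these lines. This is a hedged sketch, not the reporter's attachment: the inserted values are hypothetical, the 4-byte character used is U+1D11E, and the table is assumed to still have a utf8 (3-byte) column as in the reproduction.

```sql
-- Non-strict mode: per this thread, the value is stored truncated.
SET SESSION sql_mode = '';
INSERT INTO `test`.`testutf8mb4` VALUES ('abc𝄞def');  -- hypothetical value containing a 4-byte character
SHOW WARNINGS;                                         -- inspect anything the server reported for the INSERT
SELECT `testValue` FROM `test`.`testutf8mb4`;          -- see what was actually stored

-- Strict mode: per this thread, the same kind of INSERT returns an error instead.
SET SESSION sql_mode = 'STRICT_ALL_TABLES';
INSERT INTO `test`.`testutf8mb4` VALUES ('xyz𝄞def');  -- hypothetical value, distinct to avoid a key clash
```

Running SHOW WARNINGS right after the non-strict INSERT is the part most easily missed, since a GUI client may not surface warnings on its own.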

Peter
(not a MySQL person)

BTW: how about running this bug system in strict mode instead?
[17 Sep 2011 7:45] Peter Laursen
@Denixx: I just found out the same.  See my last post! :-)
[17 Sep 2011 8:02] denixx baykin
Yes, with SET SQL_MODE = 'strict_all_tables'; it returns an error, thanks.
[17 Sep 2011 8:19] Valeriy Kravchuk
So, do you still think there is a bug here?
[17 Sep 2011 8:30] denixx baykin
I think it's something like an unexpected feature, but it still needs fixing because Workbench and Connector/J behave differently.
[17 Sep 2011 8:56] denixx baykin
Hm, now JDBC allows inserting 4-byte characters into the table field but changes them to "?".
I'm slightly confused.
Maybe I did not test it more than once.
[17 Sep 2011 9:17] Peter Laursen
@Denixx: do you "SET NAMES utf8mb4"?  I think that both character_set_client and character_set_connection (at least) will have to be utf8mb4.
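
A quick way to check this from any client session is sketched below. It assumes a server that knows utf8mb4 (MySQL 5.5.3 or later); nothing here is specific to Workbench or Connector/J.

```sql
-- Set the client, connection, and results character sets for this session.
SET NAMES utf8mb4;

-- Confirm what the session is now using.
SHOW VARIABLES LIKE 'character_set_%';
```

After SET NAMES, character_set_client, character_set_connection, and character_set_results should all show utf8mb4. The server and database defaults are listed too, but SET NAMES does not change them.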

A private comment: originally, in the early 5.5 betas, the old utf8 charset was renamed to utf8mb3 and utf8 was 4-byte. That was changed for some reason I don't know (maybe upgrade problems). But the first implementation of 4-byte utf8 was definitely the best. What we have now is a mess.
[17 Sep 2011 9:20] denixx baykin
No, everything is okay.
I just dropped the schema and restarted the server.
Then I reapplied the changes as in the example, and JDBC throws an exception again.