Description:
"create view" and "create table" can derive column names from the value of the column;
create table t1 select 'A'; desc t1;
+-------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| A | varchar(1) | NO | | | |
+-------+------------+------+-----+---------+-------+
create view v1 as select 'A'; desc v1;
+-------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| A | varchar(1) | NO | | | |
+-------+------------+------+-----+---------+-------+
If the value is given in ucs2 ( _ucs2 0x0041 ), then the column ends up with an almost random name.
In fact the name seems to be created as follows:
- takee the ucs2 string as byte sequence
- remove leading 00 bytes (usually just the first byte)
- take the remining (shifted) string as valid ucs2, and convert to utf8
- the utf8 value is the derived column name
This deosnt appear very sensible?
In how to repeat the column name is displayed (on the client) as "?" or "??". Trying to "select `??` from v1" does of course not work
How to repeat:
set names latin1;
set character_set_connection = latin1;
drop view if exists v1; create view v1 as select _ucs2 0x004100420043;
select * from v1;
+-----+
| ?? |
+-----+
| ABC | # this is actually a ucs2 string, but it got converted to charset_client
+-----+
hexdump -C v1.frm
00000000 54 59 50 45 3d 56 49 45 57 0a 71 75 65 72 79 3d |TYPE=VIEW.query=|
00000010 73 65 6c 65 63 74 20 5f 75 63 73 32 5c 27 5c 5c |select _ucs2\'\\|
00000020 30 41 5c 5c 30 42 5c 5c 30 43 5c 27 20 41 53 20 |0A\\0B\\0C\' AS |
00000030 60 e4 84 80 e4 88 80 60 0a 6d 64 35 3d 32 63 39 |`......`.md5=2c9|
00000040 62 31 36 34 32 31 62 31 34 37 66 65 33 62 66 38 |b16421b147fe3bf8|
00000050 38 61 36 36 36 32 30 65 62 36 32 61 30 0a 75 70 |8a66620eb62a0.up|
00000060 64 61 74 61 62 6c 65 3d 30 0a 61 6c 67 6f 72 69 |datable=0.algori|
00000070 74 68 6d 3d 30 0a 64 65 66 69 6e 65 72 5f 75 73 |thm=0.definer_us|
00000080 65 72 3d 72 6f 6f 74 0a 64 65 66 69 6e 65 72 5f |er=root.definer_|
00000090 68 6f 73 74 3d 6c 6f 63 61 6c 68 6f 73 74 0a 73 |host=localhost.s|
000000a0 75 69 64 3d 32 0a 77 69 74 68 5f 63 68 65 63 6b |uid=2.with_check|
000000b0 5f 6f 70 74 69 6f 6e 3d 30 0a 72 65 76 69 73 69 |_option=0.revisi|
000000c0 6f 6e 3d 31 0a 74 69 6d 65 73 74 61 6d 70 3d 32 |on=1.timestamp=2|
000000d0 30 30 37 2d 30 37 2d 30 35 20 32 30 3a 31 35 3a |007-07-05 20:15:|
000000e0 30 34 0a 63 72 65 61 74 65 2d 76 65 72 73 69 6f |04.create-versio|
000000f0 6e 3d 31 0a 73 6f 75 72 63 65 3d 73 65 6c 65 63 |n=1.source=selec|
00000100 74 20 5f 75 63 73 32 20 30 78 30 30 34 31 30 30 |t _ucs2 0x004100|
00000110 34 32 30 30 34 33 0a |420043.|
At 00000031: e4 84 80 e4 88 80
select hex(convert( _utf8 0xe48480e48880 using ucs2));
+------------------------------------------------+
| hex(convert( _utf8 0xe48480e48880 using ucs2)) |
+------------------------------------------------+
| 41004200 |
+------------------------------------------------+
# the original ucs2 string, but the 1st 00 byte is missing
drop table if exists t1;
create table t1 select _ucs2 0x00410042;
select * from t1;
show create table t1;
hexdump -C t1.frm # only relevant line
00002150 04 00 04 e4 84 80 00 04 04 04 04 00 01 00 00 00 |................|
contains e4 84 80 as name:
select hex(convert( _utf8 0xe48480 using ucs2));
+------------------------------------------+
| hex(convert( _utf8 0xe48480 using ucs2)) |
+------------------------------------------+
| 4100 |
# agian, one leading zero missing, and then cut off at the end
Suggested fix:
if converting ucs2 int utf8 for a column name, do not drop the leading 00