Bug #27822 Wrong Character Position Calculation in Query Area
Submitted: 14 Apr 2007 14:42 Modified: 21 Feb 2009 4:35
Reporter: Sam Fur
Status: Won't fix
Category:MySQL Query Browser Severity:S2 (Serious)
Version:1.2.11 OS:Windows (WinXP Pro SP2)
Assigned to: Mike Lischke CPU Architecture:Any
Tags: corruption, Cursor Position, Edit, Multi-byte Character, Mysql Query Browser, Wrong Highlight

[14 Apr 2007 14:42] Sam Fur
Description:
Query Browser calculates character positions in the Query Area purely by byte count. When an SQL statement contains multi-byte characters, a character's position no longer matches its byte offset from the start of the statement. This causes the following problems:

1 : Corrupted character display
    Characters, particularly multi-byte ones, are corrupted in their display.

2 : Wrong highlight of SQL key words
    SQL keywords such as "from", "where", "order" etc are not highlighted 
    properly after multi-byte characters. See How to repeat.

3 : Data Corruption while editing SQL Statement
    The cursor may be placed in the middle of a multi-byte character. Any
    subsequent edit then corrupts the SQL statement.

Moving the cursor with the arrow keys, or pressing the backspace or delete 
key, easily reproduces the problems. Cutting and pasting multi-byte 
characters, as well as keyboard input of DBCS characters, also triggers them.
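
To illustrate problem 3, here is a minimal Python sketch (not Query Browser's actual code) of what happens when a backspace is applied at a byte-based cursor position that falls inside a double-byte character. The function names and the use of Shift-JIS are assumptions made for illustration only.

```python
def backspace_at_byte(text: str, byte_pos: int, encoding: str = "shift_jis") -> bytes:
    """Delete the single byte left of a byte-based cursor (the buggy behavior)."""
    raw = text.encode(encoding)
    return raw[:byte_pos - 1] + raw[byte_pos:]


def is_valid(raw: bytes, encoding: str = "shift_jis") -> bool:
    """Return True if the bytes still decode cleanly in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def char_boundaries(text: str, encoding: str = "shift_jis") -> list:
    """Byte offsets at which a cursor may legally sit (between whole characters)."""
    offsets = [0]
    for ch in text:
        offsets.append(offsets[-1] + len(ch.encode(encoding)))
    return offsets


text = "as 壱 from"
# Byte offset 4 lies between the two bytes of 壱, so it is not a legal cursor position.
print(4 in char_boundaries(text))            # False
print(is_valid(backspace_at_byte(text, 4)))  # False: half a character is left behind
print(is_valid(backspace_at_byte(text, 3)))  # True: a whole ASCII space was removed
```

An editor that only lets the cursor rest on offsets returned by something like `char_boundaries` cannot split a character this way.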

Note: Problem occurs only in Query Area. Multi-byte characters are properly 
input/displayed/edited in Resultset Area (Edit) or Object Browser (Search). 

How to repeat:

1 : Enter "select abcd as 壱 from tab_xyz order by qwert ;" in Query Area.
    壱 is a double-byte character (character set sjis).
    The highlighting is shifted to the right by one byte: "rom", "rder"
    and "y" are highlighted. ("from" starts at the 18th character position,
    but at the 19th byte position.)

2 : Enter "select abcd as 壱壱 from tab_xyz order by qwert ;"
    The highlighting is shifted to the right by two bytes: "om t", "der b"
    and "q" are highlighted.
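
The shift seen in both steps can be modeled with a short Python sketch. It assumes (this is a guess at the mechanism, not the actual Query Browser code) that keywords are located by byte offset in the Shift-JIS text and that those offsets are then applied as character indices. The keyword list and the function name are hypothetical.

```python
import re

# Hypothetical keyword list; the real highlighter's keyword set is not shown in this report.
KEYWORDS = re.compile(rb"\b(select|as|from|order|by)\b")


def misplaced_highlights(sql: str, encoding: str = "shift_jis") -> list:
    """Locate keywords by byte offset, then apply those offsets as character indices."""
    raw = sql.encode(encoding)
    return [sql[m.start():m.end()] for m in KEYWORDS.finditer(raw)]


print(misplaced_highlights("select abcd as 壱 from tab_xyz order by qwert ;"))
# ['select', 'as', 'rom ', 'rder ', 'y ']
print(misplaced_highlights("select abcd as 壱壱 from tab_xyz order by qwert ;"))
# ['select', 'as', 'om t', 'der b', ' q']
```

Keywords before the first double-byte character are highlighted correctly; every keyword after it drifts right by one position per extra byte.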

Suggested fix:
1: Correct the cursor position calculation.

2: Input control of Query Area.
   This is only a possibility. The input control used for the Query Area 
   seems to ignore the invocation of the language processor (for input of
   multi-byte characters) and stays in single-byte input mode. The language 
   bar of WinXP shows no sign of DBCS mode even when Japanese characters are 
   placed in the Query Area, but it does show DBCS mode when the Resultset 
   Area is edited. This theory (multi-byte mode being ignored) would also 
   explain the problems described above.
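
As an illustration of suggested fix 1, the following Python sketch converts each byte offset into a character index before using it. It is a sketch under assumptions, not the fix applied in any MySQL product: the keyword list, `byte_to_char_index` and `highlights` are hypothetical names, and Shift-JIS is assumed as the encoding.

```python
import re

# Hypothetical keyword list, for illustration only.
KEYWORDS = re.compile(rb"\b(select|as|from|order|by)\b")


def byte_to_char_index(raw: bytes, byte_offset: int, encoding: str) -> int:
    """Count the characters that precede a byte offset lying on a character boundary."""
    return len(raw[:byte_offset].decode(encoding))


def highlights(sql: str, encoding: str = "shift_jis") -> list:
    """Return (character index, text) pairs for each keyword found in the encoded bytes."""
    raw = sql.encode(encoding)
    spans = []
    for m in KEYWORDS.finditer(raw):
        start = byte_to_char_index(raw, m.start(), encoding)
        end = byte_to_char_index(raw, m.end(), encoding)
        spans.append((start, sql[start:end]))
    return spans


print(highlights("select abcd as 壱 from tab_xyz order by qwert ;"))
# [(0, 'select'), (12, 'as'), (17, 'from'), (30, 'order'), (36, 'by')]
```

The 0-based index 17 is the 18th character position mentioned under "How to repeat". Decoding a prefix for every match is quadratic in the worst case; a real editor would more likely keep a running byte-to-character map.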
[16 Apr 2007 14:01] MySQL Verification Team
Thank you for the bug report. I was able to repeat the "from" keyword
highlighting issue on Vista.
[9 Oct 2008 19:59] Shabeeb Rizvi
Has this been fixed, or is there any way to fix it? The solution mentioned went over my head. Can anyone explain in plain English what has to be done to fix the problem in practice?
[10 Oct 2008 4:54] Sam Fur
Nice to hear guys talking about this bug.
Hope the bug status turns to "In-progress" after 
a year and a half of hibernation. :)

Simply put, internationalize Query Area to fix 
the bug. What to do:
i) Support of MBCS in Query Area
MBCS text (e.g. UTF-8, Shift-JIS) must be entered, 
displayed, and edited correctly there.
ii) Front-end Processor
Allow a front-end processor, such as MS IME, to
kick in for MBCS input.

For MBCS handling, see the source code of the Resultset 
Area, where these character sets are handled properly.

Sam
[16 Feb 2009 10:43] Mike Lischke
This issue is only a cosmetic one, although annoying. I don't see the behavior described here with the given sample text, but I know that the syntax highlighter used in the editor does not support Unicode (one of the very few pieces in QB which does not). However, since the editor itself does support Unicode, your query will be sent to the server correctly; only the display is irritating. This issue will be fixed in MySQL Workbench 5.2, where we will include the most important features of QB.
[17 Feb 2009 16:20] Sam Fur
Mike, 

It's really good to hear that the bug is going to get fixed.
At the same time, it's sad to know that the bug-fix is made 
only on Workbench not on QB which is sidelined from the main
stream of MySQL product line-up. 

But for the correct bug fix, allow me to remind that the bug
is not just a cosmetic one but it DOES corrupt SQL statement.

Here is a case that reproduces the corruption.

Select address as 住所, unitprice as 地価 from lptbl
where address like '%世田谷%'
order by unitprice desc limit 10;

Above statement in the Query Area works although it looks 
cosmetically ugly. Some part of the statement is invisible.

Left-click on "from" on the top line. And press down-arrow key
to move cursor down to the bottom line.

The cursor is placed somewhere in "limit," not at the end of the
statement. 

Press left-arrow key further to place cursor on the right of
"unitprice." Repeat BS key to erase "unitprice." The cursor 
should be between "order by" and "desc." Enter alias 地価 there.

The alias, however, ends up in the wrong place. The last line 
then looks like this:

order by  desc lim地価it 10;

Corruption is easily noticeable this time since it happened in the 
visible part of the statement. Sometimes corruption occurs in
the invisible part, especially in a long SQL statement. Then 
there is no way to know what happened and where.

Running environments
Windows Vista 32bit Home Premium 
MS IME (Japanese)
QueryBrowser : Version 1.2.13

Sam
[20 Feb 2009 14:58] Mike Lischke
Hey Sam,

that should definitely not happen. Please check whether you still see this behavior with the latest version. The syntax highlighter should not cut any text or anything like that.
[20 Feb 2009 16:28] Sam Fur
Mike,

Your quick response is appreciated.

I tried the reproduction steps in the two environments below:
QB version 1.2.14 on WinXP Home SP3
QB version 1.2.16 on WinVista Home Premium 32bit

Wrong chars are highlighted and 地価 butts into
"limit" in both cases.

Is 1.2.16 not the latest version?

Sam
[20 Feb 2009 17:02] Mike Lischke
Query Area in MySQL Query Browser with Chinese characters.

Attachment: Chinese chars in QB.png (image/png, text), 32.14 KiB.

[20 Feb 2009 17:06] Mike Lischke
Sam, I wonder why you see that. I cannot reproduce what you wrote on my Vista 64 Business installation. Please see the image I uploaded. Apart from the somewhat cramped full-width chars in the LIKE string, it looks ok to me. This is 1.2.16.

But anyway, this will definitely work properly in WB once it is finished there.

Mike
[21 Feb 2009 4:35] Sam Fur
Mike,

Thanks for the screen shot which tells why the bug is not reproduced.

Replace "unitprice" with "地価" only in the bottom line. When it is replaced with "alias 地価", these two words stand right there as shown in your screen shot.

Sorry if my reproducing steps are misleading.

I hope everything works fine in WB.

Sam