Bug #8337 Data loss caused by missing characters in hebrew character set
Submitted: 5 Feb 2005 22:45 Modified: 21 Feb 2005 12:52
Reporter: shimon doodkin
Status: Not a Bug
Category: MySQL Server Severity: S3 (Non-critical)
Version: 4.1.9 OS: Microsoft Windows (winxp)
Assigned to: Alexander Barkov CPU Architecture: Any

[5 Feb 2005 22:45] shimon doodkin
Description:
I got wrong results:
I got question marks between the letters instead of the Hebrew punctuation signs.
hebrew.xml contains only the Hebrew letters, without the points, punctuation,
and accent signs:
#
Hebrew	אבגדהוזחטיךכלםמןנסעףפץצקרשת
##################################
(taken from languages.html)
I suspect there may be other mistakes as well,
because hebrew.xml is generated from languages.html.
My guess is that languages.html does not contain the complete Unicode language blocks,
and maybe that can be fixed.

How to repeat:
The string was:

נָכוֹן אוֹ לֹא נָכוֹן - הַיְלָדִים הִסְתַכְּלוּ

select CONVERT( CONVERT(0xF03FEBE53FEF20E0E53F20EC3FE020F03FEBE53FEF202D20E43FE93FEC3FE33FE9ED20E43FF13FFA3FEB3F3FECE53F USING binary ) USING hebrew ) 

You will get question marks between the letters.
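
The hex literal in the query can also be inspected outside MySQL. The following is only a minimal sketch, assuming that Python's `iso8859_8` codec maps these bytes the same way MySQL's `hebrew` character set does (not verified against a server). It shows that the bytes between the letters are already 0x3F, the ASCII question mark, so the marks were lost before the CONVERT call:

```python
# Hex literal copied from the query above.
HEX = ("F03FEBE53FEF20E0E53F20EC3FE020F03FEBE53FEF202D20"
       "E43FE93FEC3FE33FE9ED20E43FF13FFA3FEB3F3FECE53F")

raw = bytes.fromhex(HEX)

# Assumption: Python's iso8859_8 codec stands in for MySQL's `hebrew` charset.
text = raw.decode("iso8859_8")

# Count the bytes that are literally '?' (0x3F) in the input.
question_marks = raw.count(0x3F)

print(text)
print("bytes equal to 0x3F:", question_marks)
```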

Suggested fix:
http://www.fileformat.info/info/unicode/block/hebrew/index.htm

Add all characters that are in the Unicode Hebrew language block,

or
automate it somehow and create correct XML character maps for all languages.
[5 Feb 2005 22:50] shimon doodkin
Bug report in UTF-8

Attachment: hebrew in mysql unicode.txt (text/plain), 447 bytes.

[5 Feb 2005 22:52] shimon doodkin
Bug report in UTF-8, the correct one

Attachment: test_utf8.txt (text/plain), 1.01 KiB.

[16 Feb 2005 22:56] Michael Emeltchenkov
The same problem with Russian unicode encoding.
[21 Feb 2005 12:52] Alexander Barkov
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.mysql.com/documentation/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

Additional info:

In fact, neither Hebrew (aka iso-8859-8) nor Win Hebrew (cp1256)
supports the Hebrew punctuation marks you're talking about.

You can use utf8 or ucs2 columns to be able to store and retrieve these marks.
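
The difference between the two encodings can be illustrated outside MySQL. This is a sketch under stated assumptions: Python's `iso8859_8` codec is used as a stand-in for the `hebrew` character set, and `errors="replace"` is assumed to model the server's substitution of '?' for unrepresentable characters. The word is the first word of the reporter's string, written with escapes so the combining marks are explicit:

```python
# First word of the reported string: nun, qamats, kaf, vav, holam, final nun.
word = "\u05e0\u05b8\u05db\u05d5\u05b9\u05df"

# ISO 8859-8 has no code points for the vowel marks U+05B8 and U+05B9,
# so each one is replaced with '?' (0x3F) when encoding.
lossy = word.encode("iso8859_8", errors="replace")

# UTF-8 can represent every Unicode code point, so the round trip is exact.
round_trip = word.encode("utf-8").decode("utf-8")

print(lossy.hex().upper())
print(round_trip == word)
```

The lossy bytes correspond to the first six bytes of the hex literal in the original report.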
[23 Feb 2005 18:34] Michael Emeltchenkov
Mmm, I noticed that the Unicode character 'Ñ?' (a Russian character) displays as two '?' symbols. Maybe it's a PHP5 or Mozilla Firefox bug, not a MySQL 4.1.0 one?
[24 Feb 2005 21:20] Michael Emeltchenkov
Just changed to utf8_unicode_ci and all is ok now.