MySQL Bugs: #41044: Problem With Greek Collation

Bug #41044	Problem With Greek Collation
Submitted:	26 Nov 2008 8:21	Modified:	28 Nov 2008 14:04
Reporter:	Alexander K	Email Updates:
Status:	Not a Bug	Impact on me:	None
Category:	Connector / ODBC	Severity:	S2 (Serious)
Version:	5.1.5	OS:	Windows
Assigned to:		CPU Architecture:	Any
Tags:	data, Greek, ODBC

Description:
When trying to download data using ODBC connector 5.1.5 from MySQL server, the application (used to draw the data), crushes.

The collation of the field trying to download, must be any Greek.

Workaround: Use UTF8(Unicode) instead.

Drawback: Too slow when you have large databases with lots of records.

How to repeat:
Create a Table and add a field with Greek Collation. Add record (one is enough). Use an application to try and draw data (using a simple Select statement), through the ODBC connector.

Suggested fix:
On one of my computers, it simply works fine with no bugs. This computer is mine (programmer's) and I was not able to find out what my computer has that others don't. My computer is heavily loaded, 'cause I use it to test everything before send it to production. All updates and much more are installed. Also I went through all the MySQL servers and connectors.

Try to check the ODBC and how it handles the Greek Collation

Many thanks for writing a bug report.

We don't have a separate Greek collation so that I think that I misunderstood you.

For a deeper analysation we need the following from you:

1) Which MySQL server version do you use? (output from: SELECT VERSION())
2) Output from SHOW CREATE TABLE <your table>;
3) Output from SHOW VARIABLES LIKE '%char%';
4) Output from SHOW VARIABLES LIKE '%coll%';

5) From you Windows command output from: CHCP
This will show you the codepage that your Windows is using

6) How did you use ODBC? I mean: Did you use it for a programming language or did you use MS Access or something else?

Sorry, I was blind ... we have a Greek collation.

But that doesn't matter. For deeper analysation we need answers of all the questions before.

Hello Susanne...

I have the answers you asked for:

1) I'm using 2 different servers a) MySQL server version: 5.0.27 b) 5.0.51a-3ubuntu5.1-log (same problem on both)

2) CREATE TABLE `Test1` (
  `ID` int(11) NOT NULL default '0',
  `TypeChar` varchar(20) character set greek default NULL,
  PRIMARY KEY  (`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

(while trying to find what was going on I re-created the table using UTF8 Charset but when a field with Greek is found, it hangs. If UTF8 is used, its ok)

3) character_set_client	utf8
character_set_connection	utf8
character_set_database	latin1
character_set_filesystem	binary
character_set_results	utf8
character_set_server	latin1
character_set_system	utf8
character_sets_dir	/usr/share/mysql/charsets/

4)collation_connection	utf8_general_ci
collation_database	latin1_swedish_ci
collation_server	latin1_swedish_ci

5) 737

6) First I noticed it in MS Access. Then I tried from Python (through spss) (Both crushed). I used different DSN and connections, but in vain.

Hello Alexander,

you have a real nice mix there. This doesn't look like a bug.

"`TypeChar` varchar(20) character set greek default NULL,"

Here you define a varchar column with character set greek. The character set greek is ISO 8859-7 Greek. This means it is a 1 byte character set. The matching collation would be greek_general_ci.

You didn't set a collation here, so the system tries to use default collation:

") ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci"

utf8_unicode_ci is the matching collation for character set utf8.

The combination that you use: Character set greek and collation utf8_unicode_ci is not matching.

For resolve this problem you have two different possibilities:
1) Add greek collation to the column:

`TypeChar` varchar(20) character set greek collate greek_general_ci default NULL,

2) Change charset of column to utf8:

`TypeChar` varchar(20) character set utf8 collate utf8_unicode_ci default NULL,

By looking into this row:
ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

You see that utf8 settings are already the default settings so it is enough to say nothing at the column directly:

`TypeChar` varchar(20) default NULL,

For Greek I would recommend that you set all variables to utf8 just to not confuse the system with latin1.

1) character_set_database, collation_database
These are set during creating the database.
Here it is better when you say: CREATE DATABASE <yourdb> CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Or when you want to modify it:
ALTER DATABASE <db_name> CHARACTER SET utf8 COLLATE utf8_unicode_ci;

2) character_set_server, collation_server
here the best would be to define it in my.ini to utf8 and utf8_unicode_ci

Hello again Susanne 

I recreated the database and the table, using your guidance...
The new table is shown below:

SHOW CREATE TABLE Test1;

CREATE TABLE `Test1` (
  `ID` int(11) default NULL,
  `TypeChar` varchar(20) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=greek

I tried this procedure again, before contacting you. I know that the old table was a mess, because I was trying to find out what the problem was... and I messed it up (I made a notice on it). The problem is... that even with a newly (Hopefully created in the right way) database and the new table, the crushing still occurs. I hope I'm not wasting your time (Maybe it's something I did).

Thank you again...

Alexander

Hi Alexander,

CREATE TABLE `Test1` (
  `ID` int(11) default NULL,
  `TypeChar` varchar(20) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=greek

What collation is used here?

Because non collation is extra given here, system will use collation of the database.

collation_database	latin1_swedish_ci

That won't match in any case ... Greek is too different from Swedish.

You should add in your create statement:

) ENGINE=MyISAM DEFAULT CHARSET=greek COLLATE greek_general_ci;

This is the "create" statement I used...

CREATE TABLE `Test1` (
  `ID` int(11) default NULL,
  `TypeChar` varchar(20) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=greek COLLATE greek_general_ci;

Also... is it possible that using Toad might have something to do with it???