Bug #21599 Suggestion to add ISO-8859-1 charset
Submitted: 11 Aug 2006 22:08 Modified: 28 Aug 2006 11:27
Reporter: Martin Stjernholm
Status: Verified
Category: MySQL Server: Charsets Severity: S4 (Feature request)
Version: 5.1 OS: Any
Assigned to: Assigned Account CPU Architecture: Any
Tags: character set

[11 Aug 2006 22:08] Martin Stjernholm
Description:
When sending queries to a MySQL server from a "Unicode environment", an interesting optimization would be possible if the ISO-8859-1 charset were supported.

By "Unicode environment" I mean an environment where Unicode strings are stored either as eight-bit strings (if they contain no code point above U+00FF) or as sixteen-bit UTF-16 strings (otherwise), and the choice of representation is made automatically and transparently to the user.

A user in such an environment would typically use the charset utf8 as much as possible in the MySQL server.

In this case, one could avoid the overhead of UTF-8 encoding SQL queries that are eight-bit strings (the vast majority of all queries). Instead of always using character_set_client=utf8, the MySQL interface glue in this environment could quickly detect that the query is a narrow string, do "SET character_set_client=iso88591", and send the query as-is. If a wide-string query is encountered later, it would automatically switch back to utf8 and UTF-8 encode that query.
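
The switching logic described above can be sketched roughly as follows. This is only an illustration, not code from any real MySQL client: the function name `prepare_query` is made up, the charset name `iso88591` is the one proposed in this report and does not exist in the server, and the per-character scan stands in for the constant-time width check such an environment would have. Python's `latin-1` codec is used because it is a straight one-to-one mapping for U+0000..U+00FF.

```python
from typing import List, Optional, Tuple

# Hypothetical charset name from this report; not an existing MySQL charset.
NARROW_CHARSET = "iso88591"
WIDE_CHARSET = "utf8"


def prepare_query(query: str, current: Optional[str]) -> Tuple[List[bytes], str]:
    """Return the byte strings to send, in order, and the charset now in effect."""
    if all(ord(ch) <= 0xFF for ch in query):
        # Narrow string: each code point becomes the byte with the same value.
        wanted = NARROW_CHARSET
        payload = query.encode("latin-1")
    else:
        # Wide string: fall back to UTF-8 encoding.
        wanted = WIDE_CHARSET
        payload = query.encode("utf-8")

    to_send = []
    if wanted != current:
        # Only switch character_set_client when it actually changes.
        to_send.append(f"SET character_set_client={wanted}".encode("ascii"))
    to_send.append(payload)
    return to_send, wanted
```

Since the SET statement is only emitted when the representation changes, a run of narrow queries would cost one extra statement in total rather than one per query.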

This approach almost works with the latin1 character set, but not quite, since latin1 deviates from a straight one-to-one Unicode mapping in the range 0x80..0x9f. Having to search through all narrow-string queries for these fairly uncommon characters makes the optimization significantly less attractive.
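
To illustrate the extra work, here is a minimal sketch of the scan that would be needed before a narrow query could be sent as-is under latin1. The helper name `safe_for_latin1` is hypothetical. The comparison at the end uses Python's `cp1252` codec as a stand-in for the server's latin1, on the understanding that MySQL's latin1 follows Windows-1252 in this range; the two are not identical for every byte, so treat it as an approximation.

```python
import re

# Code points U+0080..U+009F, where latin1 is not a one-to-one Unicode mapping.
_C1_RANGE = re.compile("[\x80-\x9f]")


def safe_for_latin1(narrow_query: str) -> bool:
    """True if a narrow (<= U+00FF) query contains no code point in the
    problematic range and could therefore be sent byte-for-byte."""
    return _C1_RANGE.search(narrow_query) is None


# The same byte means different things under the two mappings:
as_cp1252 = b"\x80".decode("cp1252")    # the euro sign, U+20AC
as_iso8859 = b"\x80".decode("latin-1")  # the control character U+0080
```

With a true ISO-8859-1 charset on the server side this scan would not be needed at all, which is the point of the request.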

How to repeat:
n/a

Suggested fix:
Please add an eight-bit character set that is a one-to-one mapping to Unicode in the range U+0000..U+00FF.

[28 Aug 2006 11:27] Valeriy Kravchuk
Thank you for a reasonable feature request.