Bug #58521 'current charset' is unclear
Submitted: 26 Nov 2010 14:12 Modified: 1 Jul 2011 18:40
Reporter: Peter Laursen (Basic Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:any OS:Any
Assigned to: Paul DuBois CPU Architecture:Any
Tags: qc

[26 Nov 2010 14:12] Peter Laursen
Description:
http://dev.mysql.com/doc/refman/5.1/en/identifiers.html

"The set of alphanumeric characters from the current character set ... are not special.

There is not a single current character set there are many (character_set_database, character_set_client etc, character used for (column storage etc)

How to repeat:
see above

Suggested fix:
Document *which* current character set matters in this context.  This is not updated 5.x documentation.
[26 Nov 2010 14:16] Valeriy Kravchuk
Thank you for the documentation request.
[26 Nov 2010 18:02] Peter Laursen
I suggest you do a global search in the doc for 'current charset'.  I am almost sure I reported same with another page earlier.

In this particular case I'd expect that it is character_set_connection that matters - but not perfectly sure!
[26 Nov 2010 21:53] Peter Laursen
hmmm... it does not seem to be character_set_connection that matters here!

SET NAMES utf8;
USE test; 

-- works
DROP TABLE IF EXISTS æøå;
CREATE TABLE æøå (id INT);
-- 

-- works
SET CHARACTER_SET_CONNECTION = latin1;
DROP TABLE IF EXISTS æøå;
CREATE TABLE æøå (id INT);
--

-- works
SET CHARACTER_SET_CONNECTION = latin2;
DROP TABLE IF EXISTS æøå;
CREATE TABLE æøå (id INT);
--

according to docs 'æøå' are special characters in latin2 (as latin2 does not have 'æøå'). But it works without `backticking`.

This is not an academic exercise.  On the opposite it is highly practical.  I/we am working on an application detail and we want to use `backticks` where they are necessary and also want to avoid them where they are not necessary.
[6 Dec 2010 13:45] Alexander Barkov
> The set of alphanumeric characters from the current character set,
> “_”, and “$” are not special.

This is a reminder from 4.0 times ^^^, when we had a single global character set.

Starting from 4.1 we convert identifiers to Unicode.
Starting from 5.1 we additionally allow characters '~', '/', '\\' in table
names.

As of version 5.1 and higher, it works as follows:

1. In quoted identifiers we accept full Unicode BMP range except U+0000.
- ASCII:    U+0001 .. 0x007F 
- Extended: U+0080 .. U+FFFF

Note, we don't allow trailing spaces:
create  table `t1 ` (a int);
ERROR 1103 (42000): Incorrect table name 't1 '

2. In non-quoted identifiers we accept:
- ASCII: [0-9,a-z,A-Z$_] -- Basic Latin letters, digits 0-9, dollar, underscore.
- Extended: U+0080 .. U+FFFF

We don't allow supplementary characters U+10000 and higher
neither in quoted, nor in non-quoted identifiers.
[1 Jul 2011 18:40] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

Updated text:

An identifier may be quoted or unquoted. If an identifier contains
special characters or is a reserved word, you must quote it whenever
you refer to it. (Exception: A reserved word that follows a period in
a qualified name must be an identifier, so it need not be quoted.)
Reserved words are listed at Section 8.3, “Reserved Words”.

Identifiers are converted to Unicode internally. They may contain
these characters:

* Permitted characters in unquoted identifiers:

  * ASCII: [0-9,a-z,A-Z$_] (basic Latin letters, digits 0-9, dollar,
    underscore)
  * Extended: U+0080 .. U+FFFF

* Permitted characters in quoted identifiers include the full Unicode
  Basic Multilingual Plane (BMP), except U+0000:

  * ASCII: U+0001 .. U+007F
  * Extended: U+0080 .. U+FFFF

* ASCII NUL (U+0000) and supplementary characters (U+10000 and higher)
  are not permitted in quoted or unquoted identifiers.

* Identifiers may begin with a digit but unless quoted may not consist
  solely of digits.

* Database, table, and column names cannot end with space characters.

* Before MySQL 5.1.6, database and table names cannot contain "/", "\",
  ".", or characters that are not permitted in file names.