Bug #21290 Alter database charset ... failed
Submitted: 26 Jul 2006 7:51 Modified: 24 Aug 2006 17:10
Reporter: KimSeong Loh (Candidate Quality Contributor) Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Parser Severity:S3 (Non-critical)
Version:4.1 and 5.0 OS:Windows (windows)
Assigned to: Marc ALFF CPU Architecture:Any
Tags: alter database, charset

[26 Jul 2006 7:51] KimSeong Loh
Description:
Alter database charset utf8  failed while
Alter database character set utf8   works fine

The reason is that the database name is not provided and it will use alter the default database, and charset is not a reserved word, so it is treated as the database name instead of the keyword charset to change the character set.

How to repeat:
mysql> use test;
Database changed
mysql> alter database charset utf8;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use near 'utf8'
 at line 1

mysql> alter database character set utf8;
Query OK, 1 row affected (0.00 sec)

mysql> alter database test charset utf8;
Query OK, 1 row affected (0.00 sec)
[26 Jul 2006 12:45] MySQL Verification Team
Thank you for the bug report. In the chapter:

http://dev.mysql.com/doc/refman/5.0/en/alter-database.html

stands that the alter_specification is:

[DEFAULT] CHARACTER SET charset_name

and the database name can be omitted, but the syntax:

alter database charset utf8;

isn't mentioned there and in this case the database name should be
used. Changing to Documentation bug.
[26 Jul 2006 13:43] Stefan Hinz
I cannot see why this should be a bug. "ALTER DATABASE <dbname> UTF8" is wrong syntax, no matter what <dbname> is used, or if it's omitted. "ALTER DATABASE charset CHARACTER SET UTF8" works, just as written in the docs. Closing.
[26 Jul 2006 14:04] KimSeong Loh
Note that 
ALTER DATABASE <dbname> CHARSET utf8   works fine, so I assume CHARSET can be used as a synonym for CHARACTER SET, even though it is not documented explicitly. The synonym is documented in the CREATE TABLE syntax.

http://dev.mysql.com/doc/refman/5.0/en/alter-database.html
And the doc mention that <dbname> can be omitted and it will change the default database, therefore,
ALTER DATABASE CHARACTER SET utf8   works and change the current database
However,
ALTER DATABASE CHARSET utf8    does not work, and treats CHARSET as the database name. So, the synonym does not work when the <dbname> is omitted. My guess is that CHARSET is not a reserved word, therefore it is treated as the database name instead of a synonym to CHARACTER SET, while CHARACTER is a reserved word.
[27 Jul 2006 8:00] Stefan Hinz
Sorry for closing this bug in the first place. Yes, it's a bug, because CHARSET is a synonynom for CHARACTER SET. But it's not a documentation bug, but rather a parser bug. Changing back to Open, and changing the category to Server:Parser.
[24 Aug 2006 17:10] Marc ALFF
After investigating further, please find here the result of the analysis. 

'CHARACTER' is a SQL reserved keyword.
'SET' is a SQL reserved keyword.
The official SQL syntax for character sets is 'CHARACTER' 'SET'.

MySQL also uses an alternate syntax, by introducing the 'CHARSET' keyword.
However, 'CHARSET' is *not* a *reserved* keyword,
meaning it can also be used to name objects (databases, tables, etc).

In *most* but not *all* cases, MySQL accepts the 'CHARSET' notation in place of
'CHARACTER' 'SET', as this has some limitations.

The limitations are apparent when 'CHARSET' can be interpreted
as the name of an object, which is what the parser does when reading :

alter database charset ...

While naming a database 'CHARSET' is certainly not recommended,
it's legal, and the parser is making the correct choice here.

One possible way to force the parser to accept the syntax reported
would be to make 'CHARSET' a *reserved* keyword.

While the syntax "alter database charset utf8" would be accepted,
making such a change has some severe consequences :

1)
Every user code that currently uses 'charset' as a database name,
table name, variable name, etc will be broken by the change.

For example, the following code is (not recommended but) legal,
and would be broken :

create procedure p()
begin
  declare charset int;
  select 1 into charset;
end//
 
2)
The 'SHOW CHARACTER SET' command, which displays the content
of INFORMATION_SCHEMA.CHARACTER_SETS, would be affected.

Currently, the syntax :

SHOW CHARACTER SET WHERE charset like 'latin1%';

is accepted, since charset is an alias for the column
INFORMATION_SCHEMA.CHARACTER_SET_NAME

Making charset a reserved keyword would have some impact on the information schema itself.

---

After careful consideration of the issue reported, the conclusion is as follows:

The current behavior of the code for ALTER DATABASE is correct, and as designed.

The current documentation for ALTER DATABASE is correct in not documenting the
alternate charset syntax, as this syntax is not fully supported :
- alter database charset utf8 : not supported, as reported
- alter database default charset utf8 : supported
- alter database <name> charset utf8 : supported
- alter database <name> default charset utf8 : supported

Since the documentation does not document the ALTER DATABASE charset syntax,
there is no expectation that this syntax should work.

While I totally agree that the alternate 'CHARSET' syntax is not a replacement
for the 'CHARACTER' 'SET' for *every* command,
which makes the use of 'CHARSET' versus 'CHARACTER' 'SET'
not completely orthogonal to the command used as it should be,
I believe the current design and implementation to be the most prudent choice,
as it prevents to break legal user code.

The current behavior is correct in that sense and should be preserved,
even at the cost of a slight orthogonality irregularity with convenience syntaxes.

As a result, this report is now closed as 'Not a bug'.
Thanks again for documenting it.

Regards,
Marc Alff.