Bug #21085 database collate set overriden by specifying only character set on column
Submitted: 16 Jul 2006 21:33 Modified: 27 Jul 2006 6:14
Reporter: john danilson Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:5.0.25_BK, 5.0.21 OS:Linux (Red Hat Enterprise Linux ES rele)
Assigned to: Alexander Barkov CPU Architecture:Any
Tags: collate, duplicate, insert

[16 Jul 2006 21:33] john danilson
Description:
getting duplicate key on insert of value in case sensitive column for values a and A.  Duplicate should not occur.  The problem seems to be that the collate set is being overriden if character set is specified for the column.  

The database character set and collate are defined as utf8 and utf8_bin as shown from the create database statement:

mysql> show create database community;
+-----------+-------------------------------------------------------------------------------------+
| Database  | Create Database                                                                     |
+-----------+-------------------------------------------------------------------------------------+
| community | CREATE DATABASE `community` /*!40100 DEFAULT CHARACTER SET utf8 COLLATE utf8_bin */ | 
+-----------+-------------------------------------------------------------------------------------+

Given this, I would assert that any column would default to collate utf8_bin, which is not the case, iff a collation set is given.  Here are four cases below:

# first case. default db character set and default collate. 
drop table if exists foo;
create table foo (a char(10) not null,
primary key (a));
insert foo values ('a');
insert foo values ('A');

# specified character set and collation
drop table if exists foo;
create table foo (a char(10) character set utf8 collate utf8_bin not null,
primary key (a));
insert foo values ('a');
insert foo values ('A');

# only specify collate.
drop table if exists foo;
create table foo (a char(10) collate utf8_bin not null,
primary key (a));
insert foo values ('a');
insert foo values ('A');

# only specify character set. second insert fails duplicate key
drop table if exists foo;
create table foo (a char(10) character set utf8 not null,
primary key (a));
insert foo values ('a');
insert foo values ('A');

All of the inserts succeed as expected except the last one where only a character set is specified and I would assume, the collate defaults.  However it must be defaulitng to something else. 

This might be similiar to or a sub set of bug 4959.

How to repeat:
# first case. default db character set and default collate. 
drop table if exists foo;
create table foo (a char(10) not null,
primary key (a));
insert foo values ('a');
insert foo values ('A');

# specified character set and collation
drop table if exists foo;
create table foo (a char(10) character set utf8 collate utf8_bin not null,
primary key (a));
insert foo values ('a');
insert foo values ('A');

# only specify collate.
drop table if exists foo;
create table foo (a char(10) collate utf8_bin not null,
primary key (a));
insert foo values ('a');
insert foo values ('A');

# only specify character set. second insert fails duplicate key
drop table if exists foo;
create table foo (a char(10) character set utf8 not null,
primary key (a));
insert foo values ('a');
insert foo values ('A');

Suggested fix:
if neither character set nor collate are defined; default to db setting.
if character set is defined alone, default to collate from the db.
if collare is specified, default to db character set.
if both are defined, use them.
[17 Jul 2006 11:45] Valeriy Kravchuk
Thank you for a bug report. Verified just as described with 5.0.25-BK on Linux:

openxs@suse:~/dbs/5.0> bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1 to server version: 5.0.25

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create database community default character set utf8 collate utf8_bin;
Query OK, 1 row affected (0.00 sec)

mysql> use community;
Database changed
mysql> drop table if exists foo;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> create table foo (a char(10) not null,
    -> primary key (a));
 vQuery OK, 0 rows affected (0.02 sec)

amysql> insert foo values ('a');
Query OK, 1 row affected (0.00 sec)

imysql> insert foo values ('A');
Query OK, 1 row affected (0.00 sec)

mysql> show create table foo\G
*************************** 1. row ***************************
       Table: foo
Create Table: CREATE TABLE `foo` (
  `a` char(10) collate utf8_bin NOT NULL,
  PRIMARY KEY  (`a`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin
1 row in set (0.00 sec)

mysql> drop table if exists foo;
cQuery OK, 0 rows affected (0.00 sec)

mysql> create table foo (a char(10) character set utf8 not null,
    -> primary key (a));
luQuery OK, 0 rows affected (0.02 sec)

mysql> insert foo values ('a');
Query OK, 1 row affected (0.00 sec)

imysql> insert foo values ('A');
ERROR 1062 (23000): Duplicate entry 'A' for key 1
mysql> show create table foo\G
*************************** 1. row ***************************
       Table: foo
Create Table: CREATE TABLE `foo` (
  `a` char(10) character set utf8 NOT NULL,
  PRIMARY KEY  (`a`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin
1 row in set (0.12 sec)
[27 Jul 2006 6:14] Alexander Barkov
I agree with Peter. This is expected behaviour,
written in the manual.
[3 Mar 2009 6:38] Susanne Ebrecht
Bug #43307 is a duplicate of this bug here.