Bug #62077 Assign date/time type to user-defined variable changes type/charset/collation
Submitted: 3 Aug 2011 19:50 Modified: 5 Aug 2011 16:18
Reporter: tom worster Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: General Severity:S3 (Non-critical)
Version:5.5.11, 5.5.16 OS:Any
Assigned to: CPU Architecture:Any
Tags: charset, collation, regression, type, user variable, user-defined variable

[3 Aug 2011 19:50] tom worster
Description:
Similar to Bug #54668 

Assigning a value that has date or time type to a user defined variable changes the type (charset, collation?) Before assignment the value has charset 'binary', after it has the locale's (server, connection?) charset and collation.

Server v5.1 does not change type, server v5.5 does. The regression causes some queries to fail with error 1267 "Illegal mix of collations".

How to repeat:
Run this test script:

set names utf8;
select version();
select @a:=time('1:11'), charset(time('1:11')), charset(@a), collation(@a);
select @a:=sec_to_time(1), charset(sec_to_time(1)), charset(@a), collation(@a);
select @a:=date('11-01-01'), charset(date('11-01-01')), charset(@a), collation(@a);
select @a:=from_unixtime(1), charset(from_unixtime(1)), charset(@a), collation(@a);
select @a:=time('1:11'), @a < time('2:22');

Compare behavior between server versions 5.1 and 5.5.
[3 Aug 2011 19:55] tom worster
A test script to try with server versions 5.1 and 5.5

Attachment: test_script.sql (application/octet-stream, text), 402 bytes.

[3 Aug 2011 19:56] tom worster
Demonstration of the bug

Attachment: test_script_output.txt (text/plain), 4.15 KiB.

[4 Aug 2011 3:54] Valeriy Kravchuk
This is easy to verify:

macbook-pro:5.5 openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.16-debug Source distribution

Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> set names utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> select version();
+--------------+
| version()    |
+--------------+
| 5.5.16-debug |
+--------------+
1 row in set (0.00 sec)

mysql> select @a:=time('1:11'), charset(time('1:11')), charset(@a), collation(@a);
+------------------+-----------------------+-------------+-----------------+
| @a:=time('1:11') | charset(time('1:11')) | charset(@a) | collation(@a)   |
+------------------+-----------------------+-------------+-----------------+
| 01:11:00         | binary                | utf8        | utf8_general_ci |
+------------------+-----------------------+-------------+-----------------+
1 row in set (0.00 sec)

mysql> select @a:=sec_to_time(1), charset(sec_to_time(1)), charset(@a), collation(@a);
+--------------------+-------------------------+-------------+-----------------+
| @a:=sec_to_time(1) | charset(sec_to_time(1)) | charset(@a) | collation(@a)   |
+--------------------+-------------------------+-------------+-----------------+
| 00:00:01           | binary                  | utf8        | utf8_general_ci |
+--------------------+-------------------------+-------------+-----------------+
1 row in set (0.00 sec)

mysql> select @a:=date('11-01-01'), charset(date('11-01-01')), charset(@a), collation(@a);
+----------------------+---------------------------+-------------+-----------------+
| @a:=date('11-01-01') | charset(date('11-01-01')) | charset(@a) | collation(@a)   |
+----------------------+---------------------------+-------------+-----------------+
| 2011-01-01           | binary                    | utf8        | utf8_general_ci |
+----------------------+---------------------------+-------------+-----------------+
1 row in set (0.00 sec)

mysql> select @a:=from_unixtime(1), charset(from_unixtime(1)), charset(@a), collation(@a);
+----------------------+---------------------------+-------------+-----------------+
| @a:=from_unixtime(1) | charset(from_unixtime(1)) | charset(@a) | collation(@a)   |
+----------------------+---------------------------+-------------+-----------------+
| 1970-01-01 03:00:01  | binary                    | utf8        | utf8_general_ci |
+----------------------+---------------------------+-------------+-----------------+
1 row in set (0.01 sec)

Even though in 5.1 we will see binary everywhere, this was an intended change in behavior at early stage of 5.5 to use connection charset in these cases (see that old bug report), so no bug here yet. 

But the following:

mysql> select @a:=time('1:11'), @a < time('2:22');
ERROR 1267 (HY000): Illegal mix of collations (utf8_general_ci,IMPLICIT) and (latin1_swedish_ci,NUMERIC) for operation '<'
mysql> show variables like 'char%';
+--------------------------+---------------------------------------+
| Variable_name            | Value                                 |
+--------------------------+---------------------------------------+
| character_set_client     | utf8                                  |
| character_set_connection | utf8                                  |
| character_set_database   | latin1                                |
| character_set_filesystem | binary                                |
| character_set_results    | utf8                                  |
| character_set_server     | latin1                                |
| character_set_system     | utf8                                  |
| character_sets_dir       | /Users/openxs/dbs/5.5/share/charsets/ |
+--------------------------+---------------------------------------+
8 rows in set (0.00 sec)

looks like a bug, as I do not see any clear reason why latin1 was used. One would expect utf8 or binary, but not latin1.
[4 Aug 2011 11:59] tom worster
Hi Valeriy Kravchuk, when you said: "this was an intended change in behavior at early stage of 5.5 to use connection charset in these cases (see that old bug report)" could you point me to that bug?

Should I conclude that I need to rewrite all queries that compare a date, time or datetime-typed valued with a user-defined variable? Specifically, do I need to wrap a conversion function around each read of a user-defined variable?
[4 Aug 2011 12:34] Valeriy Kravchuk
I mean http://forge.mysql.com/worklog/task.php?id=2649 mentioned in the bug you mentioned, bug #54668. There I read:

"So let us extend the principle as follows:
If datetime-to-string conversion causes VARBINARY result,
and numeric-to-string conversion would cause VARBINARY result,
and non-conversion would not cause VARBINARY result,
and the function is not listed among "Exceptions" above,
then result should have character_set_connection, collation_connection.

This will not cause difficulty because the characters that might
appear in DATE or TIME or DATETIME or TIMESTAMP values, i.e.
- / . : + 0-9 A-Z, are in all character sets. The assumption
is that date values will never contain a (possibly Japanese)
month name."

I think the above should apply to your case, and, thus, I consider latin1 example a bug in 5.5.
[4 Aug 2011 13:14] tom worster
Thank you! To check my understanding: the new (5.5) expected behavior is that a DATETIME (etc.) value T is converted to a string with charset_connection and collation_connection when:

(1) T is assigned to a user-defined variable @a,

(2) T is evaluated in the context of comparison with a string.

So the comparison @a < time('2:22') should be ok (because of the nature of date and time strings in MySQL).

What's not clear to me is if, in evaluating @a < time('2:22'), the problem is on the left or on the right of the <. In other words, is 5.5 incorrectly performing (2) and converting time('2:22') to latin1? or is it changing @a's charset when it evaluates @a in the context of the comparison?
[5 Aug 2011 16:18] tom worster
Additional evidence possibly related:

FAILS with ERROR 1267:
select @a := time('2:22'), @a < time('2:23');
select @a := timediff(2,1), @a < time('2:23');
select @a := sec_to_time(123), @a < time('2:34');
select adddate(1, 1) < date('12-1-3'); -- should be NULL 
select date_add(1, 1) < date('12-1-3'); -- should be NULL 
select date_add(1, interval 1 day) < date('12-1-3'); -- should be NULL
select date_sub(1, interval 1 day) < date('12-1-3'); -- should be NULL

WORKS as expected:
select @a := date('12-1-2'), @a < date('12-1-3');
select @a := timestamp('12-1-2'), @a < timestamp('12-1-3');
select @a := curdate(), @a < date('12-1-3');
select @a := from_unixtime(0), @a < timestamp('1970-01-02');
select timestamp(@x) < timestamp('12-01-03');
select timestamp(1) < timestamp('12-01-03');
select addtime('', 1) < time('2:22');
select timediff(2,1) < time('2:23');
select sec_to_time(123) < time('2:34');

The pattern seems to be:

Times assigned to a user-var that is then then used in a comparison.

Date functions evaluating to NULL in a comparison.
[20 Apr 2012 0:55] Mark Raichuk
Also, if you execute the following statements, the character set and collation are different depending on how the user defined variable is set (latin1 if you use "SET @a..." and utf8 if you use "SELECT @a:=..."):

mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.5.23    |
+-----------+
1 row in set (0.00 sec)

mysql> SET @a = DATE('11-01-01');
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT CHARSET(DATE('11-01-01')), CHARSET(@a), COLLATION(@a);
+---------------------------+-------------+-------------------+
| CHARSET(DATE('11-01-01')) | CHARSET(@a) | COLLATION(@a)     |
+---------------------------+-------------+-------------------+
| binary                    | latin1      | latin1_swedish_ci |
+---------------------------+-------------+-------------------+
1 row in set (0.00 sec)

mysql> SELECT @a:=DATE('11-01-01'), CHARSET(DATE('11-01-01')), CHARSET(@a), COLLATION(@a);
+----------------------+---------------------------+-------------+-----------------+
| @a:=DATE('11-01-01') | CHARSET(DATE('11-01-01')) | CHARSET(@a) | COLLATION(@a)   |
+----------------------+---------------------------+-------------+-----------------+
| 2011-01-01           | binary                    | utf8        | utf8_general_ci |
+----------------------+---------------------------+-------------+-----------------+
1 row in set (0.00 sec)

Has this been confirmed to be a bug?
[28 May 2014 9:24] Erlend Dahl
Bug#72753 mysql 5.6.17 user defined variables charset problem

was marked as a duplicate.