Bug #115925 Strings in subquery cause Illegal mix of collations
Submitted: 25 Aug 16:38 Modified: 26 Aug 6:22
Reporter: Alan Guo Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version: OS:Any
Assigned to: CPU Architecture:Any

[25 Aug 16:38] Alan Guo
Description:
When union to query, string in subquery may cause Illegal mix of collations

mysql> show variables like '%char%';
+-------------------------------------------------+--------------------------------+
| Variable_name                                   | Value                          |
+-------------------------------------------------+--------------------------------+
| character_set_client                            | utf8mb4                        |
| character_set_connection                        | utf8mb4                        |
| character_set_database                          | utf8mb4                        |
| character_set_filesystem                        | binary                         |
| character_set_results                           | utf8mb4                        |
| character_set_server                            | utf8mb4                        |
| character_set_system                            | utf8mb3                        |
| character_sets_dir                              | /usr/share/mysql-8.0/charsets/ |
| validate_password.changed_characters_percentage | 0                              |
| validate_password.special_char_count            | 1                              |
+-------------------------------------------------+--------------------------------+
10 rows in set (0.00 sec)

mysql> select cola from
    -> ((select cola from sub1)
    -> UNION
    -> (select '' cola where 1=0)
    -> ) as dt ;
Empty set (0.00 sec)

mysql> select cola from
    -> ((select cola from sub1)
    -> UNION
    -> (select cola from (select '' cola where 1=0) sub2
    -> )) as dt ;
ERROR 1267 (HY000): Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8mb4_0900_ai_ci,COERCIBLE) for operation 'UNION'

If a table column uses the utf8mb4 character set, that column will have the same behavior in regular queries and subqueries.

mysql> CREATE TABLE `s2` (
    ->   `cola` text CHARACTER SET utf8mb4
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin5;
Query OK, 0 rows affected (0.02 sec)

mysql> select cola from
    -> ((select cola from sub1)
    -> UNION
    -> (select cola from s2)
    -> ) as dt ;
Empty set (0.00 sec)

mysql> select cola from
    -> ((select cola from sub1)
    -> UNION
    -> (select cola from (select cola from s2) sub2)
    -> ) as dt ;
Empty set (0.00 sec)

How to repeat:
Client Character set: utf8mb4

CREATE TABLE `sub1` (
  `cola` text CHARACTER SET latin1 COLLATE latin1_swedish_ci
) ENGINE=InnoDB DEFAULT CHARSET=latin5;

following query can run successfully:
select cola from
((select cola from sub1)
UNION
(select '' cola where 1=0) 
) as dt ;

Following query will fail with  Illegal mix of collations

select cola from
((select cola from sub1)
UNION
(select cola from (select '' cola where 1=0) sub2
)) as dt ;

ERROR 1267 (HY000): Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8mb4_0900_ai_ci,COERCIBLE) for operation 'UNION'

In MySQL 5.7, both query can run successfully, In 8.0.39, 8.4.2, 9.0.1, the second qury will fail
[26 Aug 6:22] MySQL Verification Team
Hello Alan Guo,

Thank you for the report and test case.
Confirmed internally that this is duplicate of Bug #108627 - Literal value with different character set from derived table used in UNION: err (this is currently marked as private hence it is not accessible) which is fixed in upcoming MySQL 9.1.0.  

Quoting the change log for your reference:

Documented fix in the MySQL 9.1.0 changelog as follows:

    A character string literal value selected through a derived
    table and matched with a column with a different character set
    through a UNION operation sometimes raised
    ER_CANT_AGGREGATE_2COLLATIONS (Illegal mix of collations).

Closed.

regards,
Umesh