Bug #31744 UNION and implicit CAST causes character conversion problem
Submitted: 22 Oct 2007 9:17 Modified: 19 Dec 2007 13:39
Reporter: Martin Kirchner Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / J Severity:S2 (Serious)
Version:5.0.6 OS:Any
Assigned to: CPU Architecture:Any

[22 Oct 2007 9:17] Martin Kirchner
Description:
Hello!

It seems that there is a bug in the implicit cast of a UNION statement if the fields in the statement differ in their types.

I have a query with a UNION over two tables like this:

SELECT DISTINCT CHRISTIANNAME FROM m1.ADDRESS
UNION ALL
SELECT DISTINCT START_DT FROM m1.APPOINTMENT 

ADDRESS.CHRISTIANNAME is a VARCHAR field and APPOINTMENT.START_DT is a DATETIME.

The query works fine as long as there is no umlaut in the field CHRISTIANNAME.

If there's an umlaut I get s.th. like 'Firstname ??????????????' instead of 'Firstname äöüßÄÖÜ'. Running the query in the Query Browser works fine, too.

If I modify the query like this, it works fine with umlauts.

SELECT DISTINCT CHRISTIANNAME FROM m1.ADDRESS
UNION ALL
SELECT DISTINCT CAST(START_DT AS CHAR) FROM m1.APPOINTMENT 

Best regards

Martin

How to repeat:
Run a query with UNION and a VARCHAR and a DATETIME field.
[22 Oct 2007 9:26] Tonci Grgin
Hi Martin and thanks for your report. I am sorry but there is a lot of info / data missing here... Can you please attach/provide:
 - MySQL server version / host OS
 - my.ini/my.cnf file used for starting server
 - JDK version
 - small but complete test case (incl. connect string!)
 - data dump for tables in question (preferably in format suitable for import via mysql cl client
[22 Oct 2007 10:53] Martin Kirchner
My server version is 5.0.45 (community).
JDK: jdk1.5.0_06

Testcase:

		final String select = "SELECT DISTINCT CHRISTIANNAME FROM m1.ADDRESS WHERE GGUID = 0x648E7A2F243736918DCCC3955DC6EE7C"
				+ " UNION ALL "
				+ "SELECT DISTINCT START_DT FROM m1.APPOINTMENT WHERE GGUID = 0x648E7A2F243736918DCCC3955DC6EE7C";
		final String selectWithCast = "SELECT DISTINCT CHRISTIANNAME AS PLUSINFO FROM m1.ADDRESS WHERE GGUID = 0x648E7A2F243736918DCCC3955DC6EE7C"
				+ " UNION ALL "
				+ "SELECT DISTINCT CAST(START_DT AS CHAR) AS PLUSINFO FROM m1.APPOINTMENT WHERE GGUID = 0x648E7A2F243736918DCCC3955DC6EE7C";

		Class.forName("org.gjt.mm.mysql.Driver").newInstance();
		Connection dbConn = null;
		try {
			dbConn = DriverManager.getConnection("jdbc:mysql://localhost:3306/m1", "myUser", "myPassword");

			final Statement statement = dbConn.createStatement();
			final ResultSet rs = statement.executeQuery(select);
			while (rs.next()) {
				System.out.println("Without Cast: " + rs.getString(1));
			}

			final ResultSet rsWithCast = statement.executeQuery(selectWithCast);
			while (rsWithCast.next()) {
				System.out.println("With Cast: " + rsWithCast.getString(1));
			}

		} catch (SQLException e) {
			e.printStackTrace();
		} finally {
			if (dbConn != null) {
				try {
					dbConn.close();
				} catch (SQLException e) {
					e.printStackTrace();
				}
			}
		}
[12 Dec 2007 15:10] Florian Scandella
this is also a problem if you have a union over 2 same types of the same character encoding it it is the return value of a function call. it seems like union forgets the encoding/collation.

DROP TABLE IF EXISTS union_test;
CREATE TABLE union_test(
	id  integer,
	abc varchar(100),
	PRIMARY KEY(id)
) ENGINE = InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

insert into union_test(abc) values ('öäü');

delimiter |
DROP FUNCTION IF EXISTS get_text|
CREATE FUNCTION get_text(id integer) RETURNS VARCHAR(100) CHARACTER SET utf8
READS SQL DATA 
BEGIN
    DECLARE ret VARCHAR(100)  CHARACTER SET utf8;
    SELECT u.abc INTO ret FROM union_test u WHERE u.id=id; 
    RETURN ret;
END|
delimiter ;

select get_text(id) as name from union_test;
select get_text(id) as name from union_test union select get_text(id) as name from union_test;
select convert(get_text(id) using utf8) as name from union_test union select get_text(id) as name from union_test;
[12 Dec 2007 15:11] Florian Scandella
the results for the last script ..

+--------+
| name   |
+--------+
| öäü    | 
+--------+
1 row in set (0.01 sec)

+------+
| name |
+------+
| ���     | 
+------+
1 row in set (0.01 sec)

+--------+
| name   |
+--------+
| öäü    | 
|        | 
+--------+
2 rows in set (0.01 sec)
[13 Dec 2007 6:50] Tonci Grgin
Martin, I believe the source of this problem is described in Bug#28577 submitted by you. Let me check again.
[13 Dec 2007 12:52] Tonci Grgin
Martin, Florian, can any of you try running your queries in cl client (mysql) please but with -T option? I would like to see the output.
[13 Dec 2007 21:00] Florian Scandella
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 684532
Server version: 5.0.18

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> \. mysqlbug.sql
Query OK, 0 rows affected (0.01 sec)

Query OK, 0 rows affected (0.08 sec)

Query OK, 1 row affected (0.02 sec)

Query OK, 0 rows affected (0.02 sec)

Query OK, 0 rows affected (0.02 sec)

Field   1:  `name`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       VAR_STRING
Collation:  latin1_swedish_ci (8)
Length:     100
Max_length: 6
Decimals:   0
Flags:      

+--------+
| name   |
+--------+
| öäü    | 
+--------+
1 row in set (0.02 sec)

Field   1:  `name`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       VAR_STRING
Collation:  binary (63)
Length:     100
Max_length: 3
Decimals:   0
Flags:      BINARY 

+------+
| name |
+------+
| ���     | 
+------+
1 row in set (0.01 sec)

Field   1:  `name`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       VAR_STRING
Collation:  utf8_general_ci (33)
Length:     300
Max_length: 6
Decimals:   0
Flags:      

+--------+
| name   |
+--------+
| öäü    | 
|        | 
+--------+
2 rows in set (0.01 sec)

mysql> Bye

User time 0.01, System time 0.01
Maximum resident set size 0, Integral resident set size 0
Non-physical pagefaults 1111, Physical pagefaults 0, Swaps 0
Blocks in 0 out 0, Messages in 0 out 0, Signals 0
Voluntary context switches 88, Involuntary context switches 2
[13 Dec 2007 21:04] Florian Scandella
ok, i just realized, it works on my local mysql installation (gentoo, mysql 5.0.44) .. so it seems my problem is allready fixed in newer versions.

-----

Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.0.44-log Gentoo Linux mysql-5.0.44-r2

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> \. mysqlbug.sql
Query OK, 0 rows affected, 1 warning (0.00 sec)

Query OK, 0 rows affected (0.02 sec)

Query OK, 1 row affected (0.00 sec)

Query OK, 0 rows affected, 1 warning (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Field   1:  `name`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       VAR_STRING
Collation:  utf8_general_ci (33)
Length:     300
Max_length: 6
Decimals:   0
Flags:      

+--------+
| name   |
+--------+
| öäü    | 
+--------+
1 row in set (0.00 sec)

Field   1:  `name`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       VAR_STRING
Collation:  utf8_general_ci (33)
Length:     300
Max_length: 6
Decimals:   0
Flags:      

+--------+
| name   |
+--------+
| öäü    | 
+--------+
1 row in set (0.00 sec)

Field   1:  `name`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       VAR_STRING
Collation:  utf8_general_ci (33)
Length:     900
Max_length: 6
Decimals:   0
Flags:      

+--------+
| name   |
+--------+
| öäü    | 
+--------+
1 row in set (0.00 sec)
[19 Dec 2007 13:39] Tonci Grgin
Thank you for your bug report. This issue has already been fixed in the latest released version of that product, which you can download at

  http://www.mysql.com/downloads/

Explanation: Closing this as per Florian's last post.