Bug #8076 inner from select column cannot be in outer select only in having part
Submitted: 21 Jan 2005 19:05 Modified: 16 Sep 2006 10:19
Reporter: [ name withheld ] Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server Severity:S4 (Feature request)
Version:MySQL 4.1.9 OS:Windows (Windows XP)
Assigned to: CPU Architecture:Any

[21 Jan 2005 19:05] [ name withheld ]
Description:
It seems that column from inner select CAN BE in HAVING part of outer select ONLY IF it is also in SELECT part of the outer select.

WORKS:
------
SELECT	1 AS outer_column, inner_column
FROM	(SELECT	2 AS inner_column) AS inner_table
GROUP BY outer_column
HAVING	inner_column>outer_column

DOESN'T WORK (#1054 - Unknown column 'inner_column' in 'having clause'):
-------------
SELECT	1 AS outer_column
FROM	(SELECT	2 AS inner_column) AS inner_table
GROUP BY outer_column
HAVING	inner_column>outer_column

Why inner_column MUST be present in SELECT part, if it is used in HAVING part?

How to repeat:
[Real life example]
Short description:
I have outer select with:
* inner select in FROM part and
* having part with column from inner select

Inner select returns 2 colums, where:
* first column is used in WHERE part of outer select and
* second column is used in HAVING part of outer select

CREATE TABLE `client` (
`client_id` INT NOT NULL AUTO_INCREMENT ,
`client_name` VARCHAR( 32 ) NOT NULL ,
`department` ENUM( 'dep1', 'dep2', 'dep3' ) NOT NULL ,
PRIMARY KEY ( `client_id` ) ,
INDEX ( `department` )
);
INSERT INTO `client` ( `client_id` , `client_name` , `department` )
VALUES	 ('', 'dep1NoCalls', 'dep1')
		,('', 'dep1ShortCalls', 'dep1')
		,('', 'dep1LongCalls', 'dep1')
		,('', 'dep2NoCalls', 'dep2')
		,('', 'dep2ShortCalls', 'dep2')
		,('', 'dep2LongCalls', 'dep2')
		,('', 'dep3NoCalls', 'dep3')
		,('', 'dep3ShortCalls', 'dep3')
		,('', 'dep3LongCalls', 'dep3')
		,('', 'dep3LongCalls2', 'dep3');

CREATE TABLE `client_call` (
`client_call_id` INT NOT NULL AUTO_INCREMENT ,
`client_id` INT NOT NULL ,
`call_start` INT NOT NULL ,
`call_duration` INT NOT NULL ,
PRIMARY KEY ( `client_call_id` ) ,
INDEX ( `client_id` )
);
INSERT INTO `client_call` ( `client_call_id` , `client_id` , `call_start` , `call_duration` )
VALUES	 ('', '2', '0', '15')
		,('', '2', '0', '20')
		,('', '3', '0', '150')
		,('', '3', '0', '200')
	 	,('', '5', '0', '1500')
		,('', '6', '0', '15000')
	 	,('', '8', '0', '1')
		,('', '8', '0', '2')
		,('', '8', '0', '3')
		,('', '9', '0', '10')
		,('', '9', '0', '20')
		,('', '10', '0', '20');

SELECT		c.client_name
			,c.department
			,department_avg_call_duration
FROM		 client			AS c
			,client_call	AS cc
			,(	SELECT		 c2.department			AS department_avg_department
							,AVG(cc2.call_duration)	AS department_avg_call_duration
				FROM		 client					AS c2
							,client_call			AS cc2
				WHERE		c2.client_id=cc2.client_id
				GROUP BY	c2.department
				HAVING		c2.department=c.department
			) AS department_avg
WHERE		cc.client_id=c.client_id
GROUP BY	c.client_id
HAVING		AVG(cc.call_duration)>1.3*department_avg_call_duration

WORKS:
------
SELECT		c.client_name
			,department_avg_call_duration
FROM		 client			AS c
			,client_call	AS cc
			,(	SELECT		 c2.department			AS department_avg_department
							,AVG(cc2.call_duration)	AS department_avg_call_duration
				FROM		 client					AS c2
							,client_call			AS cc2
				WHERE		c2.client_id=cc2.client_id
				GROUP BY	c2.department
			) AS department_avg
WHERE		cc.client_id=c.client_id
	AND		department_avg_department=c.department
GROUP BY	c.client_id
HAVING		AVG(cc.call_duration)>1.3*department_avg_call_duration

DOESN'T WORK:
-------------
SELECT		c.client_name
FROM		 client			AS c
			,client_call	AS cc
			,(	SELECT		 c2.department			AS department_avg_department
							,AVG(cc2.call_duration)	AS department_avg_call_duration
				FROM		 client					AS c2
							,client_call			AS cc2
				WHERE		c2.client_id=cc2.client_id
				GROUP BY	c2.department
			) AS department_avg
WHERE		cc.client_id=c.client_id
	AND		department_avg_department=c.department
GROUP BY	c.client_id
HAVING		AVG(cc.call_duration)>1.3*department_avg_call_duration

Why column department_avg_call_duration MUST be present in SELECT part, if it is used in HAVING part?

Suggested fix:
I think it should be possible to use column(s) from inner select in outer select having part, even if it is not present in outer select select part
[16 Sep 2006 10:19] Valeriy Kravchuk
"Inner" table in your case is a derived table, but still table. So, this:

SELECT	1 AS outer_column
FROM	(SELECT	2 AS inner_column) AS inner_table
GROUP BY outer_column
HAVING	inner_column>outer_column

is equivalent to:

mysql> create table ttt (c1 int, c2 int);
Query OK, 0 rows affected (0.03 sec)

mysql> select c1 from ttt
    -> group by c1
    -> having c2 > c1;
ERROR 1054 (42S22): Unknown column 'c2' in 'having clause'

This is a documented (and standard) behaviour (http://dev.mysql.com/doc/refman/5.1/en/select.html):

" The SQL standard requires that HAVING  must reference only columns in the GROUP BY clause or columns used in aggregate functions. However, MySQL supports an extension to this behavior, and allows HAVING to refer to columns in the SELECT list and columns in outer subqueries as well."

HAVING can not refer to something outside SELECT list.