Bug #80660 MySQL could detect functional dependency on group expressions
Submitted: 8 Mar 2016 16:38
Reporter: Guilhem Bichot Email Updates:
Status: Open Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S4 (Feature request)
Version:5.7 OS:Any
Assigned to: CPU Architecture:Any

[8 Mar 2016 16:38] Guilhem Bichot
Description:
Since MySQL 5.7, the server supports detection that expressions belonging to the SELECT list and HAVING clause and ORDER BY clause are functionally dependent on columns listed in the GROUP BY clause. This is an optional feature of SQL2011.

Moreover, MySQL has always allowed grouping on expressions, like:
select year(a), count(b) from t group by year(a);
This is a non-standard query ; SQL2011 imposes that the GROUP BY must only list columns.
MySQL accepts it but requires that selected expressions are identical to group expressions.

When a query mixes the two features together, MySQL doesn't try to spot functional dependencies between an _expression_ in GROUP BY and an expression belonging to the SELECT list (or HAVING/ORDER BY), so
  select 1+year(a), count(b) from t group by year(a);
will be rejected with error, unless only_full_group_by is turned off from sql_mode. MySQL doesn't try to see that 1+year(a) is a deterministic expression of the value of year(a). MySQL doesn't analyze selected expressions to see if they contain only constants and year(a) (and potentially other group expressions if GROUP BY has several of them).

A workaround is to name the group expression, then MySQL treats it as a "pseudo-column" and spots it in selected expressions:
  select year(a) as group_exp, 1+group_exp, count(b) from t group by group_exp;
is accepted.
It is non-standard to reference aliases in GROUP BY like this, on the other hand it's non-standard to group on an expression, so Standard compliance of the workaround is of no concern here.

A more common example, used to decide if NULLs should be first or last:
  select year(a), count(b) from t group by year(a) order by isnull(year(a));
Workaround:
  select year(a) as group_exp, count(b) from t group by group_exp
  order by isnull(group_exp);

The request is to accept the rejected queries. Inspired by:
http://mysqlserverteam.com/when-only_full_group_by-wont-see-the-query-is-deterministic/#co...

For HAVING there is another feature request to implement first: bug#24083 .
DDL for the table: create table t(a date, b int);

How to repeat:
see description

Suggested fix:
Walk an expression in the SELECT / HAVING / ORDER BY clause: ignore constants, match non-constants with the GROUP BY expressions (Item::eq()).

Do not try to be as smart as we are with group columns; for example, do not try to analyze equalities in WHERE to see that

  select year(a), count(b) from t
  where md5(a)=year(a)
  group by year(a) order by isnull(md5(a));

is an "acceptable" query because ORDER BY depends on something (MD5) which is equal to the group expression (YEAR).
[8 Mar 2016 16:52] Guilhem Bichot
To clarify the WHERE case in "suggested fix": if grouping on column, this is currently recognized as valid:

  select year(a), count(b) from t
  where a=c
  group by a order by isnull(c);