Bug #92209 AVG(YEAR(datetime_field)) makes an error result beacuse of overflow
Submitted: 28 Aug 11:08 Modified: 28 Aug 12:53
Reporter: Pin Lin Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.0.12, 5.7.23, 5.6.41, 5.5.61 OS:Any
Assigned to: CPU Architecture:Any

[28 Aug 11:08] Pin Lin
Description:
when I execute a query as follows, aggregate function AVG(YEAR(xx)) makes an error result.
mysql> select count(distinct name),avg(year(date1))as aa,class from linpin.t1 group by class order by aa;
+----------------------+----------+--------+
| count(distinct name) | aa       | class  |
+----------------------+----------+--------+
|                    2 | 999.9999 | class1 |
|                    2 | 999.9999 | class2 |
+----------------------+----------+--------+

The query above should produce a result as fllows.
mysql> select count(distinct name),avg(year(date1))as aa,class from linpin.t1 group by class order by aa;
+----------------------+-----------+--------+
| count(distinct name) | aa        | class  |
+----------------------+-----------+--------+
|                    2 | 2002.0000 | class2 |
|                    2 | 2013.5000 | class1 |
+----------------------+-----------+--------+

Obviously, Overflow occurred during the calculation of avg(year(date1)).

How to repeat:
1. create a table and insert some rows
create table if not exists linpin.t1(id int key,name varchar(10),class char(10), date1 datetime);

insert into linpin.t1(id,name,class,date1)values(1,'zhangsan','class1','2014-01-14 15:32:17');
insert into linpin.t1(id,name,class,date1)values(2,'lisi','class1','2013-12-14 10:21:27');
insert into linpin.t1(id,name,class,date1)values(3,'wangwu','class2','2003-05-21 08:25:14');
insert into linpin.t1(id,name,class,date1)values(4,'zhaoliu','class2','2001-07-19 09:35:18');

2. execute the query as follows
mysql> select count(distinct name),avg(year(date1))as aa,class from linpin.t1 group by class order by aa; 
+----------------------+----------+--------+
| count(distinct name) | aa       | class  |
+----------------------+----------+--------+
|                    2 | 999.9999 | class1 |
|                    2 | 999.9999 | class2 |
+----------------------+----------+--------+

Suggested fix:
Mysql creates a temp table for filesort and create a Field_new_decimal object for avg(year(date1)).Precision of this decimal field is 7, the number of dicimal digit is 4, the number of Integer digit is 3.

The result of avg(year(date1)) is 2002.0000 and 2013.5000 ,the number of integer digit is 4.

The result of avg(year(date1)) overflows.

The error was made in code as follows.

Field *Field_new_decimal::create_from_item (Item *item)
{
	uint8 dec= item->decimals;  // dec = 4
	uint32 len= item->max_char_length(); // len = 9
	return new Field_new_decimal(len, item->maybe_null, item->item_name.ptr(), dec, item->unsigned_flag);  
}

Field_new_decimal::Field_new_decimal(uchar *ptr_arg,
                                     uint32 len_arg, uchar *null_ptr_arg,
                                     uchar null_bit_arg,
                                     enum utype unireg_check_arg,
                                     const char *field_name_arg,
                                     uint8 dec_arg,bool zero_arg,
                                     bool unsigned_arg)
  :Field_num(ptr_arg, len_arg, null_ptr_arg, null_bit_arg,
             unireg_check_arg, field_name_arg, dec_arg, zero_arg, unsigned_arg)
{
  // precision=7,dec_arg=4, so intg = 7-4 = 3
  precision= my_decimal_length_to_precision(len_arg, dec_arg, unsigned_arg);
}

inline uint my_decimal_length_to_precision(uint length, uint scale,
                                           bool unsigned_flag)
{
  /* Precision can't be negative thus ignore unsigned_flag when length is 0. */
 // precision = 9 - 1 - 1 = 7
  uint retval= (uint) (length - (scale>0 ? 1:0) -
                 (unsigned_flag || !length ? 0:1));
  return retval;
}

The precision of avg(year(date1)) comes from year(date1). and the length of year() is fixed as 4.
I think he length of year() shoulde be fixed as 5 because of sign for "+" and "-".

void Item_func_year::fix_length_and_dec()
{ 
  /* if fix_char_length(5), it doesn't overflow
  fix_char_length(4); /* 9999, sign*/
  maybe_null=1;
}
[28 Aug 12:53] Umesh Shastry
Hello Pin Lin,

Thank you for the report!

regards,
Umesh