MySQL Bugs: #92209: AVG(YEAR(datetime_field)) makes an error result beacuse of overflow

Bug #92209	AVG(YEAR(datetime_field)) makes an error result beacuse of overflow
Submitted:	28 Aug 2018 11:08	Modified:	27 Nov 2018 18:12
Reporter:	Pin Lin	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S2 (Serious)
Version:	8.0.12, 5.7.23, 5.6.41, 5.5.61	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
when I execute a query as follows, aggregate function AVG(YEAR(xx)) makes an error result.
mysql> select count(distinct name),avg(year(date1))as aa,class from linpin.t1 group by class order by aa;
+----------------------+----------+--------+
| count(distinct name) | aa       | class  |
+----------------------+----------+--------+
|                    2 | 999.9999 | class1 |
|                    2 | 999.9999 | class2 |
+----------------------+----------+--------+

The query above should produce a result as fllows.
mysql> select count(distinct name),avg(year(date1))as aa,class from linpin.t1 group by class order by aa;
+----------------------+-----------+--------+
| count(distinct name) | aa        | class  |
+----------------------+-----------+--------+
|                    2 | 2002.0000 | class2 |
|                    2 | 2013.5000 | class1 |
+----------------------+-----------+--------+

Obviously, Overflow occurred during the calculation of avg(year(date1)).

How to repeat:
1. create a table and insert some rows
create table if not exists linpin.t1(id int key,name varchar(10),class char(10), date1 datetime);

insert into linpin.t1(id,name,class,date1)values(1,'zhangsan','class1','2014-01-14 15:32:17');
insert into linpin.t1(id,name,class,date1)values(2,'lisi','class1','2013-12-14 10:21:27');
insert into linpin.t1(id,name,class,date1)values(3,'wangwu','class2','2003-05-21 08:25:14');
insert into linpin.t1(id,name,class,date1)values(4,'zhaoliu','class2','2001-07-19 09:35:18');

2. execute the query as follows
mysql> select count(distinct name),avg(year(date1))as aa,class from linpin.t1 group by class order by aa; 
+----------------------+----------+--------+
| count(distinct name) | aa       | class  |
+----------------------+----------+--------+
|                    2 | 999.9999 | class1 |
|                    2 | 999.9999 | class2 |
+----------------------+----------+--------+

Suggested fix:
Mysql creates a temp table for filesort and create a Field_new_decimal object for avg(year(date1)).Precision of this decimal field is 7, the number of dicimal digit is 4, the number of Integer digit is 3.

The result of avg(year(date1)) is 2002.0000 and 2013.5000 ,the number of integer digit is 4.

The result of avg(year(date1)) overflows.

The error was made in code as follows.

Field *Field_new_decimal::create_from_item (Item *item)
{
	uint8 dec= item->decimals;  // dec = 4
	uint32 len= item->max_char_length(); // len = 9
	return new Field_new_decimal(len, item->maybe_null, item->item_name.ptr(), dec, item->unsigned_flag);  
}

Field_new_decimal::Field_new_decimal(uchar *ptr_arg,
                                     uint32 len_arg, uchar *null_ptr_arg,
                                     uchar null_bit_arg,
                                     enum utype unireg_check_arg,
                                     const char *field_name_arg,
                                     uint8 dec_arg,bool zero_arg,
                                     bool unsigned_arg)
  :Field_num(ptr_arg, len_arg, null_ptr_arg, null_bit_arg,
             unireg_check_arg, field_name_arg, dec_arg, zero_arg, unsigned_arg)
{
  // precision=7,dec_arg=4, so intg = 7-4 = 3
  precision= my_decimal_length_to_precision(len_arg, dec_arg, unsigned_arg);
}

inline uint my_decimal_length_to_precision(uint length, uint scale,
                                           bool unsigned_flag)
{
  /* Precision can't be negative thus ignore unsigned_flag when length is 0. */
 // precision = 9 - 1 - 1 = 7
  uint retval= (uint) (length - (scale>0 ? 1:0) -
                 (unsigned_flag || !length ? 0:1));
  return retval;
}

The precision of avg(year(date1)) comes from year(date1). and the length of year() is fixed as 4.
I think he length of year() shoulde be fixed as 5 because of sign for "+" and "-".

void Item_func_year::fix_length_and_dec()
{ 
  /* if fix_char_length(5), it doesn't overflow
  fix_char_length(4); /* 9999, sign*/
  maybe_null=1;
}

Hello Pin Lin,

Thank you for the report!

regards,
Umesh

Documented fix as follows in the MySQL 8.0.14 changelog:

    An overflow occurred when calculating
    AVG(YEAR(datetime_column)).

Closed.

BUG#95045 is a regression of this fix.