Bug #24791 Union with AVG-groups generates wrong results
Submitted: 4 Dec 2006 10:02 Modified: 5 Apr 2007 19:57
Reporter: Arjen lastname Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: General Severity:S3 (Non-critical)
Version:4.1.21/5.0/5.1BK OS:Any
Assigned to: Martin Hansson CPU Architecture:Any

[4 Dec 2006 10:02] Arjen lastname
Description:
When doing this query on the supplied test-table, the result of the AVG's are wrong:
SELECT 7406, AVG(runtime)
FROM test2
WHERE testrun_id = 7406
UNION
SELECT 7499, AVG(runtime)
FROM test2
WHERE testrun_id = 7499

The result is:
7406  	10000.000000000000000000
7499 	10000.000000000000000000

While it should be:
7406  	13558.794316406250800355
7499    14373.432968749999417923

With only 10 records per testrun its ok.

How to repeat:
Load the attached table and do the above query
[4 Dec 2006 10:02] Arjen lastname
Sample data for this issue

Attachment: union_avg_error_sample.sql (x-extension/sql, text), 11.27 KiB.

[4 Dec 2006 11:59] Tino Zijdel
Confirmed this bug also in 4.1.22
[4 Dec 2006 20:19] Miguel Solorzano
Thank you for the bug report. Please read:

http://dev.mysql.com/doc/refman/4.1/en/problems-with-float.html

please use DECIMAL data type with > 5.0 server:

mysql> select version();
+---------------------+
| version()           |
+---------------------+
| 5.0.27-community-nt |
+---------------------+
1 row in set (0.05 sec)

mysql> SELECT 7406, AVG(runtime)
    -> FROM test2
    -> WHERE testrun_id = 7406
    -> UNION
    -> SELECT 7499, AVG(runtime)
    -> FROM test2
    -> WHERE testrun_id = 7499
    -> ;
+------+--------------------------+
| 7406 | AVG(runtime)             |
+------+--------------------------+
| 7406 | 10000.000000000000000000 |
| 7499 | 10000.000000000000000000 |
+------+--------------------------+
2 rows in set (0.08 sec)

mysql> show create table test2\G
*************************** 1. row ***************************
       Table: test2
Create Table: CREATE TABLE `test2` (
  `testrun_id` smallint(5) unsigned NOT NULL default '0',
  `runtime` float(20,14) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> alter table test2 modify runtime decimal(20,14);
Query OK, 200 rows affected (0.14 sec)
Records: 200  Duplicates: 0  Warnings: 0

mysql> SELECT 7406, AVG(runtime)
    -> FROM test2
    -> WHERE testrun_id = 7406
    -> UNION
    -> SELECT 7499, AVG(runtime)
    -> FROM test2
    -> WHERE testrun_id = 7499
    -> ;
+------+--------------------------+
| 7406 | AVG(runtime)             |
+------+--------------------------+
| 7406 | 13558.794316406250000000 |
| 7499 | 14373.432968750000000000 |
+------+--------------------------+
2 rows in set (0.02 sec)

mysql>

Thanks in advance.
[4 Dec 2006 20:44] Arjen lastname
I'm sorry for reopening this bug, but I forgot to mention something pretty important:

When doing the seperate queries (SELECT 7406, AVG(runtime)
FROM test2 WHERE testrun_id = 7406 and with 7499) the results are just fine, so mysql *can* produce correct results from those lists of float's. But when combined in a union the results are the ones in this report.

So its not just a floating point issue with severe "rounding" errors (this is by no means a minor rounding error!) it also happens that the results of two queries unioned together are all of a sudden different then when not used in a union... that may never happen.
[6 Jan 2007 1:30] Miguel Solorzano
Thank you for the feedback.

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.0.34-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> SELECT 7406, AVG(runtime)
    -> FROM test2
    -> WHERE testrun_id = 7406;
+------+--------------------------+
| 7406 | AVG(runtime)             |
+------+--------------------------+
| 7406 | 13558.794316406250800355 | 
+------+--------------------------+
1 row in set (0.26 sec)

mysql> SELECT 7499, AVG(runtime)
    -> FROM test2
    -> WHERE testrun_id = 7499;
+------+--------------------------+
| 7499 | AVG(runtime)             |
+------+--------------------------+
| 7499 | 14373.432968749999417923 | 
+------+--------------------------+
1 row in set (0.01 sec)

mysql> SELECT 7406, AVG(runtime)
    -> FROM test2
    -> WHERE testrun_id = 7406
    -> UNION
    -> SELECT 7499, AVG(runtime)
    -> FROM test2
    -> WHERE testrun_id = 7499;
+------+--------------------------+
| 7406 | AVG(runtime)             |
+------+--------------------------+
| 7406 | 10000.000000000000000000 | 
| 7499 | 10000.000000000000000000 | 
+------+--------------------------+
2 rows in set (0.10 sec)
[2 Mar 2007 17:58] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/21045

ChangeSet@1.2435, 2007-03-02 19:03:01+01:00, mhansson@linux-st28.site +6 -0
  Bug#24791: Union with AVG-groups generates wrong results
  
  A UNION query in MySQL creates a temporary table for each SELECT clause
  in the query, to store the result of the partial query. The procedure 
  that stores these results was being too meticulous in trying to store 
  FLOAT and DOUBLE values with exact precision, causing values that had too
  high precision to lose integer digits in favor of decimal digits. This 
  can, if the maximum precision is, say, 5 significant digits, lead to a 
  value of 123.000 to be returned from a UNION query as 10.000. 
  
  The maximum precision for getting non-approximate values is hardware 
  specific. Please refer to
  http://dev.mysql.com/doc/internals/en/floating-point-types.html
  for a discussion of this.
  
  The obvious fix is of course to switch to non-fixed precision
  (using approximatations for all values with too many significant
  digits). This is how normal INSERT statements work. 
  
  Here we run into a design problem. The class Item_type_holder 
  represents fields in the result tuple, and it is this class that has
  the final word in setting the number of significant digits prior to
  storing. However, at this point (The method
  Item_type_holder::join_types), all field metadata is gone.
  
  Item_type_holders are constructed from Item objects, a questionable
  design, but this is a fact. So there is no option but to create a 
  means of passing the information about significant digits (or more
  generally speaking, minimum field length) on the Item objects. We do this
  with the member min_length.
  
  Hence, Item_field objects initialize their min_length (as the new field
  is called) from their Field's field_length, and this will be propagated
  all the way to the Item_type_holder through its constructor.
  
  But there is one more twist to the story: Division operations, / and 
  AVG() increase the precision of the result according to the system
  variable div_prec_increment, so when the Item's that represent these
  operations fixate their precision, they must also increase their
  minimum field length.
[14 Mar 2007 17:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/21927

ChangeSet@1.2479, 2007-03-14 18:37:58+01:00, mhansson@linux-st28.site +8 -0
  Bug #24791: Union with AVG-groups generates wrong results
  
  The problem in this bug is when we create temporary tables. When
  temporary tables are created for unions, there is some 
  inferrence being carried out regarding the type of the column.
  Whenever this column type is inferred to be REAL (i.e. FLOAT or
  DOUBLE), MySQL will always try to maintain exact precision, and
  if that is not possible (there are hardware limits, since FLOAT
  and DOUBLE are stored as approximate values) will switch to
  using approximate values. The problem here is that at this point
  the information about number of significant digits is not 
  available. Furthermore, the number of significant digits should
  be increased for the AVG function, however, this was not properly 
  handled. There are 4 parts to the problem:
  
  #1: DOUBLE and FLOAT fields don't display their proper display 
  lengths in max_display_length(). This is hard-coded as 53 for 
  DOUBLE and 24 for FLOAT. Now changed to instead return the 
  field_length.
  
  #2: Type holders for temporary tables do not preserve the 
  max_length of the Item's from which they are created, and is 
  instead reverted to the 53 and 24 from above. This causes 
  *all* fields to get non-fixed significant digits.
  
  #3: AVG function does not update max_length (display length)
  when updating number of decimals.
  
  #4: The function that switches to non-fixed number of 
  significant digits should use DBL_DIG + 2 or FLT_DIG + 2 as 
  cut-off values (Since fixed precision does not use the 'e' 
  notation)
  
  Of these points, #1 is the controversial one, but this 
  change is preferred and has been cleared with Monty. The 
  function causes quite a few unit tests to blow up and they had
  to b changed, but each one is annotated and motivated. We 
  frequently see the magical 53 and 24 give way to more relevant
  numbers.
[22 Mar 2007 9:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/22579

ChangeSet@1.2479, 2007-03-22 10:56:47+01:00, mhansson@linux-st28.site +7 -0
  Bug #24791: Union with AVG-groups generates wrong results
  
  The problem in this bug is when we create temporary tables. When
  temporary tables are created for unions, there is some 
  inferrence being carried out regarding the type of the column.
  Whenever this column type is inferred to be REAL (i.e. FLOAT or
  DOUBLE), MySQL will always try to maintain exact precision, and
  if that is not possible (there are hardware limits, since FLOAT
  and DOUBLE are stored as approximate values) will switch to
  using approximate values. The problem here is that at this point
  the information about number of significant digits is not 
  available. Furthermore, the number of significant digits should
  be increased for the AVG function, however, this was not properly 
  handled. There are 4 parts to the problem:
  
  #1: DOUBLE and FLOAT fields don't display their proper display 
  lengths in max_display_length(). This is hard-coded as 53 for 
  DOUBLE and 24 for FLOAT. Now changed to instead return the 
  field_length.
  
  #2: Type holders for temporary tables do not preserve the 
  max_length of the Item's from which they are created, and is 
  instead reverted to the 53 and 24 from above. This causes 
  *all* fields to get non-fixed significant digits.
  
  #3: AVG function does not update max_length (display length)
  when updating number of decimals.
  
  #4: The function that switches to non-fixed number of 
  significant digits should use DBL_DIG + 2 or FLT_DIG + 2 as 
  cut-off values (Since fixed precision does not use the 'e' 
  notation)
  
  Of these points, #1 is the controversial one, but this 
  change is preferred and has been cleared with Monty. The 
  function causes quite a few unit tests to blow up and they had
  to b changed, but each one is annotated and motivated. We 
  frequently see the magical 53 and 24 give way to more relevant
  numbers.
[22 Mar 2007 13:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/22638

ChangeSet@1.2492, 2007-03-22 14:58:43+01:00, mhansson@linux-st28.site +1 -0
  Bug#24791: Union with AVG-groups generates wrong results
  
  Patch appled after doing a pull from the team tree. Additional tests had to be
  fixed
[23 Mar 2007 11:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/22757

ChangeSet@1.2493, 2007-03-23 12:26:23+01:00, mhansson@linux-st28.site +1 -0
  Bug #24791: Union with AVG-groups generates wrong results
  
  New test case to cover the scenario when the temp table field is of type FLOAT
  rather than DOUBLE (fields that stem from division or AVG are always DOUBLE)
[23 Mar 2007 13:55] Alexey Botchkov
Pushed in 5.0.40 and 5.1.18
[5 Apr 2007 19:57] Paul Dubois
Noted in 5.0.40, 5.1.18 changelogs.

Selecting the result of AVG() within a UNION could produce incorrect
values.