Bug #24791 | Union with AVG-groups generates wrong results | ||
---|---|---|---|
Submitted: | 4 Dec 2006 10:02 | Modified: | 5 Apr 2007 19:57 |
Reporter: | Arjen lastname | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: General | Severity: | S3 (Non-critical) |
Version: | 4.1.21/5.0/5.1BK | OS: | Any |
Assigned to: | Martin Hansson | CPU Architecture: | Any |
[4 Dec 2006 10:02]
Arjen lastname
[4 Dec 2006 10:02]
Arjen lastname
Sample data for this issue
Attachment: union_avg_error_sample.sql (x-extension/sql, text), 11.27 KiB.
[4 Dec 2006 11:59]
Tino Zijdel
Confirmed this bug also in 4.1.22
[4 Dec 2006 20:19]
MySQL Verification Team
Thank you for the bug report. Please read: http://dev.mysql.com/doc/refman/4.1/en/problems-with-float.html please use DECIMAL data type with > 5.0 server: mysql> select version(); +---------------------+ | version() | +---------------------+ | 5.0.27-community-nt | +---------------------+ 1 row in set (0.05 sec) mysql> SELECT 7406, AVG(runtime) -> FROM test2 -> WHERE testrun_id = 7406 -> UNION -> SELECT 7499, AVG(runtime) -> FROM test2 -> WHERE testrun_id = 7499 -> ; +------+--------------------------+ | 7406 | AVG(runtime) | +------+--------------------------+ | 7406 | 10000.000000000000000000 | | 7499 | 10000.000000000000000000 | +------+--------------------------+ 2 rows in set (0.08 sec) mysql> show create table test2\G *************************** 1. row *************************** Table: test2 Create Table: CREATE TABLE `test2` ( `testrun_id` smallint(5) unsigned NOT NULL default '0', `runtime` float(20,14) default NULL ) ENGINE=MyISAM DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql> alter table test2 modify runtime decimal(20,14); Query OK, 200 rows affected (0.14 sec) Records: 200 Duplicates: 0 Warnings: 0 mysql> SELECT 7406, AVG(runtime) -> FROM test2 -> WHERE testrun_id = 7406 -> UNION -> SELECT 7499, AVG(runtime) -> FROM test2 -> WHERE testrun_id = 7499 -> ; +------+--------------------------+ | 7406 | AVG(runtime) | +------+--------------------------+ | 7406 | 13558.794316406250000000 | | 7499 | 14373.432968750000000000 | +------+--------------------------+ 2 rows in set (0.02 sec) mysql> Thanks in advance.
[4 Dec 2006 20:44]
Arjen lastname
I'm sorry for reopening this bug, but I forgot to mention something pretty important: When doing the seperate queries (SELECT 7406, AVG(runtime) FROM test2 WHERE testrun_id = 7406 and with 7499) the results are just fine, so mysql *can* produce correct results from those lists of float's. But when combined in a union the results are the ones in this report. So its not just a floating point issue with severe "rounding" errors (this is by no means a minor rounding error!) it also happens that the results of two queries unioned together are all of a sudden different then when not used in a union... that may never happen.
[6 Jan 2007 1:30]
MySQL Verification Team
Thank you for the feedback. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 4 Server version: 5.0.34-debug Source distribution Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> SELECT 7406, AVG(runtime) -> FROM test2 -> WHERE testrun_id = 7406; +------+--------------------------+ | 7406 | AVG(runtime) | +------+--------------------------+ | 7406 | 13558.794316406250800355 | +------+--------------------------+ 1 row in set (0.26 sec) mysql> SELECT 7499, AVG(runtime) -> FROM test2 -> WHERE testrun_id = 7499; +------+--------------------------+ | 7499 | AVG(runtime) | +------+--------------------------+ | 7499 | 14373.432968749999417923 | +------+--------------------------+ 1 row in set (0.01 sec) mysql> SELECT 7406, AVG(runtime) -> FROM test2 -> WHERE testrun_id = 7406 -> UNION -> SELECT 7499, AVG(runtime) -> FROM test2 -> WHERE testrun_id = 7499; +------+--------------------------+ | 7406 | AVG(runtime) | +------+--------------------------+ | 7406 | 10000.000000000000000000 | | 7499 | 10000.000000000000000000 | +------+--------------------------+ 2 rows in set (0.10 sec)
[2 Mar 2007 17:58]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/21045 ChangeSet@1.2435, 2007-03-02 19:03:01+01:00, mhansson@linux-st28.site +6 -0 Bug#24791: Union with AVG-groups generates wrong results A UNION query in MySQL creates a temporary table for each SELECT clause in the query, to store the result of the partial query. The procedure that stores these results was being too meticulous in trying to store FLOAT and DOUBLE values with exact precision, causing values that had too high precision to lose integer digits in favor of decimal digits. This can, if the maximum precision is, say, 5 significant digits, lead to a value of 123.000 to be returned from a UNION query as 10.000. The maximum precision for getting non-approximate values is hardware specific. Please refer to http://dev.mysql.com/doc/internals/en/floating-point-types.html for a discussion of this. The obvious fix is of course to switch to non-fixed precision (using approximatations for all values with too many significant digits). This is how normal INSERT statements work. Here we run into a design problem. The class Item_type_holder represents fields in the result tuple, and it is this class that has the final word in setting the number of significant digits prior to storing. However, at this point (The method Item_type_holder::join_types), all field metadata is gone. Item_type_holders are constructed from Item objects, a questionable design, but this is a fact. So there is no option but to create a means of passing the information about significant digits (or more generally speaking, minimum field length) on the Item objects. We do this with the member min_length. Hence, Item_field objects initialize their min_length (as the new field is called) from their Field's field_length, and this will be propagated all the way to the Item_type_holder through its constructor. But there is one more twist to the story: Division operations, / and AVG() increase the precision of the result according to the system variable div_prec_increment, so when the Item's that represent these operations fixate their precision, they must also increase their minimum field length.
[14 Mar 2007 17:55]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/21927 ChangeSet@1.2479, 2007-03-14 18:37:58+01:00, mhansson@linux-st28.site +8 -0 Bug #24791: Union with AVG-groups generates wrong results The problem in this bug is when we create temporary tables. When temporary tables are created for unions, there is some inferrence being carried out regarding the type of the column. Whenever this column type is inferred to be REAL (i.e. FLOAT or DOUBLE), MySQL will always try to maintain exact precision, and if that is not possible (there are hardware limits, since FLOAT and DOUBLE are stored as approximate values) will switch to using approximate values. The problem here is that at this point the information about number of significant digits is not available. Furthermore, the number of significant digits should be increased for the AVG function, however, this was not properly handled. There are 4 parts to the problem: #1: DOUBLE and FLOAT fields don't display their proper display lengths in max_display_length(). This is hard-coded as 53 for DOUBLE and 24 for FLOAT. Now changed to instead return the field_length. #2: Type holders for temporary tables do not preserve the max_length of the Item's from which they are created, and is instead reverted to the 53 and 24 from above. This causes *all* fields to get non-fixed significant digits. #3: AVG function does not update max_length (display length) when updating number of decimals. #4: The function that switches to non-fixed number of significant digits should use DBL_DIG + 2 or FLT_DIG + 2 as cut-off values (Since fixed precision does not use the 'e' notation) Of these points, #1 is the controversial one, but this change is preferred and has been cleared with Monty. The function causes quite a few unit tests to blow up and they had to b changed, but each one is annotated and motivated. We frequently see the magical 53 and 24 give way to more relevant numbers.
[22 Mar 2007 9:56]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/22579 ChangeSet@1.2479, 2007-03-22 10:56:47+01:00, mhansson@linux-st28.site +7 -0 Bug #24791: Union with AVG-groups generates wrong results The problem in this bug is when we create temporary tables. When temporary tables are created for unions, there is some inferrence being carried out regarding the type of the column. Whenever this column type is inferred to be REAL (i.e. FLOAT or DOUBLE), MySQL will always try to maintain exact precision, and if that is not possible (there are hardware limits, since FLOAT and DOUBLE are stored as approximate values) will switch to using approximate values. The problem here is that at this point the information about number of significant digits is not available. Furthermore, the number of significant digits should be increased for the AVG function, however, this was not properly handled. There are 4 parts to the problem: #1: DOUBLE and FLOAT fields don't display their proper display lengths in max_display_length(). This is hard-coded as 53 for DOUBLE and 24 for FLOAT. Now changed to instead return the field_length. #2: Type holders for temporary tables do not preserve the max_length of the Item's from which they are created, and is instead reverted to the 53 and 24 from above. This causes *all* fields to get non-fixed significant digits. #3: AVG function does not update max_length (display length) when updating number of decimals. #4: The function that switches to non-fixed number of significant digits should use DBL_DIG + 2 or FLT_DIG + 2 as cut-off values (Since fixed precision does not use the 'e' notation) Of these points, #1 is the controversial one, but this change is preferred and has been cleared with Monty. The function causes quite a few unit tests to blow up and they had to b changed, but each one is annotated and motivated. We frequently see the magical 53 and 24 give way to more relevant numbers.
[22 Mar 2007 13:57]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/22638 ChangeSet@1.2492, 2007-03-22 14:58:43+01:00, mhansson@linux-st28.site +1 -0 Bug#24791: Union with AVG-groups generates wrong results Patch appled after doing a pull from the team tree. Additional tests had to be fixed
[23 Mar 2007 11:26]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/22757 ChangeSet@1.2493, 2007-03-23 12:26:23+01:00, mhansson@linux-st28.site +1 -0 Bug #24791: Union with AVG-groups generates wrong results New test case to cover the scenario when the temp table field is of type FLOAT rather than DOUBLE (fields that stem from division or AVG are always DOUBLE)
[23 Mar 2007 13:55]
Alexey Botchkov
Pushed in 5.0.40 and 5.1.18
[5 Apr 2007 19:57]
Paul DuBois
Noted in 5.0.40, 5.1.18 changelogs. Selecting the result of AVG() within a UNION could produce incorrect values.