MySQL Bugs: #43432: Union on floats does unnecessary rounding

Bug #43432	Union on floats does unnecessary rounding
Submitted:	5 Mar 2009 20:24	Modified:	15 May 2009 1:56
Reporter:	Leandro Morgado	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: DML	Severity:	S2 (Serious)
Version:	5.0.51, 5.1.31	OS:	Any
Assigned to:	Alexey Kopytov	CPU Architecture:	Any

Description:
Doing a UNION on FLOAT(9,6) rounds off the result to FLOAT(6,4) and FLOAT(6.5) unnecessarily. The manual says UNION will get the data types from the first select, which doesn't matter in this case as the tables has identical structure.

How to repeat:
CREATE TABLE `t1` (
`f` float(9,6) NOT NULL DEFAULT '0.000000'
);

INSERT INTO t1 VALUES(42.123456);

CREATE TABLE `t2` (
`f` float(9,6) NOT NULL DEFAULT '0.000000'
);

INSERT INTO t2 VALUES(-8.123456);

Sending the next queries, you will have the next results:
SELECT f FROM t1;
+-----------+
| f |
+-----------+
| 42.123455 |
+-----------+

SELECT f FROM t2;
+-----------+
| f |
+-----------+
| -8.123456 |
+-----------+

But if you send the next query, you will have the next result:
SELECT f FROM t1 UNION ALL SELECT f FROM t2;
+----------+
| f |
+----------+
| 42.1235 |
| -8.12346 |
+----------+

Suggested fix:
"SELECT f FROM t1 UNION ALL SELECT f FROM t2" should return:

+-----------+
| f |
+-----------+
| 42.123455 |
| -8.123456 |
+-----------+

This is not a bug in UNION, SELECT with a plain (not "fixed precision") FLOAT returns the same results:

mysql> create table t2 (f float(9,6), f1 float);

mysql> insert into t2 values (-8.123456, -8.123456);           

mysql> insert into t2 values (42.123456, 42.123456);

mysql> select * from t2;
+-----------+----------+
| f         | f1       |
+-----------+----------+
| -8.123456 | -8.12346 | 
| 42.123455 |  42.1235 | 
+-----------+----------+

And the reason why we print not-fixed-precision FLOATs and DOUBLEs with a maximum of 6 and 15 significant digits respectively is the hardware limits. Those values are defined as FLT_DIG and DBL_DIG constants in limits.h. Quoting http://www.gnu.org/software/libtool/manual/libc/Floating-Point-Parameters.html:

"
FLT_DIG
    This is the number of decimal digits of precision for the float data type. Technically, if p and b are the precision and base (respectively) for the representation, then the decimal precision q is the maximum number of decimal digits such that any floating point number with q base 10 digits can be rounded to a floating point number with p base b digits and back again, without change to the q decimal digits.

    The value of this macro is supposed to be at least 6, to satisfy ISO C.

DBL_DIG
LDBL_DIG
    These are similar to FLT_DIG, but for the data types double and long double, respectively. The values of these macros are supposed to be at least 10.
"

Now the problem with the non-standard FLOAT(M,D) and DOUBLE(M,D) syntax is that MySQL always displays them with D digits after the decimal point without consideration whether CPU can actually provide such precision. Also note that FLT_DIG and DBL_DIG refer to the total number of significant digits, not just those after the decimal point.

We cannot change the way we display FLOAT(M,D) and DOUBLE(M,D) since, as I understand it, it is a part of specification for these data types.

As a result, defining columns as FLOAT(M,D) with M > 6, or DOUBLE(M,D) with M > 15 does not make sense as the hardware cannot provide the required precision, not for all cases at least. The server could enforce that such columns can only be created with a meaningful precision specification, but I think that's a separate feature request that should be discussed separately, and will definitely not go into a stable release.

Speaking of the result difference in the test case for this bug, the reason is that UNION does data type aggregation which converts all fixed precision FLOAT/DOUBLE columns to non-fixed ones, since there may be multiple fixed-precision columns with different precisions in UNION. Therefore, FLOAT(9,6) ends up being displayed as a FLOAT value with 6 significant digits.

More thoughts after closer code inspection:

There are lots of inconsistencies in how FLOAT(M,D), DOUBLE(M,D) and DECIMAL(M,D) are handled internally. For example, some code expects M to be the total number of digits (as defined by the manual), while some code expects M to be the field width in characters. There are more examples of inconsistency. All that asks for a massive overhaul.

Even though my point about FLOAT(M,D)/DOUBLE(M,D) with M > FLT_DIG/DBL_DIG being meaningless still remains, for the time being it would be possible to provide a simple fix which ensures that data types of columns in a UNION result are preserved if data types of the corresponding columns in all SELECT statements are FLOAT(M,D) or DOUBLE(M,D) with the same M and D. We don't state such behavior in the manual, but on the other hand the manual does not claim otherwise. All the manual says in the regards is:

1. "Selected columns listed in corresponding positions of each SELECT statement should have the same data type."

Although the server does not enforce the above rule, so it's not clear how "should" in the above statement should be interpreted.

2. "If the data types of corresponding SELECT columns do not match, the types and lengths of the columns in the UNION result take into account the values retrieved by all of the SELECT statements."

Note that the proposed fix will not solve the UNION data type handling in all possible combinations, just the particular one used as a test case for this bug. Examples:

mysql> create table t1 (f float(9,6), f1 float, d decimal(9,6));

mysql> create table t2 as select f from t1 union select f1 from t1;

mysql> show create table t2\G -- no reasons to convert to float, we could get by with float(9,6)
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `f` float default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1

mysql> create table t3 as select f from t1 union select d from t1;

mysql> show create table t3\G -- no reasons to convert to double(11,6)
*************************** 1. row ***************************
       Table: t3
Create Table: CREATE TABLE `t3` (
  `f` double(11,6) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1

mysql> show create table t4\G -- no reasons to convert to float
*************************** 1. row ***************************
       Table: t4
Create Table: CREATE TABLE `t4` (
  `f` float default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1

So from the examples above, only the last case will be fixed to return float(9,6) as the result data type.

The examples in the last comment miss the 'create table' statement for t4:

create table t4 as select f from t1 union select f from t1;

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/70642

2774 Alexey Kopytov	2009-03-27
      Fix for bug #43432:  Union on floats does unnecessary rounding  
        
      UNION could convert fixed-point FLOAT(M,D)/DOUBLE(M,D) columns  
      to FLOAT/DOUBLE when aggregating data types from the SELECT  
      substatements. While there is nothing particularly wrong with  
      this behavior, especially when M is greater than the hardware  
      precision limits, it could be confusing in cases when all  
      SELECT statements in a union have the same  
      FLOAT(M,D)/DOUBLE(M,D) columns with equal precision  
      specifications listed in the same position.  
        
      Since the manual is quite vague on what data type should be  
      returned in such cases, the bug was fixed by implementing the  
      most 'expected' behavior: do not convert FLOAT(M,D)/DOUBLE(M,D)  
      to anything else if all SELECT statements in a UNION have the  
      same precision for that column.  
      modified:
        mysql-test/r/union.result
        mysql-test/t/union.test
        sql/field.cc
        sql/item.cc
        sql/mysql_priv.h

Pushed into 5.0.82 (revid:davi.arnaut@sun.com-20090505184158-dvmedh8n472y8np5) (version source revid:davi.arnaut@sun.com-20090505184158-dvmedh8n472y8np5) (merge vers: 5.0.82) (pib:6)

Pushed into 5.1.35 (revid:davi.arnaut@sun.com-20090505190206-9xmh7dlc6kom8exp) (version source revid:davi.arnaut@sun.com-20090505190206-9xmh7dlc6kom8exp) (merge vers: 5.1.35) (pib:6)

Pushed into 6.0.12-alpha (revid:svoj@sun.com-20090506125450-yokcmvqf2g7jhujq) (version source revid:horst@mysql.com-20090327184517-25eq077q2beocs6y) (merge vers: 6.0.11-alpha) (pib:6)

Noted in 5.0.82, 5.1.35, 6.0.12 changelogs.

UNION of floating-point numbers did unnecessary rounding.

Pushed into 5.1.35-ndb-6.3.26 (revid:jonas@mysql.com-20090615074202-0r5r2jmi83tww6sf) (version source revid:jonas@mysql.com-20090615070837-9pccutgc7repvb4d) (merge vers: 5.1.35-ndb-6.3.26) (pib:6)

Pushed into 5.1.35-ndb-7.0.7 (revid:jonas@mysql.com-20090615074335-9hcltksp5cu5fucn) (version source revid:jonas@mysql.com-20090615072714-rmfkvrbbipd9r32c) (merge vers: 5.1.35-ndb-7.0.7) (pib:6)

Pushed into 5.1.35-ndb-6.2.19 (revid:jonas@mysql.com-20090615061520-sq7ds4yw299ggugm) (version source revid:jonas@mysql.com-20090615054654-ebgpz7elwu1xj36j) (merge vers: 5.1.35-ndb-6.2.19) (pib:6)