Bug #36829 Decimal to double conversion is inaccurate
Submitted: 20 May 2008 21:04 Modified: 16 Apr 2010 17:27
Reporter: Mark Callaghan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: General Severity:S3 (Non-critical)
Version:5.0.37, 5.0.60, 5.1.41-bzr OS:Linux
Assigned to: Alexey Botchkov CPU Architecture:Any
Tags: conversion, decimal, double, regression

[20 May 2008 21:04] Mark Callaghan
Description:
This is a regression from MySQL 4.0 to MySQL 5.0. The problem occurs because 4.0 used strtod and atof, and 5.0 provides its own implementation via my_strtod. A value that strtod/atof are able to parse correctly on most platforms is  8.8279417262832873e-139

How to repeat:
main() {
    double d = 8.8279417262832873e-139;
    double d2 = atof("8.8279417262832873e-139");
    printf("%.16e\n", d);
    printf("%.16e\n", d2);
}

$ ./d
8.8279417262832873e-139
8.8279417262832873e-139

Output from instrumented version of mysqld that displays the input (string) and output (double) from StrToDouble for the conversion.
StrToDouble converts 8.8279417262832873e-139 to 8.8279417262832859e-139

SQL for which the error occurs:
create table d (d double);
insert into d values ("8.8279417262832873e-139");

I ran the repro case using gcc 3.4, Linux 2.6 and an x86_64 server.

Suggested fix:
Use strtod/atof from glibc on the platforms for which they work. System libraries from most of the Unix vendors and Linux are ieee754 compliant. I doubt that my_strtod is in MySQL 5.0.
[20 May 2008 21:16] Mark Callaghan
I added a SQL function, ieee754_to_string, that converts double -> decimal with 17 digits of precision. This is used with mysqldump to dump tables that have columns of type Double. When the data is reloaded, the decimal -> double conversion should produce the same value as prior to the dump because double -> decimal -> double is lossless for most values when the double -> decimal conversion generates 17 digits of precision. However, that assumption has been broken by the implementation of my_strtod and now my data on reload does not match the data prior to the dump.
[20 May 2008 21:24] Mark Callaghan
And while I am at it, why does my_atof pass in a bogus pointer to the end of the string when calling my_strtod. The interface for my_strtod is that the second arg is an IN/OUT param and specifies the end of the input on entry to my_strtod. my_atof makes up a bogus end pointer -- end + 64kb. I assume this is to save the  cost of calling strlen, but it now prevents any reimplementation of my_strtod from trusting the value on input. And my_atof better hope that the arithmetic doesn't overflow.

double my_atof(const char *nptr)
{
  int error;
  const char *end= nptr+65535;                  /* Should be enough */
  return (my_strtod(nptr, (char**) &end, &error));
}
[21 May 2008 3:40] Valeriy Kravchuk
Thank you for a problem report.
[22 May 2008 0:17] Mark Callaghan
my_strntod_8bit also has code that uses pointer + 64k to guess at the end of the string.
[22 May 2008 15:16] Davi Arnaut
The problem seems to be one of precision when converting the number to a string as the text protocol uses the "%.14g" format for conversion.
[22 May 2008 15:45] Davi Arnaut
Discussing Alexey Kopytov, for the case with tables, the problem seems to lie within Field_double::val_str, which uses the "%-*.*g" when converting the double to string. The problem is that by default it uses a DBL_DIG precision, but this has been fixed in 6.0 by using dtoa which "automagically" chooses a precision.

This is a duplicate of Bug#21497.
[22 May 2008 16:16] Davi Arnaut
Also related to WL#2934
[22 May 2008 17:03] Mark Callaghan
The code that you mention in Field_double::val_str has not changed from 4.0.26 to 5.0.37:

  uint to_length=max(field_length, DOUBLE_TO_STRING_CONVERSION_BUFFER_SIZE);
  val_buffer->alloc(to_length);
  char *to=(char*) val_buffer->ptr();

  if (dec >= NOT_FIXED_DEC)
  {
    sprintf(to,"%-*.*g",(int) field_length,DBL_DIG,nr);
    to=strcend(to,' ');
  }
[22 May 2008 17:31] Mark Callaghan
The comments above about StrToDouble refer to code in my branch that consolidates common code in field.cc that calls my_strntod.

I have changed my 5.0.37 branch so that my_strtod calls strtod() and I removed all code from my_strtod. With this change, the behavior is what I expect and matches MySQL 4.0 behavior.

The problem occurs when the String literal "8.2..." is converted to a double on insert, not when the double is converted to a string during the select.
[22 May 2008 17:49] Davi Arnaut
My investigation was done on 5.1. Out of curiosity, I took your example and compiled it using my my_atof/my_atof:

int main()
{
  const char str[]= "8.8279417262832873e-139";
  double d = 8.8279417262832873e-139;
  double d2 = my_atof(str);
  printf("%s\n", str);
  printf("%.16e\n", d);
  printf("%.16e\n", d2);
  return 0;
}

davi@skynet:~/mysql-5.0-bugteam/strings$ ./str
8.8279417262832873e-139
8.8279417262832873e-139
8.8279417262832873e-139

Do you get different results?
[22 May 2008 17:57] Davi Arnaut
Taking a more closely look on my example and mysql's strtod.. yikes, you are right. There is also a problem with strtod and it's subtle, will expand on this later.
[18 Aug 2008 13:38] Alexey Botchkov
Decided no to fix this in 5.0 as changes needed are too significant.
Besides that was fixed in 6.0.
[5 Nov 2009 9:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/89435

2940 Alexey Botchkov	2009-11-05
      Bug#36829      Decimal to double conversion is inaccurate
          it's the 'string to double' conversion that isn't accurate here.
          if we use 'long double' instead of the 'double' to accumulate the
          result, the precision isn't lost.
          The 'double to string' conversion still isn't fixed in this case, but
          that's how it was in 4.0
      
      per-file comments:
        strings/strtod.c
      Bug#36829      Decimal to double conversion is inaccurate
         long double used to accumulate the result
[17 Dec 2009 12:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/94734

2940 Alexey Botchkov	2009-12-16
      Bug#36829      Decimal to double conversion is inaccurate
          it's the 'string to double' conversion that isn't accurate here.
          if we use 'long double' instead of the 'double' to accumulate the
          result, the precision isn't lost.
          The 'double to string' conversion still isn't fixed in this case, but
          that's how it was in 4.0
      
      per-file comments:
        strings/strtod.c
      Bug#36829      Decimal to double conversion is inaccurate
         long double used to accumulate the result
[29 Jan 2010 12:24] Alexey Kopytov
'long double' is non-standard and platform-specific, we should not use it. Even on x86, 'long double' is a synonym for 'double' on Windows. 

If the patch for this bug had a test case it would immediately reveal that after the patch the bug is still present on Windows.

I think the proper fix is to backport the dtoa's version of strtod() and get rid of our own one. Replacing the code doing string-to-number conversion is much less intrusive and less dangerous than changing string-to-number conversion. IIRC, in the dtoa integration patch, almost all changes in the test suite were caused by the difference in number-to-string conversions, only a few test cases were affected by the different string-to-number conversion routine.
[29 Jan 2010 12:27] Alexey Kopytov
Correction: the sentence in the previous comment should be "Replacing the code doing string-to-number conversion is much less intrusive and
less dangerous than changing number-to-number conversion".
[25 Mar 2010 11:03] Alexey Botchkov
fixed by the backported patch for WL#2934
[16 Apr 2010 17:27] Paul DuBois
This is part of the inclusion of dtoa in 5.5.3. No separate changelog entry.