Bug #36829 Decimal to double conversion is inaccurate
Submitted: 20 May 2008 23:04 Modified: 18 Aug 2008 15:38
Reporter: Mark Callaghan
Status: Won't fix
Category:Server Severity:S3 (Non-critical)
Version:5.0.37, 5.0.60 OS:Linux
Assigned to: Alexey Botchkov Target Version:5.1+
Tags: conversion, double, decimal, regression
Triage: D2 (Serious) / R4 (High) / E3 (Medium)

[20 May 2008 23:04] Mark Callaghan
Description:
This is a regression from MySQL 4.0 to MySQL 5.0. The problem occurs because 4.0 used
strtod and atof, and 5.0 provides its own implementation via my_strtod. A value that
strtod/atof are able to parse correctly on most platforms is  8.8279417262832873e-139

How to repeat:
main() {
    double d = 8.8279417262832873e-139;
    double d2 = atof("8.8279417262832873e-139");
    printf("%.16e\n", d);
    printf("%.16e\n", d2);
}

$ ./d
8.8279417262832873e-139
8.8279417262832873e-139

Output from instrumented version of mysqld that displays the input (string) and output
(double) from StrToDouble for the conversion.
StrToDouble converts 8.8279417262832873e-139 to 8.8279417262832859e-139

SQL for which the error occurs:
create table d (d double);
insert into d values ("8.8279417262832873e-139");

I ran the repro case using gcc 3.4, Linux 2.6 and an x86_64 server.

Suggested fix:
Use strtod/atof from glibc on the platforms for which they work. System libraries from
most of the Unix vendors and Linux are ieee754 compliant. I doubt that my_strtod is in
MySQL 5.0.
[20 May 2008 23:16] Mark Callaghan
I added a SQL function, ieee754_to_string, that converts double -> decimal with 17 digits
of precision. This is used with mysqldump to dump tables that have columns of type Double.
When the data is reloaded, the decimal -> double conversion should produce the same value
as prior to the dump because double -> decimal -> double is lossless for most values when
the double -> decimal conversion generates 17 digits of precision. However, that
assumption has been broken by the implementation of my_strtod and now my data on reload
does not match the data prior to the dump.
[20 May 2008 23:24] Mark Callaghan
And while I am at it, why does my_atof pass in a bogus pointer to the end of the string
when calling my_strtod. The interface for my_strtod is that the second arg is an IN/OUT
param and specifies the end of the input on entry to my_strtod. my_atof makes up a bogus
end pointer -- end + 64kb. I assume this is to save the  cost of calling strlen, but it
now prevents any reimplementation of my_strtod from trusting the value on input. And
my_atof better hope that the arithmetic doesn't overflow.

double my_atof(const char *nptr)
{
  int error;
  const char *end= nptr+65535;                  /* Should be enough */
  return (my_strtod(nptr, (char**) &end, &error));
}
[21 May 2008 5:40] Valeriy Kravchuk
Thank you for a problem report.
[22 May 2008 2:17] Mark Callaghan
my_strntod_8bit also has code that uses pointer + 64k to guess at the end of the string.
[22 May 2008 17:16] Davi Arnaut
The problem seems to be one of precision when converting the number to a string as the
text protocol uses the "%.14g" format for conversion.
[22 May 2008 17:45] Davi Arnaut
Discussing Alexey Kopytov, for the case with tables, the problem seems to lie within
Field_double::val_str, which uses the "%-*.*g" when converting the double to string. The
problem is that by default it uses a DBL_DIG precision, but this has been fixed in 6.0 by
using dtoa which "automagically" chooses a precision.

This is a duplicate of Bug#21497.
[22 May 2008 18:16] Davi Arnaut
Also related to WL#2934
[22 May 2008 19:03] Mark Callaghan
The code that you mention in Field_double::val_str has not changed from 4.0.26 to 5.0.37:

  uint to_length=max(field_length, DOUBLE_TO_STRING_CONVERSION_BUFFER_SIZE);
  val_buffer->alloc(to_length);
  char *to=(char*) val_buffer->ptr();

  if (dec >= NOT_FIXED_DEC)
  {
    sprintf(to,"%-*.*g",(int) field_length,DBL_DIG,nr);
    to=strcend(to,' ');
  }
[22 May 2008 19:31] Mark Callaghan
The comments above about StrToDouble refer to code in my branch that consolidates common
code in field.cc that calls my_strntod.

I have changed my 5.0.37 branch so that my_strtod calls strtod() and I removed all code
from my_strtod. With this change, the behavior is what I expect and matches MySQL 4.0
behavior.

The problem occurs when the String literal "8.2..." is converted to a double on insert,
not when the double is converted to a string during the select.
[22 May 2008 19:49] Davi Arnaut
My investigation was done on 5.1. Out of curiosity, I took your example and compiled it
using my my_atof/my_atof:

int main()
{
  const char str[]= "8.8279417262832873e-139";
  double d = 8.8279417262832873e-139;
  double d2 = my_atof(str);
  printf("%s\n", str);
  printf("%.16e\n", d);
  printf("%.16e\n", d2);
  return 0;
}

davi@skynet:~/mysql-5.0-bugteam/strings$ ./str
8.8279417262832873e-139
8.8279417262832873e-139
8.8279417262832873e-139

Do you get different results?
[22 May 2008 19:57] Davi Arnaut
Taking a more closely look on my example and mysql's strtod.. yikes, you are right. There
is also a problem with strtod and it's subtle, will expand on this later.
[18 Aug 2008 15:38] Alexey Botchkov
Decided no to fix this in 5.0 as changes needed are too significant.
Besides that was fixed in 6.0.