Bug #112904 | Precision lost with decimals in JSON text that is IEEE 754 64-bit float | ||
---|---|---|---|
Submitted: | 31 Oct 2023 17:07 | Modified: | 2 Nov 2023 11:58 |
Reporter: | Kaiwang CHen (OCA) | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: JSON | Severity: | S3 (Non-critical) |
Version: | 5.7, 8.0 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[31 Oct 2023 17:07]
Kaiwang CHen
[1 Nov 2023 12:53]
MySQL Verification Team
Hi Mr. Chen, Thank you for your bug report. It looks very interesting, but there is something you should try. Have you tried typecasting the floating point constants and the expression itself as DOUBLE. You could have also used scientific notation. Also, do note that some of your constants have number of digits that are larger then 15. That is because even DOUBLE can not guarantee the sixteenth digits. Let us know how did it go with typecasting and decreasing significant digits to 15.
[2 Nov 2023 3:14]
Kaiwang CHen
Here are some examples: -- CAST(str AS JSON) -- A number in literal-string json is handled by ParseNumber() in the RapidJSON SAX parser -- Here is released code behavior -- Converted to json by Json_dom::parse(text,...) with kParseDefaultFlags -- got: {"x": 1.4334336} -- (gdb) 0100001101001001011101101000000001010110101011111000000000000000 -- (gdb) x /1gt &d -- 0x7fffac6f52a0: 0011111111110110111011110101100000010010000011111100110011111101 set json_full_precision = off; select cast('{\"x\":1.4334335999999999}' as json); -- Here is full precision in discussion -- Converted to json by Json_dom::parse(text,...) with kParseFullPrecisionFlag -- got: {"x": 1.4334335999999999} -- (gdb) x /1gt &d -- 0x7fffac6f5298: 0011111111110110111011110101100000010010000011111100110011111100 set json_full_precision = on; select cast('{\"x\":1.4334335999999999}' as json); -- JSON_SET(doc, path, value) -- A literal number is parsed by MySQL Lexer lex_one_token() and converted to target type then to json value -- The literal value is parsed as decimal (Item_decimal), converted by str2my_decimal() -- Converted to json by create_scalar<Json_decimal>(..., my_decimal) -- got: {"x": 1.4334335999999999} -- (gdb) p *r -- $5 = {<decimal_t> = {intg = 1, frac = 16, len = 9, sign = false, buf = 0x7fff3401c8ac}, foo1 = 123, buffer = {1, 433433599, 999999900, -- -1886417009, -1886417009, -1886417009, -1886417009, -1886417009, -1886417009}, foo2 = 123, static test_value = 123} select json_set('{}', "$.x", 1.4334335999999999); -- The literal value is parsed as double (Item_float), converted by my_strtod() -- Converted to json by create_scalar<Json_double>(..., double d) -- got: {"x": 1.4334335999999999} -- (gdb) x /1gt &d -- 0x7fffac6f54d8: 0011111111110110111011110101100000010010000011111100110011111100 select json_set('{}', "$.x", 1.4334335999999999e0); -- The value is a double converted from literal decimal, by my_decimal2double() -- Converted to json by create_scalar<Json_double>(..., double d) -- got: {"x": 1.4334335999999999} -- (gdb) x /1gt &d -- 0x7fffac6f54d8: 0011111111110110111011110101100000010010000011111100110011111100 select json_set('{}', "$.x", cast(1.4334335999999999 as double));
[2 Nov 2023 5:07]
Kaiwang CHen
Here are more examples: -- got: {"x": -0.00000000000005684341886080803} select json_set('{}', "$.x", -0.00000000000005684341886080803) -- got: {"x": -0.00000000000005684341886080803} select json_set('{}', "$.x", -0.00000000000005684341886080803e0); -- got: {"x": -0.00000000000005684341886080803} select json_set('{}', "$.x", -cast(0.00000000000005684341886080803 as double)); -- got: {"x": -0.00000000000005684341886080803} select json_set('{}', "$.x", cast(-0.00000000000005684341886080803 as double)); -- default precision (last digit becomes 4) -- got: {"x": -0.00000000000005684341886080804} select cast('{"x": -0.00000000000005684341886080803}' as json); -- full precision -- got: {"x": -0.00000000000005684341886080803} select cast('{"x": -0.00000000000005684341886080803}' as json); The number literals and conversions: (A) Literal -0.00000000000005684341886080803 It is parsed as Item_func_neg wrapping Item_decimal. The decimal is obtained by str2my_decimal(). (gdb) p *decimal_value $62 = {<decimal_t> = {intg = 1, frac = 29, len = 9, sign = false, buf = 0x7fff3401c8e4}, foo1 = 123, buffer = {0, 0, 56843, 418860808, 30000000, -1886417009, -1886417009, -1886417009, -1886417009}, foo2 = 123, static test_value = 123} When the whole expr is casted as double by my_decimal2double(), it is firstly serialized by decimal2string() to “-0.00000000000005684341886080803” then parsed by my_strtod(). (gdb) p *d $64 = {<decimal_t> = {intg = 1, frac = 29, len = 9, sign = true, buf = 0x7fffac6f535c}, foo1 = 123, buffer = {0, 0, 56843, 418860808, 30000000, -1886417009, -1886417009, -1886417009, -1886417009}, foo2 = 123, static test_value = 123} (gdb) p res $68 = -5.6843418860808027e-14 (gdb) x /1gt &res 0x7fffac6f5158: 1011110100110000000000000000000000000000000000000000000000000001 (B) Literal -0.00000000000005684341886080803e0 It is parsed as Item_float. The double value is obtained by my_strtod(). (gdb) p res $69 = 5.6843418860808027e-14 (gdb) x /1gt &res 0x7fffac6f3ab8: 0011110100110000000000000000000000000000000000000000000000000001 The JSON conversions: (A) Default precision, internal:: StrtodNormalPrecision (), precision lost. (gdb) p i64 $4 = 5684341886080803 (gdb) p d $5 = 5684341886080803 (gdb) x /1gt &d 0x7fffac6f52a0: 0100001100110100001100011110000011111010111001101101011100100011 (gdb) n 1720 if (d > (std::numeric_limits<double>::max)()) { (gdb) p d $6 = 5.684341886080804e-14 (gdb) x /1gt &d 0x7fffac6f52a0: 0011110100110000000000000000000000000000000000000000000000000010 (B) Full precision, internal::StrtodFullPrecision() (gdb) p d $7 = 5684341886080803 (gdb) p i64 $8 = 5684341886080803 (gdb) p d $9 = 5684341886080803 (gdb) x /1gt &d 0x7fffac6f5298: 0100001100110100001100011110000011111010111001101101011100100011 (gdb) n 1720 if (d > (std::numeric_limits<double>::max)()) { (gdb) p d $10 = 5.6843418860808027e-14 (gdb) x /1gt &d 0x7fffac6f5298: 0011110100110000000000000000000000000000000000000000000000000001 Output conversions: my_gcvt() with arg MY_GCVT_ARG_DOUBLE. (A) MySQL double (gdb) p x $12 = -5.6843418860808027e-14 (gdb) x /1gt &x 0x7fffac6f4a48: 1011110100110000000000000000000000000000000000000000000000000001 (gdb) p buffer $14 = 0x7fffac6f5960 "-0.", '0' <repeats 13 times>, "5684341886080803" (B) JSON double, obtained by normal precision (gdb) p x $24 = -5.684341886080804e-14 (gdb) x /1gt &x 0x7fffac6f4788: 1011110100110000000000000000000000000000000000000000000000000010 (gdb) p buffer.m_ptr $26 = 0x7fff34010340 "{\"x\": -0.", '0' <repeats 13 times>, "5684341886080804" (C) JSON double, obtained by full precision (gdb) p x $21 = -5.6843418860808027e-14 (gdb) x /1gt &x 0x7fffac6f4788: 1011110100110000000000000000000000000000000000000000000000000001 (gdb) p buffer.m_ptr $23 = 0x7fff3400ee70 "{\"x\": -0.", '0' <repeats 13 times>, "5684341886080803"
[2 Nov 2023 5:11]
Kaiwang CHen
In short, MySQL double and JSON double are parsed in their own ways, and printed in the same way. The JSON library with default flag works for most cases, but in certain cases it loses precision. To ensure consistency strictly, MySQL should implement RawNumber() by its own string to number functions rather than taking values from integers and double callbacks. However, it incurs more overhead, and the full precision flag works fine by observation. BTW, where is 15 significant digits defined? Is it cemment "double supports only 15 digits" in the code? Then is it possible to print only significant bits?
[2 Nov 2023 11:58]
MySQL Verification Team
Hi Mr. Chen, Sorry, but your report is not a bug. Simply, IEEE definition of double-precision floating point is quite clear. It is 16 digits that are supported, but 16th digit can be sometimes rounded. Anyway, the conversions are not done in our code, but in the FPU part of the CPU. MySQL does not do the conversion by itself. Code is generated that uses FPU's commands. Not a bug.