| Bug #94672 | Optimizer trace uses imprecise routines to print floating point numbers | ||
|---|---|---|---|
| Submitted: | 15 Mar 2019 12:05 | Modified: | 4 Nov 2020 11:43 |
| Reporter: | Alexey Kopytov | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | 5.6, 5.7, 8.0 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[15 Mar 2019 14:08]
MySQL Verification Team
Hi Kaamos, How are you ??? Thank you for your bug report. I fully agree with your analysis. Verified as reported. Have a nice day.
[3 Nov 2020 16:23]
Alexey Kopytov
This has been fix in 8.0 with the following commit: https://github.com/mysql/mysql-server/commit/ab4919602ffe5543b4ebbab4d79c008c023be1c6
[4 Nov 2020 6:06]
Kaiwang CHen
Why did you use FLT_DIG instead of DBL_DIG to print a double number?
--- a/sql/opt_trace.cc
+++ b/sql/opt_trace.cc
@@ -29,7 +29,6 @@
#include "my_pointer_arithmetic.h"
#include "my_sys.h"
#include "mysql/psi/psi_base.h"
-#include "mysql/service_my_snprintf.h"
#include "mysqld_error.h"
#include "prealloced_array.h"
#include "sql/current_thd.h"
@@ -353,7 +352,7 @@ Opt_trace_struct& Opt_trace_struct::do_add(const char *key, double val)
{
DBUG_ASSERT(started);
char buf[32]; // 32 is enough for digits of a double
- my_snprintf(buf, sizeof(buf), "%g", val);
+ my_gcvt(val, MY_GCVT_ARG_DOUBLE, FLT_DIG, buf, nullptr);
DBUG_PRINT("opt", ("%s: %s", key, buf));
stmt->add(key, buf, strlen(buf), false, false);
return *this;
@@ -391,7 +390,7 @@ Opt_trace_struct& Opt_trace_struct::do_add(const char *key, Item *item)
Opt_trace_struct& Opt_trace_struct::do_add(const char *key, const Cost_estimate &value)
{
char buf[32]; // 32 is enough for digits of a double
- my_snprintf(buf, sizeof(buf), "%g", value.total_cost());
+ my_gcvt(value.total_cost(), MY_GCVT_ARG_DOUBLE, FLT_DIG, buf, nullptr);
DBUG_PRINT("opt", ("%s: %s", key, buf));
stmt->add(key, buf, strlen(buf), false, false);
return *this;
@@ -663,7 +662,7 @@ void Opt_trace_stmt::next_line()
[4 Nov 2020 7:02]
Alexey Kopytov
I was merely suggesting to use my_gcvt() instead of my_snprintf(). But the commit that implemented that change looks wrong to me. Neither FLT_DIG nor DBL_DIG make any sense as a field width, simply because both constants define limits in terms of decimal digits, and the `width` argument to my_gcvt() defines the field width in characters, which may include sign, dot, and exponent. It's not the only wrong thing about the commit mentioned here and in bug#101430. I'll report them separately.
[4 Nov 2020 11:43]
Alexey Kopytov
Reported a follow-up bug#101457.
[4 Nov 2020 13:47]
MySQL Verification Team
Thanks a lot, Kaamos.
[12 Nov 2020 22:14]
Jon Stephens
See also BUG#101457.
[13 Nov 2020 12:53]
MySQL Verification Team
Thanks a lot, Jon ......

Description: When the optimizer trace needs to print a floating point number (e.g. display a plan cost), it uses the following code in sql/opt_trace.cc: --- Opt_trace_struct& Opt_trace_struct::do_add(const char *key, const Cost_estimate &value) { char buf[32]; // 32 is enough for digits of a double my_snprintf(buf, sizeof(buf), "%g", value.total_cost()); DBUG_PRINT("opt", ("%s: %s", key, buf)); stmt->add(key, buf, strlen(buf), false, false); return *this; } --- (there's a similar code in Opt_trace_struct::do_add(const char *key, double val)) Using sprintf("%g") to convert doubles to decimal representation is imprecise and may result in the same representation for different binary numbers. Here's a fragment of a trace demonstrating the problem. One could guess a covering index scan was chosen over a range scan due to a lower cost, but it is not obvious at all from just looking at the cost values (they are both "2.06e6"): --- "best_covering_index_scan": { "index": "k_1", "cost": 2.06e6, "chosen": true }, "setup_range_conditions": [ ], "group_index_range": { "chosen": false, "cause": "not_group_by_or_distinct" }, "analyzing_range_alternatives": { "range_scan_alternatives": [ { "index": "PRIMARY", "ranges": [ "1798955 < id" ], "index_dives_for_eq_ranges": true, "rowid_ordered": true, "using_mrr": false, "index_only": true, "rows": 8233926, "cost": 2.06e6, "chosen": false, "cause": "cost" } ], --- The server code has a better routine to perform such conversion, it is called my_gcvt() and is located in strings/dtoa.c. Here's how the same trace looks like after replacing my_snprintf() with my_cvt(). It is, in my opinion, more self-explaining than the previous one: --- "best_covering_index_scan": { "index": "k_1", "cost": 2058480.5263157894, "chosen": true }, "setup_range_conditions": [ ], "group_index_range": { "chosen": false, "cause": "not_group_by_or_distinct" }, "analyzing_range_alternatives": { "range_scan_alternatives": [ { "index": "PRIMARY", "ranges": [ "1798955 < id" ], "index_dives_for_eq_ranges": true, "rowid_ordered": true, "using_mrr": false, "index_only": true, "rows": 8233926, "cost": 2058482.5100000002, "chosen": false, "cause": "cost" } ], --- How to repeat: Code analysis. Suggested fix: Use my_gcvt() instead of my_snprintf() to convert floating point numbers to strings.