Bug #99684 | incorrect determination of big test | ||
---|---|---|---|
Submitted: | 25 May 2020 12:19 | Modified: | 28 May 2020 13:22 |
Reporter: | andy zhang | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.6, 5.7, 8.0 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | big test, Optimizer |
[25 May 2020 12:19]
andy zhang
[27 May 2020 13:11]
MySQL Verification Team
Hi Mr. zhang, Thank you for your bug report. best_read is indeed a limit that has to be surpassed in order that a query is denoted as big SELECT. Hence, why do you think that best_rowcount should be used. This variable, best_rowcount, does not take into the account the width of the columns and rows that should be read. Hence, we would like to hear from you a full justification of why you consider your criterium a much better one. Thanks in advance.
[27 May 2020 15:34]
andy zhang
Sinisa, Here is the what reference guide says on the variable max_join_size and sql_big_selects: max_join_size: Do not permit statements that probably need to examine more than max_join_size rows (for single- table statements) or row combinations (for multiple-table statements) or that are likely to do more than max_join_size disk seeks. sql_big_selects: If set to OFF, MySQL aborts SELECT statements that are likely to take a very long time to execute (that is, statements for which the optimizer estimates that the number of examined rows exceeds the value of max_join_size). Thanks!
[28 May 2020 12:01]
MySQL Verification Team
HI Mr. zhang, Yes, the documentation seems to be wrong. If you agree, I can verify this as a documentation bug.
[28 May 2020 12:42]
andy zhang
That's ok to me.
[28 May 2020 13:22]
MySQL Verification Team
This is now verified as a documentation enhancement bug.
[7 Jun 2022 14:54]
Guilhem Bichot
There is still a problem. The code reads if (!(thd->variables.option_bits & OPTION_BIG_SELECTS) && best_read > (double)thd->variables.max_join_size &&. <===== here is the problem line !thd->lex->is_explain()) { /* purecov: inspected */ my_error(ER_TOO_BIG_SELECT, MYF(0)); error = -1; DBUG_RETURN(1); } so it compares max_join_size to JOIN::best_read, which is an estimate of cost (not of rows), made by the optimizer. Then the doc says: https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_max_join_size "Do not permit statements that probably need to examine more than max_join_size rows (for single-table statements) or row combinations (for multiple-table statements) or that are likely to do more than max_join_size disk seeks" The part about disk seeks might be consistent with the fact that best_read is a cost, but the part about "more than max_join_size rows" remains false: best_read is not a count of rows, and max_join_size is never compared with a number of rows in code. By the way, the error message only mentions rows, which is wrong: ER_TOO_BIG_SELECT 42000 eng "The SELECT would examine more than MAX_JOIN_SIZE rows; check your WHERE and use SET SQL_BIG_SELECTS=1 or SET MAX_JOIN_SIZE=# if the SELECT is okay" To give an example: I make a 5-table join, all tables have 6 rows. The cost is 935 (as seen in the optimizer trace). When I set max_join_size to 940, it passes; to 930 it fails. It proves again that max_join_size is compared to the cost. The number of rows is 6**5, roughly 7700 for this query, max_join_size is not compared to it.
[8 Jun 2022 12:02]
MySQL Verification Team
HI Guilhem, mon ami, Thank you very much for your comment. We agree with your analysis , which is why we change the category of this report to "Optimiser". Simply, comparison is made between the different types of values.