Bug #114235 | DETERMINISTIC FUNCTIONS can return incorrect results | ||
---|---|---|---|
Submitted: | 6 Mar 5:06 | Modified: | 17 Jun 21:47 |
Reporter: | Yoseph Phillips | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Stored Routines | Severity: | S1 (Critical) |
Version: | 8.0.22, 8.0.36 | OS: | Any (Windows 2012 R2 Datacenter) |
Assigned to: | CPU Architecture: | x86 | |
Tags: | DETERMINISTIC function, regression |
[6 Mar 5:06]
Yoseph Phillips
[6 Mar 11:27]
MySQL Verification Team
Hi Mr. Philips, Thank you for your bug report. However, we are not able to repeat what you are reporting to us. Hence, what we need from you is a fully repeatable test case. That means that we must have the tables involved and the values that you are passing to that function. We can then run a multi-threaded test case, with exempli gratia 10 concurrent connections and feed those various values in different threads and run it for couple of hours and see if we get a wrong result. Thanks in advance.
[6 Mar 22:19]
Yoseph Phillips
We have not been able to come up with a repeatable test case yet. Each time when the issue starts it might be the case that records are getting added to either or both of those tables in the query. Further investigation indicates that a regression might have been introduced into MySQL 8.0.32 and is still there in MySQL 8.0.36. We have different versions of MySQL installed on thousands of environments and we seem to have a few reports of this issue on 8.0.32 and yesterday confirmed that this issue is in 8.0.36 as well. It seems related to how this function has been written. We have lots of DETERMINISTIC FUNCTIONS and this is the only one where we have seen this issue. The function itself and tables it is using have not changed in many years. It appears that the connections with the issues are storing an execution plan for this DETERMINISTIC FUNCTION and somehow one of the MySQL optimisations has transformed the query into a non equivalent query or something like that. We think we need to take the approach of working out exactly how MySQL is executing this query on these connections and which optimisations it is using. We can then compare this to all of these other connections which are executing this correctly. On a connection that was having an issue it seems that when we created a 2nd function with a different name but the exact same query that the issue went away for the original function as well. It started happening later again yesterday.
[7 Mar 6:09]
Yoseph Phillips
Trace of Failing
Attachment: failing.txt (text/plain), 7.20 KiB.
[7 Mar 6:10]
Yoseph Phillips
Trace of Working
Attachment: working.txt (text/plain), 18.08 KiB.
[7 Mar 6:14]
Yoseph Phillips
We have ran the following on many connections: SET optimizer_trace = "enabled=on"; SET optimizer_trace_max_mem_size = 1024 * 1024 * 16; SELECT udf1(26); SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE; SET optimizer_trace = "enabled=off"; I have attached the results. Whenever the results are wrong the output of this is always the same. Whenever the results are correct the output of this is always the same. The output of the Working and the Failing are completely different than each other so this should give us a big clue as to what the bug is.
[7 Mar 9:58]
Yoseph Phillips
Changed back to Critical. That output shows that for the failing connections they have completely lost the conditions. There is no condition_processing section and it is showing "original_condition": null. The output from SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE; is showing the query correctly in the QUERY column which narrows down the bug. It is executing the query as though it thought the query was: SELECT t1.col3 FROM table1 t1 CROSS JOIN table2 t2 ORDER BY t1.col1 DESC LIMIT 1 which also explains why it takes forever and returns the same result irrespective of the passed in parameter. This looks like a very serious regression.
[7 Mar 11:14]
MySQL Verification Team
Hi Mr. Phillips, Thank you for your feedback. However, we must inform you again that this is forum dealing ONLY with reports with a fully repeatable test cases. You have not provided such a test case, so we can not proceed .... We have also examined our entire bugs database and we have not found, with our very extensive and sophisticated searching tools, any bug that would be vaguely similar to what you are reporting. Since this is a forum dealing ONLY with fully repeatable test cases, we can not proceed in diagnosing the bug that you are reporting. Once we have a fully repeatable test case, we shall be happy to try to repeat it and then we would continue with it's processing.
[7 Mar 18:28]
MySQL Verification Team
Hello Yoseph, Please see the file I've uploaded, bug114235.sql It seems to repeat the bug you described. Run the sql script in a single thread with nobody else accessing the tables and let me know if that is the same issue you see. The function ends up returning 1010 for whatever input you give it! If you run the query directly it's fine. Regards, -- Shane Bester, MySQL Senior Principal Technical Support Engineer Oracle Corporation http://dev.mysql.com/
[7 Mar 18:53]
MySQL Verification Team
Notice the flush tables in the testcase. Even after server restart, see this, even on 8.0.27: mysql> \r Connection id: 8 Current database: test mysql> select udf1(73449) ; +-------------+ | udf1(73449) | +-------------+ | NULL | +-------------+ 1 row in set (0.02 sec) mysql> select udf1(73449) ; +-------------+ | udf1(73449) | +-------------+ | NULL | +-------------+ 1 row in set (0.00 sec) mysql> select udf1(73449) ; +-------------+ | udf1(73449) | +-------------+ | NULL | +-------------+ 1 row in set (0.00 sec) mysql> select udf1(73449) ; +-------------+ | udf1(73449) | +-------------+ | NULL | +-------------+ 1 row in set (0.00 sec) mysql> flush tables; Query OK, 0 rows affected (0.00 sec) mysql> select udf1(73449) ; +-------------+ | udf1(73449) | +-------------+ | 1010 | <<<<<<< ?????? +-------------+ 1 row in set (0.28 sec)
[7 Mar 19:45]
MySQL Verification Team
Yoseph, When the issue occurs again, please check if this temporarily resolves it: "create function f() returns int deterministic return 1;drop function f;" -- Shane Bester, MySQL Senior Principal Technical Support Engineer Oracle Corporation http://dev.mysql.com/
[7 Mar 19:59]
MySQL Verification Team
deterministic testcase
Attachment: bug114235_new.sql (application/octet-stream, text), 2.44 KiB.
[7 Mar 20:01]
MySQL Verification Team
this regression appears to start in 8.0.22. 8.0.11: ok 8.0.21: ok 8.0.22: broken 8.0.23: broken 8.0.24: broken 8.0.25: broken 8.0.26: broken 8.0.27: broken 8.0.36: broken 9.0.0 trunk : broken
[7 Mar 23:03]
Yoseph Phillips
Thanks Shane. I also added: SET optimizer_trace = "enabled=on"; SET optimizer_trace_max_mem_size = 1024 * 1024 * 16; SELECT udf1(73449); SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE; SET optimizer_trace = "enabled=off"; to your test case after each of the failing and working calls to udf1(73449); They produce exactly the same failing and working traces which I uploaded, so yes you have managed to come up with a repeatable test case. The creating and dropping of that function does temporarily solve the issue, but it doesn't take too long for the issue to start again. This might also explain some other strange things that we saw with clients on 8.0.25. Now we need to understand under what circumstances the conditions are ending up as null here. I am going to look through the 8.0.22 release notes again.
[7 Mar 23:10]
MySQL Verification Team
My best guess is this is another regression caused by WL #9384. Will await the devs analysis.
[12 Mar 11:13]
MySQL Verification Team
* Confirmed that the regression was caused by WL#9384 (commit ca2cd90bffb201b93a29974aa04d46806c466d72)
[7 May 5:21]
Yoseph Phillips
This regression still exists in 8.0.37
[9 May 10:05]
MySQL Verification Team
Thank you, Mr. Phillips.
[17 Jun 21:47]
Jon Stephens
Documented fix as follows in the MySQL 8.0.38, 8.4.1, and 9.0.0 changelogs: It was possible for a deterministic stored function to return an incorrect result when the function used JOIN ON inside the return statement. If the query needed to be reprepared due to a table metadata change such as FLUSH TABLES between two executions, the ON clause was sometimes lost. Closed.
[18 Jun 10:11]
MySQL Verification Team
Thank you, Jon.