Bug #111669 | Simple Query Times Out | ||
---|---|---|---|
Submitted: | 6 Jul 2023 0:47 | Modified: | 12 Jul 2023 13:04 |
Reporter: | David Allen | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 8.0.33 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | query |
[6 Jul 2023 0:47]
David Allen
[6 Jul 2023 13:21]
MySQL Verification Team
Hi Mr. Allen, Thank you for your bug report. However, this is not a bug. First of all, both of your queries are identical. However, this is irrelevant. When you replace AND with OR, then second condition is not evaluated at all. That means that this query is your bottleneck, which is not a bug in MySQL. Bottleneck is in design of your schema and your queries. The problematic query is: SELECT DISTINCT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE MATCH(phw.work_title) AGAINST('(software)' IN boolean MODE) First of all, DISTINCT is a VERY , VERY expensive operation and you do not need it for the IN (....) nested queries. Second, we do not see why do you need to search in BOOLEAN mode, when you do not use the right syntax. Run just the above query and you will see that it is a bottleneck. Optimal design of schemas, tables, tuples and SQL statements is a separate science. Also, this is a forum for bug reports (with fully repeatable test cases) and your report is not of that kind. This is not a forum for the free support. Not a bug.
[6 Jul 2023 13:46]
David Allen
Your feedback is incorrect. Each query executed on its own takes only 50ms or less. The queries are not identical, but similar. Removing DISTINCT has no impact on the query, either as a subquery or as an independent execution. It's as if you didn't even read what I posted. I already said both queries run <50ms independently, but even if I remove DISTICT and BOOLEAN, the problem persists. I'm not trying to get free support. Something is very wrong with the MySQL engine as it tries to handle this. I can reproduce this 100% of the time, and it makes absolutely no sense. To be clear: -- This takes 20ms, 4800 records SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE MATCH(phw.work_summary) AGAINST('finance'); -- This takes 20ms, 24000 records SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE MATCH(phw.work_title) AGAINST('software'); -- This takes 700ms, returns 14000 records SELECT pm.ID FROM person_main pm WHERE pm.ID IN ( SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE phw.ID < 50000 ) AND pm.ID IN ( SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE phw.ID > 10000 ); -- This times out at over 60000ms, should return at max 4800 records SELECT pm.ID FROM person_main pm WHERE pm.ID IN ( SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE MATCH(phw.work_summary) AGAINST('finance') ) AND pm.ID IN ( SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE MATCH(phw.work_title) AGAINST('software') ) ; SQL EXPLAIN SIMPLE SIMPLE SIMPLE fulltext fulltext eq_ref Using where; Ft_hints: sorted; Start temporary Using where; Ft_hints: no_ranking Using index; End temporary As seen above, if I replace the MATCH AGAINST calls with others that return even more IDs that crossover for the AND, it executes fine, albeit a bit slow (700ms).
[6 Jul 2023 13:56]
MySQL Verification Team
HI Mr. Allen, You have advanced in your evaluation of the query. You have come to the conclusion that 4800 rows should be compared with 24000 rows. That makes approximately 115 millions of comparisons. You could simplify your complex query by eliminating nested queries.
[6 Jul 2023 14:21]
David Allen
What you do not understand is this example is taken from a complex front-end system that allows users of an enterprise-scale CRM to search for records based on a combination of values they can choose. This example is an extraction from that system. Ultimately, it should work just fine (albeit a bit slow). But it's not working - at all - it's timing out at over 60000ms. It's hanging. As such, there's something wrong with the engine. Users should be able to search the table based on these two matching values. The record count and details provided show we're dealing with VERY small numbers. I formally request another developer review this report. You have already made incorrect statements (showing you don't really understand what's going on), and your most recent comment adds nothing to the discussion.
[6 Jul 2023 14:28]
MySQL Verification Team
HI, We have provided you with sufficient info for you to optimise your queries. Also, you have other tools in MySQL to help you with optimisation, like ANALYZE variant of the DESCRIBE of the queries. Also, the timeouts are settable in MySQL. If you wish us to help you with the optimisation of your queries, please read carefully the following informations.
[6 Jul 2023 14:29]
MySQL Verification Team
We're sorry, but the bug system is not the appropriate forum for asking help on using MySQL products. Your problem is not the result of a bug. For details on getting support for MySQL products see http://www.mysql.com/support/ You can also check our forums (free) at http://forums.mysql.com/ Thank you for your interest in MySQL.
[6 Jul 2023 14:48]
David Allen
I officially request a different developer review this. Your response is flat-out wrong. This is easily proven with: -- This takes 24ms, returns 13183 records SELECT pm.ID FROM person_main pm WHERE pm.ID IN ( -- 49838 records SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE phw.ID < 50000 ) AND pm.ID IN ( -- 299679 records SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE phw.ID > 10000 );
[6 Jul 2023 17:19]
MySQL Verification Team
Hi, We feel a need to explain why this is not a bug. Let us take, for example, that query that takes over 60 seconds to execute with 8.0.33. If the IDENTICAL query on IDENTICAL data , with 8.0.25 , 8.0.20 or 8.0.10, would take 5 second to execute, then it would DEFINITELY be a bug. In all cases, this query should have been the only one executing on the MySQL server. Then, this forum would be the right place to proceed ..... Then, we would be obliged to find the cause of the slowdown. Then, this would be a serious candidate for bug processing. But, you have not provided evidence of a bug, since earlier releases of 8.0 have probably very similar performance. If there would be a significant drop in performance, then,, we would require that you supply us with all tables with all data, with analysing and profiling informations. Upon the receipt, we would try to reproduce the behaviour and if that would be successful, we would verify this bug report. It would then enter the queue for the performance bugs, which would get the priority based on the slowdown experienced. We do not even see in your report performances with earlier releases. We have not received any profiling info, nor the info from post-factum analyze of the query execution. To obtain the last info, you should have to increase the timeout that strikes at 60 seconds. That timeout is probably defined in your configuration. The mere fact that you have queries that run much faster, does not prove that this is a bug. In some cases, one query requires 100 comparisons, while some other 100 millions. That does not make it a bug. That only means that your schema or query is not optimally designed. The mere fact that you have designed a slow query, does not make it our bug. It could be a consequence of not fully normalised schema or bad query design. That is not a report for this forum. The appropriate forums for this report are listed in our previous comment.
[6 Jul 2023 17:23]
MySQL Verification Team
Hi, We would like to urge you to consider our commercial support, as described in http://www.mysql.com/support/. MySQL 8.0 has many new features that would make possible to rewrite the problematic query so that it runs faster.
[7 Jul 2023 5:50]
MySQL Verification Team
Hello David, Thank you for the bug report and valuable feedback. I discussed your issue with our lead developer, and in order to proceed further i.e., to investigate your issue we need explain plans with some data(logical dump of the tables) to reproduce the problem at our end. It would be good if you can also share the configuration from the instance if it is not on default settings. If the data you need to attach is more than 3MB, you should create a compressed archive of the data and a README file that describes the data with a filename that includes the bug number (recommended filename: mysql-bug-data-111669.zip) and upload one to sftp.oracle.co. More details are in the "Files" section of this bug page. Sincerely, Umesh
[7 Jul 2023 14:47]
David Allen
Thank you, Umesh. This issue is taking place on an established CRM database system with many records, but not "too many". Please give me a detailed list of the data you are looking for and how you would like me to acquire/collect it and I will do my best to ZIP it up and get it to you. I appreciate somebody finally taking this report seriously. As mentioned, I believe I've stumbled upon a real issue with the engine. BTW if I change the query to LIKE, it returns in 2000ms. I think this emphasizes the fact there really is something wrong. SELECT pm.ID FROM person_main pm WHERE pm.ID IN ( SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE phw.work_summary LIKE '%finance%' ) AND pm.ID IN ( SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE phw.work_title LIKE '%software%' ) ;
[7 Jul 2023 15:12]
MySQL Verification Team
Hello David, To begin with, we need (a) Logical dump of the tables involved in both the queries(mysqldump client utility performs logical backups, producing a set of SQL statements that can be executed to reproduce the original database object definitions and table data). If you already have a backup and it is enough to demonstrate the issue then please share it instead of extracting from the live/prod environment which may impact its usage. (b) Share the configuration file(my.cnf or my.ini) from the instance if it is not on default settings. (c) Lastly, EXPLAIN plan of the queries: E.g. EXPLAIN SELECT pm.ID FROM person_main pm WHERE pm.ID IN ( SELECT DISTINCT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE MATCH(phw.work_summary) AGAINST('(.net sql) +finance' IN boolean MODE) ) AND pm.ID IN ( SELECT DISTINCT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE MATCH(phw.work_title) AGAINST('(software)' IN boolean MODE) ); For more details on explain, please see https://dev.mysql.com/doc/refman/8.0/en/explain.html Once we have these details, we will try to reproduce at our end to see what is the issue. "Files" tab of this bug page explains you on how to upload requested data to our FTP server. This data would be strictly confidential and would be used for the purpose of verifying the issue and later destroyed. Thank you. regards, Umesh
[8 Jul 2023 13:36]
David Allen
I have uploaded test_db_bug_111669.zip. It has the minimized database with two tables. I was able to reproduce the issue with these two tables and the query. For clarity, you can first run these: SELECT COUNT(phw.person_main_ref_id) AS count_phw_summary_like FROM person_history_work phw WHERE phw.work_summary LIKE '%finance%'; SELECT COUNT(phw.person_main_ref_id) AS count_phw_summmary_match FROM person_history_work phw WHERE MATCH(phw.work_summary) AGAINST('finance'); SELECT COUNT(phw.person_main_ref_id) AS count_phw_title_like FROM person_history_work phw WHERE phw.work_title LIKE '%software%'; SELECT COUNT(phw.person_main_ref_id) AS count_phw_title_match FROM person_history_work phw WHERE MATCH(phw.work_title) AGAINST ('software'); And to reproduce the issue, you can run this: -- This times out at over 60000ms, should return at max 4800 records SELECT pm.ID FROM person_main pm WHERE pm.ID IN ( SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE MATCH(phw.work_summary) AGAINST('finance') ) AND pm.ID IN ( SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE MATCH(phw.work_title) AGAINST('software') ); And to reaffirm a problem, this compares nearly the exact same records as above in roughly 2000ms. -- This takes 2000ms and compares roughly the same record count as MATCH. SELECT pm.ID FROM person_main pm WHERE pm.ID IN ( SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE phw.work_summary LIKE ('%finance%') ) AND pm.ID IN ( SELECT phw.person_main_ref_id AS ID FROM person_history_work phw WHERE phw.work_title LIKE ('%software%') ); I am attaching a screenshot of the EXPLAIN.
[8 Jul 2023 13:36]
David Allen
SQL explain of query in question
Attachment: sql_explain.png (image/png, text), 19.28 KiB.
[8 Jul 2023 14:34]
MySQL Verification Team
Thank you David for the details. I'll get back to you next week if anything further needed. regards, Umesh
[8 Jul 2023 14:36]
David Allen
Thank you for taking the time to investigate this issue, Umesh. I appreciate it. Have a great weekend!
[10 Jul 2023 11:03]
MySQL Verification Team
Thank you, David Allen. I can confirm after discussing with lead developer(optimizer team) that this is indeed an issue. This report is fully verified now. Until this issue is fixed, developer has provided a workaround(i.e. set optimizer_switch="semijoin=off";) which we confirmed at our end. More details are attached in the test details which I'll join shortly. Thank you. Sincerely, Umesh
[10 Jul 2023 11:07]
MySQL Verification Team
Test results - 8.0.11+, 5.7.42
Attachment: 111669_8.0.11_8.0.33_5.7.42.log (application/octet-stream, text), 75.78 KiB.
[10 Jul 2023 11:10]
MySQL Verification Team
# Issue seen in 5.7.42, 8.0.11+
[12 Jul 2023 13:04]
David Allen
Hey Umesh, I want to extend my sincere thanks for investigating and confirming this issue. I hope you're able to speak with the managers of support and properly address the way the other individual(s) in the ticket mishandled things. If I hadn't pushed so hard, they would have just blown this issue off. I also confirmed the workaround you provided works. The query now returns in roughly 1000ms. Thanks again, and I look forward to seeing the fix for this issue in a future version. Have a great week!