Bug #87756 | JOINS and SUBQUERIES: huge performance difference between MySQL 5.5 and 5.6 | ||
---|---|---|---|
Submitted: | 14 Sep 2017 12:55 | Modified: | 2 Oct 2017 11:17 |
Reporter: | Francesco Ferigo | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S5 (Performance) |
Version: | 5.6 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[14 Sep 2017 12:55]
Francesco Ferigo
[14 Sep 2017 12:56]
Francesco Ferigo
CREATES database "test_for_joins" with 3000 rows for TABLE cars and 36000 rows for TABLE carsmeta
Attachment: test_for_joins.sql (application/octet-stream, text), 1.20 MiB.
[14 Sep 2017 14:01]
Øystein Grøvlen
IIUC, the performance difference is visible when running EXPLAIN, not only when executing the query. In other words, it is the optimization time that has increassed. This is probably because due to semijoin that was introduced in MySQL 5.6. The optimizer now need to find the best join order for all 11 tables, and not just the 6 tables of the main query. To reduce join optimization time, set optimizer_search_depth to 0. That may in some cases result in the optimizer deciding on a less optimal plan. However, in this case it seems that the resulting plan will be very similar to the optimal plan. Alternatively, one can turn set optimizer_switch='semijoin=off' to make the optimizer handle subqueries the same way as in 5.5.
[14 Sep 2017 14:29]
Francesco Ferigo
This other query is SLOW on both MySQL versions: SELECT DISTINCT c.ID, c.car_type, cm1.car_value AS owner, cm2.car_value AS max_speed, cm3.car_value AS weight, cm4.car_value AS production_year FROM cars AS c INNER JOIN carsmeta AS cm ON (cm.car_id = c.ID) INNER JOIN carsmeta AS cm1 ON (cm1.car_id = c.ID AND cm1.car_key = 'owner') INNER JOIN carsmeta AS cm2 ON (cm2.car_id = c.ID AND cm2.car_key = 'max_speed' AND cm2.car_value > 200) INNER JOIN carsmeta AS cm3 ON (cm3.car_id = c.ID AND cm3.car_key = 'weight' AND cm3.car_value > 3000) INNER JOIN carsmeta AS cm4 ON (cm4.car_id = c.ID AND cm4.car_key = 'production_year' AND cm4.car_value > 2000) INNER JOIN carsmeta AS cm5 ON (cm5.car_id = c.ID AND cm5.car_key = 'color' AND cm5.car_value = 'black') INNER JOIN carsmeta AS cm6 ON (cm6.car_id = c.ID AND cm6.car_key = 'engine_volume' AND cm6.car_value > 3000) INNER JOIN carsmeta AS cm7 ON (cm7.car_id = c.ID AND cm7.car_key = 'width' AND cm7.car_value >= 2) INNER JOIN carsmeta AS cm8 ON (cm8.car_id = c.ID AND cm8.car_key = 'height' AND cm8.car_value >= 1) INNER JOIN carsmeta AS cm9 ON (cm9.car_id = c.ID AND cm9.car_key = 'fuel_consumption' AND cm9.car_value > 14) WHERE c.car_type = 'cabriolet'; Isn't it supposed to be at least as fast as the one with SUBQUERIES?
[14 Sep 2017 21:51]
Francesco Ferigo
I can tell that setting semijoin to OFF doesn't have any impact on performance on MySQL 5.6 (I mean on the queries we are talking about) while setting optimizer_search_depth to 0 improves performance of the query. Also converting INNER JOIN to STRAIGHT_JOIN improves performance.
[15 Sep 2017 6:18]
Øystein Grøvlen
As I said, what takes time is to find the optimal join order. The rewritten query will have the same number of tables to join both in 5.5 and 5.6. Hence, the optimization time will be about the same in both versions. In 5.6, the semijoin feature will rewrite IN-subqueries to join, and the optimization time will be similar to the query without subqueries. That setting optimizer_search_depth improves performance confirms that the problem is long optimization time. I am surprised that turning off semijoin has no effect. That is not what I see when trying your test database, but I have only tested it with MySQL 5.7. While the original query with default settings also is slower on my machine with MySQL 5.7, it is only 4 times slower and finishes in around 0.35 seconds. Maybe you should consider to upgrade to 5.7?
[15 Sep 2017 13:45]
MySQL Verification Team
Hi Mr. Ferigo, Please read carefully what our developer has written to you. Please try to set both, optimizer_search_depth to 0 and to set optimizer_switch='semijoin=off'. Do use STRAIGHT_JOIN wherever you know the exact 1 -> N relationship. Optimisation can not work out-of-the-box on all queries, so the optimiser requires hints. All current optimisers are developed and refined in all RDBMS products.
[17 Sep 2017 22:54]
Francesco Ferigo
I did a few tests on MySQL 5.7. I can say that it behaves as 5.5 by setting semi join = off for the query with subqueries. I get the same result if I put optimizer_search_depth = 0; For the rewritten query with only JOINS I get faster result without changing any setting (1 second against 20 seconds of MySQL 5.6). Then semi join setting has no impact on the query (as you said), instead setting optimizer_search_depth to 0 improves performance (from 1 second to 0.35 seconds) Many thanks, I hope that my provider will allow me to update to 5.7.
[2 Oct 2017 11:17]
MySQL Verification Team
You are truly welcome.