Bug #87756 JOINS and SUBQUERIES: huge performance difference between MySQL 5.5 and 5.6
Submitted: 14 Sep 2017 12:55 Modified: 2 Oct 2017 11:17
Reporter: Francesco Ferigo Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S5 (Performance)
Version:5.6 OS:Any
Assigned to: CPU Architecture:Any

[14 Sep 2017 12:55] Francesco Ferigo
Description:
The query is this: 

SELECT DISTINCT c.ID, c.car_type, cm1.car_value AS owner, cm2.car_value AS max_speed, cm3.car_value AS weight, cm4.car_value AS production_year

FROM cars AS c

INNER JOIN carsmeta AS cm ON (cm.car_id = c.ID)
INNER JOIN carsmeta AS cm1 ON (cm1.car_id = c.ID AND cm1.car_key = 'owner')
INNER JOIN carsmeta AS cm2 ON (cm2.car_id = c.ID AND cm2.car_key = 'max_speed' AND cm2.car_value > 200)
INNER JOIN carsmeta AS cm3 ON (cm3.car_id = c.ID AND cm3.car_key = 'weight' AND cm3.car_value > 3000)
INNER JOIN carsmeta AS cm4 ON (cm4.car_id = c.ID AND cm4.car_key = 'production_year' AND cm4.car_value > 2000)

WHERE

c.car_type = 'cabriolet'
AND c.ID IN (SELECT cm5.car_id FROM carsmeta AS cm5 WHERE cm5.car_key = 'color' AND cm5.car_value = 'black')
AND c.ID IN (SELECT cm6.car_id FROM carsmeta AS cm6 WHERE cm6.car_key = 'engine_volume' AND cm6.car_value > 3000)
AND c.ID IN (SELECT cm7.car_id FROM carsmeta AS cm7 WHERE cm7.car_key = 'width' AND cm7.car_value >= 2)
AND c.ID IN (SELECT cm8.car_id FROM carsmeta AS cm8 WHERE cm8.car_key = 'height' AND cm8.car_value >= 1)
AND c.ID IN (SELECT cm9.car_id FROM carsmeta AS cm9 WHERE cm9.car_key = 'fuel_consumption' AND cm9.car_value > 14)
;

There are some join and a few subqueries. 
These are the two EXPLAIN EXTENDED with MySQL 5.5 and 5.6,  as you can see the impact on performance is BIG, and the problem is that is very FAST on the older version.
The results are correct.

5.5.16 (JOINS AND SUBQUERIES) EXPLAIN EXTENDED
+----+--------------------+-------+----------------+------------------+---------+---------+---------------------------+------+----------+------------------------------------+
| id | select_type        | table | type           | possible_keys    | key     | key_len | ref                       | rows | filtered | Extra                              |
+----+--------------------+-------+----------------+------------------+---------+---------+---------------------------+------+----------+------------------------------------+
|  1 | PRIMARY            | cm1   | ref            | car_id,car_key   | car_key | 767     | const                     | 2999 |   100.00 | Using where; Using temporary       |
|  1 | PRIMARY            | c     | eq_ref         | PRIMARY,car_type | PRIMARY | 8       | test_for_joins.cm1.car_id |    1 |   100.00 | Using where                        |
|  1 | PRIMARY            | cm3   | ref            | car_id,car_key   | car_id  | 8       | test_for_joins.cm1.car_id |    6 |   100.00 | Using where                        |
|  1 | PRIMARY            | cm2   | ref            | car_id,car_key   | car_id  | 8       | test_for_joins.c.ID       |    6 |   100.00 | Using where                        |
|  1 | PRIMARY            | cm4   | ref            | car_id,car_key   | car_id  | 8       | test_for_joins.cm1.car_id |    6 |   100.00 | Using where                        |
|  1 | PRIMARY            | cm    | ref            | car_id           | car_id  | 8       | test_for_joins.cm4.car_id |    6 |   100.00 | Using where; Using index; Distinct |
|  6 | DEPENDENT SUBQUERY | cm9   | index_subquery | car_id,car_key   | car_id  | 8       | func                      |    6 |   100.00 | Using where                        |
|  5 | DEPENDENT SUBQUERY | cm8   | index_subquery | car_id,car_key   | car_id  | 8       | func                      |    6 |   100.00 | Using where                        |
|  4 | DEPENDENT SUBQUERY | cm7   | index_subquery | car_id,car_key   | car_id  | 8       | func                      |    6 |   100.00 | Using where                        |
|  3 | DEPENDENT SUBQUERY | cm6   | index_subquery | car_id,car_key   | car_id  | 8       | func                      |    6 |   100.00 | Using where                        |
|  2 | DEPENDENT SUBQUERY | cm5   | index_subquery | car_id,car_key   | car_id  | 8       | func                      |    6 |   100.00 | Using where                        |
+----+--------------------+-------+----------------+------------------+---------+---------+---------------------------+------+----------+------------------------------------+
11 rows in set, 1 warning (0.00 sec)

5.6.33 (JOINS AND SUBQUERIES) EXPLAIN EXTENDED
+----+--------------+-------------+--------+------------------+------------+---------+---------------------+------+----------+-----------------------------------------------------+
| id | select_type  | table       | type   | possible_keys    | key        | key_len | ref                 | rows | filtered | Extra                                               |
+----+--------------+-------------+--------+------------------+------------+---------+---------------------+------+----------+-----------------------------------------------------+
|  1 | SIMPLE       | c           | ref    | PRIMARY,car_type | car_type   | 766     | const               |  577 |   100.00 | Using index condition; Using where; Using temporary |
|  1 | SIMPLE       | <subquery2> | eq_ref | <auto_key>       | <auto_key> | 8       | test_for_joins.c.ID |    1 |   100.00 | NULL                                                |
|  1 | SIMPLE       | <subquery3> | eq_ref | <auto_key>       | <auto_key> | 8       | test_for_joins.c.ID |    1 |   100.00 | NULL                                                |
|  1 | SIMPLE       | <subquery4> | eq_ref | <auto_key>       | <auto_key> | 8       | test_for_joins.c.ID |    1 |   100.00 | NULL                                                |
|  1 | SIMPLE       | <subquery5> | eq_ref | <auto_key>       | <auto_key> | 8       | test_for_joins.c.ID |    1 |   100.00 | NULL                                                |
|  1 | SIMPLE       | <subquery6> | eq_ref | <auto_key>       | <auto_key> | 8       | test_for_joins.c.ID |    1 |   100.00 | NULL                                                |
|  1 | SIMPLE       | cm1         | ref    | car_id,car_key   | car_id     | 8       | test_for_joins.c.ID |    5 |   100.00 | Using where                                         |
|  1 | SIMPLE       | cm2         | ref    | car_id,car_key   | car_id     | 8       | test_for_joins.c.ID |    5 |   100.00 | Using where                                         |
|  1 | SIMPLE       | cm3         | ref    | car_id,car_key   | car_id     | 8       | test_for_joins.c.ID |    5 |   100.00 | Using where                                         |
|  1 | SIMPLE       | cm4         | ref    | car_id,car_key   | car_id     | 8       | test_for_joins.c.ID |    5 |   100.00 | Using where                                         |
|  1 | SIMPLE       | cm          | ref    | car_id           | car_id     | 8       | test_for_joins.c.ID |    5 |   100.00 | Using index; Distinct                               |
|  2 | MATERIALIZED | cm5         | ref    | car_id,car_key   | car_key    | 767     | const               | 2999 |   100.00 | Using where; Distinct                               |
|  3 | MATERIALIZED | cm6         | ref    | car_id,car_key   | car_key    | 767     | const               | 2999 |   100.00 | Using where; Distinct                               |
|  4 | MATERIALIZED | cm7         | ref    | car_id,car_key   | car_key    | 767     | const               | 2999 |   100.00 | Using where; Distinct                               |
|  5 | MATERIALIZED | cm8         | ref    | car_id,car_key   | car_key    | 767     | const               | 2999 |   100.00 | Using where; Distinct                               |
|  6 | MATERIALIZED | cm9         | ref    | car_id,car_key   | car_key    | 767     | const               | 2999 |   100.00 | Using where; Distinct                               |
+----+--------------+-------------+--------+------------------+------------+---------+---------------------+------+----------+-----------------------------------------------------+
16 rows in set, 1 warning (19.95 sec)

However, for completness, I tried to rewrite the query with JOINS only, and the performance is consistent among MySQL versions, unfortunately slow, about 20 seconds on the test database you find attached.

How to repeat:
Use the .sql attached and run the query.
Cars TABLE is 3000 rows, CARSMETA is 36000.
The query brings 3 rows as result.
[14 Sep 2017 12:56] Francesco Ferigo
CREATES database "test_for_joins" with 3000 rows for TABLE cars and 36000 rows for TABLE carsmeta

Attachment: test_for_joins.sql (application/octet-stream, text), 1.20 MiB.

[14 Sep 2017 14:01] Øystein Grøvlen
IIUC, the performance difference is visible when running EXPLAIN, not only when executing the query.  In other words, it is the optimization time that has increassed.  This is probably because due to semijoin that was introduced in MySQL 5.6.  The optimizer now need to find the best join order for all 11 tables, and not just the 6 tables of the main query.  

To reduce join optimization time, set optimizer_search_depth to 0.  That may in some cases result in the optimizer deciding on a less optimal plan.  However, in this case it seems that the resulting plan will be very similar to the optimal plan.

Alternatively, one can turn set optimizer_switch='semijoin=off' to make the optimizer handle subqueries the same way as in 5.5.
[14 Sep 2017 14:29] Francesco Ferigo
This other query is SLOW on both MySQL versions:

SELECT DISTINCT c.ID, c.car_type, cm1.car_value AS owner, cm2.car_value AS max_speed, cm3.car_value AS weight, cm4.car_value AS production_year

FROM cars AS c

INNER JOIN carsmeta AS cm ON (cm.car_id = c.ID)
INNER JOIN carsmeta AS cm1 ON (cm1.car_id = c.ID AND cm1.car_key = 'owner')
INNER JOIN carsmeta AS cm2 ON (cm2.car_id = c.ID AND cm2.car_key = 'max_speed' AND cm2.car_value > 200)
INNER JOIN carsmeta AS cm3 ON (cm3.car_id = c.ID AND cm3.car_key = 'weight' AND cm3.car_value > 3000)
INNER JOIN carsmeta AS cm4 ON (cm4.car_id = c.ID AND cm4.car_key = 'production_year' AND cm4.car_value > 2000)

INNER JOIN carsmeta AS cm5 ON (cm5.car_id = c.ID AND cm5.car_key = 'color' AND cm5.car_value = 'black')
INNER JOIN carsmeta AS cm6 ON (cm6.car_id = c.ID AND cm6.car_key = 'engine_volume' AND cm6.car_value > 3000)
INNER JOIN carsmeta AS cm7 ON (cm7.car_id = c.ID AND cm7.car_key = 'width' AND cm7.car_value >= 2)
INNER JOIN carsmeta AS cm8 ON (cm8.car_id = c.ID AND cm8.car_key = 'height' AND cm8.car_value >= 1)
INNER JOIN carsmeta AS cm9 ON (cm9.car_id = c.ID AND cm9.car_key = 'fuel_consumption' AND cm9.car_value > 14)

WHERE c.car_type = 'cabriolet';

Isn't it supposed to be at least as fast as the one with SUBQUERIES?
[14 Sep 2017 21:51] Francesco Ferigo
I can tell that setting semijoin to OFF doesn't have any impact on performance on MySQL 5.6 (I mean on the queries we are talking about) while setting optimizer_search_depth to 0 improves performance of the query.
Also converting INNER JOIN to STRAIGHT_JOIN improves performance.
[15 Sep 2017 6:18] Øystein Grøvlen
As I said, what takes time is to find the optimal join order. The rewritten query will have the same number of tables to join both in 5.5 and 5.6.  Hence, the optimization time will be about the same in both versions.  In 5.6, the semijoin feature will rewrite IN-subqueries to join, and the optimization time will be similar to the query without subqueries.

That setting optimizer_search_depth improves performance confirms that the problem is long optimization time.  I am surprised that turning off semijoin has no effect.  That is not what I see when trying your test database, but I have only tested it with MySQL 5.7.  While the original query with default settings also is slower on my machine with MySQL 5.7, it is only 4 times slower and finishes in around 0.35 seconds.  Maybe you should consider to upgrade to 5.7?
[15 Sep 2017 13:45] MySQL Verification Team
Hi Mr. Ferigo,

Please read carefully what our developer has written to you. Please try to set both, optimizer_search_depth to 0 and to set optimizer_switch='semijoin=off'. 

Do use STRAIGHT_JOIN wherever you know the exact 1 -> N relationship.

Optimisation can not work out-of-the-box on all queries, so the optimiser requires hints. All current optimisers are developed and refined in all RDBMS products.
[17 Sep 2017 22:54] Francesco Ferigo
I did a few tests on MySQL 5.7.

I can say that it behaves as 5.5 by setting semi join = off for the query with subqueries.
I get the same result if I put optimizer_search_depth = 0;

For the rewritten query with only JOINS I get faster result without changing any setting (1 second against 20 seconds of MySQL 5.6).
Then semi join setting has no impact on the query (as you  said), instead setting  optimizer_search_depth to 0  improves performance (from 1 second to 0.35 seconds)

Many thanks, I hope that my provider will allow me to update to 5.7.
[2 Oct 2017 11:17] MySQL Verification Team
You are truly welcome.