| Bug #83323 | Optimizer chooses wrong plan when joining 2 tables | ||
|---|---|---|---|
| Submitted: | 10 Oct 2016 18:02 | Modified: | 24 Apr 2017 21:32 |
| Reporter: | Alexander Rubin | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | 5.7, 5.6, 8.0.1 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | Optimizer | ||
[10 Oct 2016 18:06]
Alexander Rubin
optimizer trace
Attachment: trace2.json (application/json, text), 24.50 KiB.
[12 Oct 2016 17:19]
MySQL Verification Team
Alexander,
Thank you for your bug report. I managed to repeat it:
./client/mysql sinbas -e "select count(*) from events"
+----------+
| count(*) |
+----------+
| 557090 |
+----------+
mysql> update events set event_date = now() - interval id second;
Query OK, 557090 rows affected (7 min 26.22 sec)
Rows matched: 557090 Changed: 557090 Warnings: 0
mysql>
mysql>
mysql>
mysql> update events set profile_id = 1;
Query OK, 557073 rows affected (7 min 46.32 sec)
Rows matched: 557090 Changed: 557073 Warnings: 0
mysql>
mysql>
mysql>
mysql> update events set profile_id = 2 where mod(id, 1000) = 1;
2016-10-12T17:05:41.405017Z 16419 [Warning] DEBUG: Full table scan on table events. Reason: 'quick == 0' Records2: 557397 Records3: 557397 Optimizer_flag: 523775 Query: update events set profile_id = 2 where mod(id, 1000) = 1
Query OK, 557 rows affected (44.08 sec)
Rows matched: 557 Changed: 557 Warnings: 0
mysql> update events set profile_id = 3 where mod(id, 100) = 1;
2016-10-12T17:08:09.129270Z 16419 [Warning] DEBUG: Full table scan on table events. Reason: 'quick == 0' Records2: 557397 Records3: 557397 Optimizer_flag: 523775 Query: update events set profile_id = 3 where mod(id, 100) = 1
Query OK, 5571 rows affected (42.44 sec)
Rows matched: 5571 Changed: 5571 Warnings: 0
mysql>
mysql> update events set profile_id = 3 where mod(id, 5000) = 1;
2016-10-12T17:09:20.325390Z 16419 [Warning] DEBUG: Full table scan on table events. Reason: 'quick == 0' Records2: 557397 Records3: 557397 Optimizer_flag: 523775 Query: update events set profile_id = 3 where mod(id, 5000) = 1
Query OK, 0 rows affected (34.19 sec)
Rows matched: 111 Changed: 0 Warnings: 0
mysql> explain select * from events e join profiles p on e.profile_id
-> = p.id where event_date between now() - interval 1 day and now() and
-> status = 'active' and profile_id in
-> (1,2,4,5,6,7,8,9,10,11,12,13,14,15,16) order by event_date desc limit
-> 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p
partitions: NULL
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: NULL
rows: 15
filtered: 10.00
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: e
partitions: NULL
type: ref
possible_keys: profile_id,event_date,comb
key: comb
key_len: 5
ref: sinbas.p.id
rows: 12
filtered: 20.10
Extra: Using where; Using index
2 rows in set, 1 warning (0.01 sec)
mysql> explain select STRAIGHT_JOIN * from events e join profiles p on
-> e.profile_id = p.id where event_date between now() - interval 1 day and
-> now() and status = 'active' and profile_id in
-> (1,2,4,5,6,7,8,9,10,11,12,13,14,15,16) order by event_date desc limit
-> 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: e
partitions: NULL
type: index
possible_keys: profile_id,event_date,comb
key: event_date
key_len: 6
ref: NULL
rows: 536
filtered: 9.97
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: p
partitions: NULL
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: sinbas.e.profile_id
rows: 1
filtered: 10.00
Extra: Using where
2 rows in set, 1 warning (0.00 sec)
mysql> select STRAIGHT_JOIN * from events e join profiles p on
-> e.profile_id = p.id where event_date between now() - interval 1 day and
-> now() and status = 'active' and profile_id in
-> (1,2,4,5,6,7,8,9,10,11,12,13,14,15,16) order by event_date desc limit
-> 10;
+----+---------------------+------------+----+-------+--------+
| id | event_date | profile_id | id | email | status |
+----+---------------------+------------+----+-------+--------+
| 2 | 2016-10-12 18:49:03 | 1 | 1 | xx@yy | active |
| 3 | 2016-10-12 18:49:02 | 1 | 1 | xx@yy | active |
| 4 | 2016-10-12 18:49:01 | 1 | 1 | xx@yy | active |
| 5 | 2016-10-12 18:49:00 | 1 | 1 | xx@yy | active |
| 6 | 2016-10-12 18:48:59 | 1 | 1 | xx@yy | active |
| 7 | 2016-10-12 18:48:58 | 1 | 1 | xx@yy | active |
| 8 | 2016-10-12 18:48:57 | 1 | 1 | xx@yy | active |
| 9 | 2016-10-12 18:48:56 | 1 | 1 | xx@yy | active |
| 10 | 2016-10-12 18:48:55 | 1 | 1 | xx@yy | active |
| 11 | 2016-10-12 18:48:54 | 1 | 1 | xx@yy | active |
+----+---------------------+------------+----+-------+--------+
10 rows in set (0.01 sec)
[12 Oct 2016 17:20]
MySQL Verification Team
Thank you for your report. Fully verified.
[13 Feb 2017 15:35]
Alexander Rubin
5.6 is also affected
[24 Apr 2017 21:32]
Alexander Rubin
8.0.1 is also affected
[13 Apr 2018 3:59]
Erlend Dahl
Bug#88786 Optimzer choose wrong index was marked as a duplicate.

Description: Optimizer chooses wrong plan when joining 2 tables: mysql> explain select * from events_new_distr e join profiles p on e.profile_id = p.id where event_date between '2016-01-01' and '2016-01-01' + interval 15 day and status = 'active' and profile_id in (1,2,4,5,6,7,8,9,10,11,12,13,14,15,16) order by event_date desc limit 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: p partitions: NULL type: range possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: NULL rows: 15 filtered: 10.00 Extra: Using where; Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: e partitions: NULL type: ref possible_keys: profile_id,event_date,comb key: comb key_len: 5 ref: optimizer_issue.p.id rows: 75696 filtered: 3.28 Extra: Using where; Using index 2 rows in set, 1 warning (0.00 sec) Response time: 40 seconds. The problem here is the optimizer does not choose the right table order and decided to use temp table. The right way is this: mysql> explain select STRAIGHT_JOIN * from events_new_distr e join profiles p on e.profile_id = p.id where event_date between '2016-01-01' and '2016-01-01' + interval 15 day and status = 'active' and profile_id in (1,2,4,5,6,7,8,9,10,11,12,13,14,15,16) order by event_date desc limit 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: e partitions: NULL type: range possible_keys: profile_id,event_date,comb key: event_date key_len: 6 ref: NULL rows: 1703196 filtered: 50.00 Extra: Using index condition; Using where *************************** 2. row *************************** id: 1 select_type: SIMPLE table: p partitions: NULL type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: optimizer_issue.e.profile_id rows: 1 filtered: 10.00 Extra: Using where 2 rows in set, 1 warning (0.01 sec) With ORDER BY + LIMIT 10 MySQL can: 1. Avoid creating temp table and scan the table in the order of the index 2. Abort the query execution when it will find 10 rows. (That is ORDER BY + LIMIT optimization, http://dev.mysql.com/doc/refman/5.7/en/limit-optimization.html) However, 75696*15 = 1135440 rows < 1703196 rows. What is also interesting is that the second plan is not even considered in optimizer trace, attached. How to repeat: Smaller test case: CREATE TABLE `profiles` ( `id` int(11) NOT NULL DEFAULT '0' AUTO_INCREMENT, `email` varchar(255) DEFAULT NULL, `status` varchar(100) DEFAULT NULL, PRIMARY KEY (`id`), KEY `email` (`email`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 mysql> insert into profiles(status) values ('active'); Query OK, 1 row affected (0.00 sec) ... mysql> insert into profiles (status) select status from profiles; Query OK, 16384 rows affected (0.20 sec) Records: 16384 Duplicates: 0 Warnings: 0 CREATE TABLE `events` ( `id` int(11) NOT NULL AUTO_INCREMENT, `event_date` datetime DEFAULT NULL, `profile_id` int(11) DEFAULT NULL, PRIMARY KEY (`id`), KEY `profile_id` (`profile_id`), KEY `event_date` (`event_date`), KEY `comb` (`profile_id`,`event_date`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 mysql> insert into events (event_date, profile_id) select now() - interval id second, id from profiles; Query OK, 32768 rows affected (0.93 sec) Records: 32768 Duplicates: 0 Warnings: 0 ... up to 425984 rows mysql> update events set event_date = now() - interval id second; Query OK, 425984 rows affected (15.88 sec) Rows matched: 425984 Changed: 425984 Warnings: 0 mysql> update events set profile_id = 1; Query OK, 425971 rows affected (15.41 sec) Rows matched: 425984 Changed: 425971 Warnings: 0 mysql> update events set profile_id = 2 where mod(id, 1000) = 1; Query OK, 426 rows affected (0.41 sec) Rows matched: 426 Changed: 426 Warnings: 0 mysql> update events set profile_id = 3 where mod(id, 100) = 1; Query OK, 4260 rows affected (0.55 sec) Rows matched: 4260 Changed: 4260 Warnings: 0 mysql> update events set profile_id = 3 where mod(id, 5000) = 1; Query OK, 0 rows affected (0.39 sec) Rows matched: 84 Changed: 0 Warnings: 0 mysql> explain select * from events e join profiles p on e.profile_id = p.id where event_date between now() - interval 1 day and now() and status = 'active' and profile_id in (1,2,4,5,6,7,8,9,10,11,12,13,14,15,16) order by event_date desc limit 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: p partitions: NULL type: range possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: NULL rows: 15 filtered: 10.00 Extra: Using where; Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: e partitions: NULL type: ref possible_keys: profile_id,event_date,comb key: profile_id key_len: 5 ref: test.p.id rows: 1 filtered: 28.74 Extra: Using where 2 rows in set, 1 warning (0.00 sec) mysql> select * from events e join profiles p on e.profile_id = p.id where event_date between now() - interval 1 day and now() and status = 'active' and profile_id in (1,2,4,5,6,7,8,9,10,11,12,13,14,15,16) order by event_date desc limit 10; +----+---------------------+------------+----+-------+--------+ | id | event_date | profile_id | id | email | status | +----+---------------------+------------+----+-------+--------+ | 2 | 2016-10-10 17:50:04 | 1 | 1 | NULL | active | | 3 | 2016-10-10 17:50:03 | 1 | 1 | NULL | active | | 4 | 2016-10-10 17:50:02 | 1 | 1 | NULL | active | | 5 | 2016-10-10 17:50:01 | 1 | 1 | NULL | active | | 6 | 2016-10-10 17:50:00 | 1 | 1 | NULL | active | | 7 | 2016-10-10 17:49:59 | 1 | 1 | NULL | active | | 8 | 2016-10-10 17:49:58 | 1 | 1 | NULL | active | | 9 | 2016-10-10 17:49:57 | 1 | 1 | NULL | active | | 10 | 2016-10-10 17:49:56 | 1 | 1 | NULL | active | | 11 | 2016-10-10 17:49:55 | 1 | 1 | NULL | active | +----+---------------------+------------+----+-------+--------+ 10 rows in set (0.84 sec) mysql> explain select STRAIGHT_JOIN * from events e join profiles p on e.profile_id = p.id where event_date between now() - interval 1 day and now() and status = 'active' and profile_id in (1,2,4,5,6,7,8,9,10,11,12,13,14,15,16) order by event_date desc limit 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: e partitions: NULL type: index possible_keys: profile_id,event_date,comb key: event_date key_len: 6 ref: NULL rows: 394 filtered: 14.30 Extra: Using where *************************** 2. row *************************** id: 1 select_type: SIMPLE table: p partitions: NULL type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: test.e.profile_id rows: 1 filtered: 10.00 Extra: Using where 2 rows in set, 1 warning (0.00 sec) mysql> select STRAIGHT_JOIN * from events e join profiles p on e.profile_id = p.id where event_date between now() - interval 1 day and now() and status = 'active' and profile_id in (1,2,4,5,6,7,8,9,10,11,12,13,14,15,16) order by event_date desc limit 10; +----+---------------------+------------+----+-------+--------+ | id | event_date | profile_id | id | email | status | +----+---------------------+------------+----+-------+--------+ | 2 | 2016-10-10 17:50:04 | 1 | 1 | NULL | active | | 3 | 2016-10-10 17:50:03 | 1 | 1 | NULL | active | | 4 | 2016-10-10 17:50:02 | 1 | 1 | NULL | active | | 5 | 2016-10-10 17:50:01 | 1 | 1 | NULL | active | | 6 | 2016-10-10 17:50:00 | 1 | 1 | NULL | active | | 7 | 2016-10-10 17:49:59 | 1 | 1 | NULL | active | | 8 | 2016-10-10 17:49:58 | 1 | 1 | NULL | active | | 9 | 2016-10-10 17:49:57 | 1 | 1 | NULL | active | | 10 | 2016-10-10 17:49:56 | 1 | 1 | NULL | active | | 11 | 2016-10-10 17:49:55 | 1 | 1 | NULL | active | +----+---------------------+------------+----+-------+--------+ 10 rows in set (0.00 sec) Suggested fix: Apply the ORDER BY + LIMIT optimization to the table joins