Bug #19649 | Extremely slow join | ||
---|---|---|---|
Submitted: | 9 May 2006 17:57 | Modified: | 10 Oct 2006 18:58 |
Reporter: | Andre Timmer | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
Version: | 5.0.25-BK, 5.0.18 | OS: | Solaris (Solaris) |
Assigned to: | Sergey Petrunya | CPU Architecture: | Any |
[9 May 2006 17:57]
Andre Timmer
[9 May 2006 18:27]
Andre Timmer
Want to upload the data, it's too big (30MB zipped). Unless you guys know about this problem / behaviour you need this many data to reproduce.
[10 May 2006 9:22]
Hartmut Holzgraefe
Could you add the EXPLAIN output for the queries to this bug report? And if you want to provide your test data you can upload it to ftp://ftp.mysql.com/pub/mysql/upload Please use a file name that starts with bug19649_ if you do so and drop a note on this bug report giving the exact file name you uploaded.
[10 May 2006 10:44]
Andre Timmer
mysql> explain -> select count(*) -> from rawdata_organisaties_marktselect aa -> , sleutel bb -> where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk -> and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk -> and aa.lrrorg != bb.lrrorg; +----+-------------+-------+------+-------------------------------------+-----------------+---------+------------------------------------------------------------------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+-------------------------------------+-----------------+---------+------------------------------------------------------------------+--------+-------------+ | 1 | SIMPLE | aa | ALL | rawdata_organisaties_marktselect_i3 | NULL | NULL | NULL | 426601 | | | 1 | SIMPLE | bb | ref | ref_sleutel_uk1 | ref_sleutel_uk1 | 8 | bbr.aa.bkwi_lrrkvk_inschrijfnr_kvk,bbr.aa.bkwi_lrrkvk_volgnr_kvk | 1 | Using where | +----+-------------+-------+------+-------------------------------------+-----------------+---------+------------------------------------------------------------------+--------+-------------+ Uploaded file "bug19649_ if_slow-join-data.sql.tgz". Data may only be used internally!
[21 May 2006 11:31]
Valeriy Kravchuk
Please, send the EXPLAIN results for the fast query: EXPLAIN select count(*) from rawdata_organisaties_marktselect aa where exists ( select '' from sleutel bb where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk and aa.lrrorg != bb.lrrorg )\G
[22 May 2006 16:17]
Andre Timmer
*************************** 1. row *************************** id: 1 select_type: PRIMARY table: aa type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 312218 Extra: Using where *************************** 2. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: bb type: ref possible_keys: ref_sleutel_uk1 key: ref_sleutel_uk1 key_len: 8 ref: bbr.aa.bkwi_lrrkvk_inschrijfnr_kvk,bbr.aa.bkwi_lrrkvk_volgnr_kvk rows: 1 Extra: Using where
[22 May 2006 16:21]
Andre Timmer
This explain is on the exact same database: *************************** 1. row *************************** id: 1 select_type: PRIMARY table: aa type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 426601 Extra: Using where *************************** 2. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: bb type: ref possible_keys: ref_sleutel_uk1 key: ref_sleutel_uk1 key_len: 8 ref: bbr.aa.bkwi_lrrkvk_inschrijfnr_kvk,bbr.aa.bkwi_lrrkvk_volgnr_kvk rows: 1 Extra: Using where
[13 Jun 2006 16:00]
Valeriy Kravchuk
Please, check with a newer version of MySQL server, 5.0.22. If original query is still slow, please, restart server with slow query log (http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html), run it again and wait until it will be exacuted completely. Then send that slow query log content. Alrenatively, can you, please, send/upload your data that demonstrates this strange behaviour.
[13 Jun 2006 16:20]
Andre Timmer
I would love to but we just upgraded from MySQL 4 and will not do a upgrade this year. If this call turns out to be a bug / performance issue MySQL users working with serious datasets will benefit. Creating and uploading the dataset was already a lot of work. I did it knowing it's hard to reproduce without it and this issue is really important. So it's your turn now please don't let US down :-)
[14 Jun 2006 9:30]
Valeriy Kravchuk
So, where can I find that data set? I see no files attached to the report. Have you uploaded it to our ftp server? Can I download it from somewhere?
[14 Jun 2006 9:58]
Andre Timmer
Uploaded file "bug19649_ if_slow-join-data.sql.tgz" on may 10. Can you find it?
[5 Jul 2006 19:31]
Valeriy Kravchuk
I had downloaded your test case successfully. I hope, I'll be able to check this bug on latest 5.0.24-BK soon.
[17 Jul 2006 15:58]
Valeriy Kravchuk
Verified just as described, on your test case uploaded, with 5.0.25-BK on Linux: openxs@suse:~/dbs/5.0> bin/mysql -uroot bug19649 < /tmp/slow-join-data.sql openxs@suse:~/dbs/5.0> bin/mysql -uroot bug19649 Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 3 to server version: 5.0.25 Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> select count(*) -> from rawdata_organisaties_marktselect aa -> where exists ( -> select '' -> from sleutel bb -> where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kv k -> and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk -> and aa.lrrorg != bb.lrrorg -> ); +----------+ | count(*) | +----------+ | 3 | +----------+ 1 row in set (23.12 sec) mysql> explain select count(*) -> from rawdata_organisaties_marktselect aa -> where exists ( -> select '' -> from sleutel bb -> where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kv k -> and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk -> and aa.lrrorg != bb.lrrorg -> ); +----+--------------------+-------+------+-----------------+-----------------+---------+------------------------------------------------------------------------ ----+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-------+------+-----------------+-----------------+---------+------------------------------------------------------------------------ ----+--------+-------------+ | 1 | PRIMARY | aa | ALL | NULL | NULL | NULL | NULL | 426601 | Using where | | 2 | DEPENDENT SUBQUERY | bb | ref | ref_sleutel_uk1 | ref_sleutel_uk1 | 8 | bug19649.aa.bkwi_lrrkvk_inschrijfnr_kvk,bug19649.aa.bkwi_lrrkvk_volgnr_kvk | 1 | Using where | +----+--------------------+-------+------+-----------------+-----------------+---------+------------------------------------------------------------------------ ----+--------+-------------+ 2 rows in set (0.01 sec) mysql> explain select count(*) -> from rawdata_organisaties_marktselect aa -> , sleutel bb -> where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk -> and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk -> and aa.lrrorg != bb.lrrorg; +----+-------------+-------+------+-------------------------------------+-----------------+---------+----------------------------------------------------------- -----------------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+-------------------------------------+-----------------+---------+----------------------------------------------------------- -----------------+--------+-------------+ | 1 | SIMPLE | aa | ALL | rawdata_organisaties_marktselect_i3 | NULL | NULL | NULL | 426601 | | | 1 | SIMPLE | bb | ref | ref_sleutel_uk1 | ref_sleutel_uk1 | 8 | bug19649.aa.bkwi_lrrkvk_inschrijfnr_kvk,bug19649.aa.bkwi_lrrkvk_volgnr_kvk | 1 | Using where | +----+-------------+-------+------+-------------------------------------+-----------------+---------+----------------------------------------------------------- -----------------+--------+-------------+ 2 rows in set (0.02 sec) mysql> select count(*) -> from rawdata_organisaties_marktselect aa -> , sleutel bb -> where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk -> and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk -> and aa.lrrorg != bb.lrrorg; This query, with almost "the same plan", hangs for a long time: mysql> show processlist\G *************************** 1. row *************************** Id: 3 User: root Host: localhost db: bug19649 Command: Query Time: 390 State: Sending data Info: select count(*) from rawdata_organisaties_marktselect aa , sleutel bb where aa.bkwi_lrrkvk_i *************************** 2. row *************************** Id: 4 User: root Host: localhost db: test Command: Query Time: 0 State: NULL Info: show processlist 2 rows in set (0.00 sec) 390 seconds for now. So, I think, it is a bug.
[1 Aug 2006 14:56]
Sergey Petrunya
BUG#21390 may be related symptoms are similar
[9 Sep 2006 17:53]
Sergey Petrunya
Not repeatable anymore with the latest 5.1-opt tree explain select count(*) from rawdata_organisaties_marktselect aa where exists ( select '' from sleutel bb where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk and aa.lrrorg != bb.lrrorg ); -- around 14 sec explain select count(*) from rawdata_organisaties_marktselect aa , sleutel bb where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk and aa.lrrorg != bb.lrrorg; -- around 12 sec The last cset was: ChangeSet@1.2261, 2006-09-09 09:43:09-07:00, igor@rurik.mysql.com +6 -0 Post-pushbuild corrections for fix of bug #21698.
[9 Sep 2006 17:55]
Sergey Petrunya
Having analysis/fix of BUG#21390, the earlier statement about this bug being similar to BUG#21390 doesn't look true anymore.
[14 Sep 2006 17:39]
Sergey Petrunya
Fixed by this side improvement in fix for BUG#21390: --- 1.446/sql/sql_select.cc 2006-08-25 02:17:49 -07:00 +++ 1.447/sql/sql_select.cc 2006-08-25 02:17:49 -07:00 @@ -2696,7 +2725,8 @@ We use null_rejecting in add_not_null_conds() to add 'othertbl.field IS NOT NULL' to tab->select_cond. */ - (*key_fields)->null_rejecting= ((cond->functype() == Item_func::EQ_FUNC) && + (*key_fields)->null_rejecting= ((cond->functype() == Item_func::EQ_FUNC || + cond->functype() == Item_func::MULT_EQUAL_FUNC) && ((*value)->type() == Item::FIELD_ITEM) && ((Item_field*)*value)->field->maybe_null()); (*key_fields)++; The effect of the fix: * the WHERE clause can be converted to "t2.key = t1.field AND everything-else" * table t1 is before table t2 in choosen join order. * table t2 is accessed via ref(t2.key = t1.field) then immediately after reading table t1 we'll check if t1.field is NULL (as in that case it does not make sense to continue).
[9 Oct 2006 14:19]
Sergey Petrunya
Notes for the changelog, short version: If the query had a condition in form "tableX.key=tableY.key", which participated in equality propagation and also was used for ref access, then early ref-access NULL filtering was not peformed for the condition. This could make query execution slower.
[9 Oct 2006 14:19]
Sergey Petrunya
Notes for the changelog, long version: The problem was that: If the query had a condition in form "tableX.key=tableY.key", and - that condition was participating in equality propagation (a yet-to-be documented process of making inferences like (X=Y && Y=Z) |- X=Z); - that condition was used for ref access, then { the optimizer would not have used "early NULL-filtering". (Early null-filtering is not yet documented either. It was introduced as fix for BUG#8877. It's idea is that if we have a join order of ..., tblX, ... tblY,... and for tblY we use ref access on tblY.key=tblX.field then we can infer that "tblX.field IS NOT NULL", and this condition can be checked right after we got a row for tblX. This is earlier then we could make a check for "tblY.key=tblX.field", and can provide significant speedups in some cases. }
[10 Oct 2006 18:58]
Paul DuBois
Noted in 5.0.25, 5.1.12 changelogs.