| Bug #19649 | Extremely slow join | ||
|---|---|---|---|
| Submitted: | 9 May 2006 17:57 | Modified: | 10 Oct 2006 18:58 |
| Reporter: | Andre Timmer | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
| Version: | 5.0.25-BK, 5.0.18 | OS: | Solaris (Solaris) |
| Assigned to: | Sergey Petrunya | CPU Architecture: | Any |
[9 May 2006 18:27]
Andre Timmer
Want to upload the data, it's too big (30MB zipped). Unless you guys know about this problem / behaviour you need this many data to reproduce.
[10 May 2006 9:22]
Hartmut Holzgraefe
Could you add the EXPLAIN output for the queries to this bug report? And if you want to provide your test data you can upload it to ftp://ftp.mysql.com/pub/mysql/upload Please use a file name that starts with bug19649_ if you do so and drop a note on this bug report giving the exact file name you uploaded.
[10 May 2006 10:44]
Andre Timmer
mysql> explain
-> select count(*)
-> from rawdata_organisaties_marktselect aa
-> , sleutel bb
-> where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk
-> and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk
-> and aa.lrrorg != bb.lrrorg;
+----+-------------+-------+------+-------------------------------------+-----------------+---------+------------------------------------------------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------------------------+-----------------+---------+------------------------------------------------------------------+--------+-------------+
| 1 | SIMPLE | aa | ALL | rawdata_organisaties_marktselect_i3 | NULL | NULL | NULL | 426601 | |
| 1 | SIMPLE | bb | ref | ref_sleutel_uk1 | ref_sleutel_uk1 | 8 | bbr.aa.bkwi_lrrkvk_inschrijfnr_kvk,bbr.aa.bkwi_lrrkvk_volgnr_kvk | 1 | Using where |
+----+-------------+-------+------+-------------------------------------+-----------------+---------+------------------------------------------------------------------+--------+-------------+
Uploaded file "bug19649_ if_slow-join-data.sql.tgz".
Data may only be used internally!
[21 May 2006 11:31]
Valeriy Kravchuk
Please, send the EXPLAIN results for the fast query:
EXPLAIN select count(*)
from rawdata_organisaties_marktselect aa
where exists (
select ''
from sleutel bb
where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk
and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk
and aa.lrrorg != bb.lrrorg
)\G
[22 May 2006 16:17]
Andre Timmer
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: aa
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 312218
Extra: Using where
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: bb
type: ref
possible_keys: ref_sleutel_uk1
key: ref_sleutel_uk1
key_len: 8
ref:
bbr.aa.bkwi_lrrkvk_inschrijfnr_kvk,bbr.aa.bkwi_lrrkvk_volgnr_kvk
rows: 1
Extra: Using where
[22 May 2006 16:21]
Andre Timmer
This explain is on the exact same database:
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: aa
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 426601
Extra: Using where
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: bb
type: ref
possible_keys: ref_sleutel_uk1
key: ref_sleutel_uk1
key_len: 8
ref: bbr.aa.bkwi_lrrkvk_inschrijfnr_kvk,bbr.aa.bkwi_lrrkvk_volgnr_kvk
rows: 1
Extra: Using where
[13 Jun 2006 16:00]
Valeriy Kravchuk
Please, check with a newer version of MySQL server, 5.0.22. If original query is still slow, please, restart server with slow query log (http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html), run it again and wait until it will be exacuted completely. Then send that slow query log content. Alrenatively, can you, please, send/upload your data that demonstrates this strange behaviour.
[13 Jun 2006 16:20]
Andre Timmer
I would love to but we just upgraded from MySQL 4 and will not do a upgrade this year. If this call turns out to be a bug / performance issue MySQL users working with serious datasets will benefit. Creating and uploading the dataset was already a lot of work. I did it knowing it's hard to reproduce without it and this issue is really important. So it's your turn now please don't let US down :-)
[14 Jun 2006 9:30]
Valeriy Kravchuk
So, where can I find that data set? I see no files attached to the report. Have you uploaded it to our ftp server? Can I download it from somewhere?
[14 Jun 2006 9:58]
Andre Timmer
Uploaded file "bug19649_ if_slow-join-data.sql.tgz" on may 10. Can you find it?
[5 Jul 2006 19:31]
Valeriy Kravchuk
I had downloaded your test case successfully. I hope, I'll be able to check this bug on latest 5.0.24-BK soon.
[17 Jul 2006 15:58]
Valeriy Kravchuk
Verified just as described, on your test case uploaded, with 5.0.25-BK on Linux: openxs@suse:~/dbs/5.0> bin/mysql -uroot bug19649 < /tmp/slow-join-data.sql openxs@suse:~/dbs/5.0> bin/mysql -uroot bug19649 Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 3 to server version: 5.0.25 Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> select count(*) -> from rawdata_organisaties_marktselect aa -> where exists ( -> select '' -> from sleutel bb -> where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kv k -> and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk -> and aa.lrrorg != bb.lrrorg -> ); +----------+ | count(*) | +----------+ | 3 | +----------+ 1 row in set (23.12 sec) mysql> explain select count(*) -> from rawdata_organisaties_marktselect aa -> where exists ( -> select '' -> from sleutel bb -> where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kv k -> and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk -> and aa.lrrorg != bb.lrrorg -> ); +----+--------------------+-------+------+-----------------+-----------------+---------+------------------------------------------------------------------------ ----+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-------+------+-----------------+-----------------+---------+------------------------------------------------------------------------ ----+--------+-------------+ | 1 | PRIMARY | aa | ALL | NULL | NULL | NULL | NULL | 426601 | Using where | | 2 | DEPENDENT SUBQUERY | bb | ref | ref_sleutel_uk1 | ref_sleutel_uk1 | 8 | bug19649.aa.bkwi_lrrkvk_inschrijfnr_kvk,bug19649.aa.bkwi_lrrkvk_volgnr_kvk | 1 | Using where | +----+--------------------+-------+------+-----------------+-----------------+---------+------------------------------------------------------------------------ ----+--------+-------------+ 2 rows in set (0.01 sec) mysql> explain select count(*) -> from rawdata_organisaties_marktselect aa -> , sleutel bb -> where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk -> and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk -> and aa.lrrorg != bb.lrrorg; +----+-------------+-------+------+-------------------------------------+-----------------+---------+----------------------------------------------------------- -----------------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+-------------------------------------+-----------------+---------+----------------------------------------------------------- -----------------+--------+-------------+ | 1 | SIMPLE | aa | ALL | rawdata_organisaties_marktselect_i3 | NULL | NULL | NULL | 426601 | | | 1 | SIMPLE | bb | ref | ref_sleutel_uk1 | ref_sleutel_uk1 | 8 | bug19649.aa.bkwi_lrrkvk_inschrijfnr_kvk,bug19649.aa.bkwi_lrrkvk_volgnr_kvk | 1 | Using where | +----+-------------+-------+------+-------------------------------------+-----------------+---------+----------------------------------------------------------- -----------------+--------+-------------+ 2 rows in set (0.02 sec) mysql> select count(*) -> from rawdata_organisaties_marktselect aa -> , sleutel bb -> where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk -> and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk -> and aa.lrrorg != bb.lrrorg; This query, with almost "the same plan", hangs for a long time: mysql> show processlist\G *************************** 1. row *************************** Id: 3 User: root Host: localhost db: bug19649 Command: Query Time: 390 State: Sending data Info: select count(*) from rawdata_organisaties_marktselect aa , sleutel bb where aa.bkwi_lrrkvk_i *************************** 2. row *************************** Id: 4 User: root Host: localhost db: test Command: Query Time: 0 State: NULL Info: show processlist 2 rows in set (0.00 sec) 390 seconds for now. So, I think, it is a bug.
[1 Aug 2006 14:56]
Sergey Petrunya
BUG#21390 may be related symptoms are similar
[9 Sep 2006 17:53]
Sergey Petrunya
Not repeatable anymore with the latest 5.1-opt tree
explain
select count(*)
from rawdata_organisaties_marktselect aa
where exists (
select ''
from sleutel bb
where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk
and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk
and aa.lrrorg != bb.lrrorg
);
-- around 14 sec
explain
select count(*)
from rawdata_organisaties_marktselect aa
, sleutel bb
where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk
and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk
and aa.lrrorg != bb.lrrorg;
-- around 12 sec
The last cset was:
ChangeSet@1.2261, 2006-09-09 09:43:09-07:00, igor@rurik.mysql.com +6 -0
Post-pushbuild corrections for fix of bug #21698.
[9 Sep 2006 17:55]
Sergey Petrunya
Having analysis/fix of BUG#21390, the earlier statement about this bug being similar to BUG#21390 doesn't look true anymore.
[14 Sep 2006 17:39]
Sergey Petrunya
Fixed by this side improvement in fix for BUG#21390: --- 1.446/sql/sql_select.cc 2006-08-25 02:17:49 -07:00 +++ 1.447/sql/sql_select.cc 2006-08-25 02:17:49 -07:00 @@ -2696,7 +2725,8 @@ We use null_rejecting in add_not_null_conds() to add 'othertbl.field IS NOT NULL' to tab->select_cond. */ - (*key_fields)->null_rejecting= ((cond->functype() == Item_func::EQ_FUNC) && + (*key_fields)->null_rejecting= ((cond->functype() == Item_func::EQ_FUNC || + cond->functype() == Item_func::MULT_EQUAL_FUNC) && ((*value)->type() == Item::FIELD_ITEM) && ((Item_field*)*value)->field->maybe_null()); (*key_fields)++; The effect of the fix: * the WHERE clause can be converted to "t2.key = t1.field AND everything-else" * table t1 is before table t2 in choosen join order. * table t2 is accessed via ref(t2.key = t1.field) then immediately after reading table t1 we'll check if t1.field is NULL (as in that case it does not make sense to continue).
[9 Oct 2006 14:19]
Sergey Petrunya
Notes for the changelog, short version: If the query had a condition in form "tableX.key=tableY.key", which participated in equality propagation and also was used for ref access, then early ref-access NULL filtering was not peformed for the condition. This could make query execution slower.
[9 Oct 2006 14:19]
Sergey Petrunya
Notes for the changelog, long version:
The problem was that:
If the query had a condition in form "tableX.key=tableY.key", and
- that condition was participating in equality propagation (a yet-to-be
documented process of making inferences like (X=Y && Y=Z) |- X=Z);
- that condition was used for ref access,
then
{
the optimizer would not have used "early NULL-filtering".
(Early null-filtering is not yet documented either. It was introduced as
fix for BUG#8877. It's idea is that if we have a join order of
..., tblX, ... tblY,...
and for tblY we use ref access on
tblY.key=tblX.field
then we can infer that "tblX.field IS NOT NULL", and this condition can
be checked right after we got a row for tblX. This is earlier then we
could make a check for "tblY.key=tblX.field", and can provide
significant speedups in some cases.
}
[10 Oct 2006 18:58]
Paul DuBois
Noted in 5.0.25, 5.1.12 changelogs.

Description: Following select is slow (more than 10 minutes, didn't wait for query to finish). Query plan is OK. select count(*) from rawdata_organisaties_marktselect aa , sleutel bb where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk and aa.lrrorg != bb.lrrorg; How to repeat: -------------- - table I -------------- CREATE TABLE `rawdata_organisaties_marktselect` ( `bkwi_row` int(10) unsigned NOT NULL auto_increment, `lrrorg` int(11) NOT NULL default '0', `lrrnm1` varchar(30) default NULL, `lrrnm2` varchar(30) default NULL, `lrrstr` varchar(24) default NULL, `lrrhnr` int(6) default NULL, `lrrthn` varchar(12) default NULL, `lrrpcd` varchar(7) default NULL, `bkwi_lrrpcd` varchar(7) default NULL, `lrrplt` varchar(25) default NULL, `lrrstr_ca` varchar(24) default NULL, `lrrhnr_ca` int(6) default NULL, `lrrthn_ca` varchar(12) default NULL, `lrrpcd_ca` varchar(7) default NULL, `bkwi_lrrpcd_ca` varchar(7) default NULL, `lrrplt_ca` varchar(25) default NULL, `lrrtel` varchar(15) default NULL, `lrrfax` varchar(15) default NULL, `lreadr` varchar(50) default NULL, `lrrnac` varchar(6) default NULL, `lrraoc` tinyint(3) unsigned zerofill default NULL, `lrrawr` varchar(6) default NULL, `lrrsts` char(1) character set latin1 default NULL, `bkwi_lrrsts_ind_surseance` tinyint(2) default NULL, `bkwi_lrrsts_ind_faillissement` tinyint(2) default NULL, `bkwi_lrrsts_ind_beeindiging` tinyint(2) default NULL, `lrrkvk` varchar(12) character set latin1 default NULL, `bkwi_lrrkvk_inschrijfnr_kvk` int(8) unsigned zerofill default NULL, `bkwi_lrrkvk_volgnr_kvk` smallint(4) unsigned zerofill default NULL, `lrrhnv` char(1) character set latin1 default NULL, `bkwi_lrrhnv` char(1) character set latin1 default NULL, `lrr45` varchar(45) default NULL, `lrrrvc` smallint(3) unsigned zerofill default NULL, `bkwi_lrrrvc` smallint(3) default NULL, `lrrprc` char(2) character set latin1 default NULL, `bkwi_lrrprc` char(1) character set latin1 default NULL, `lrrgmc` varchar(5) default NULL, `lrreca` char(1) character set latin1 default NULL, `bkwi_lrreca` char(1) character set latin1 default NULL, `lrwadr` varchar(50) default NULL, `lrmoed` int(11) default NULL, `lrmkvk` varchar(12) character set latin1 default NULL, `lrrup` int(11) default NULL, `lrukvk` varchar(12) character set latin1 default NULL, `lrrmio` char(1) character set latin1 default NULL, `bkwi_lrrmio` tinyint(2) default NULL, `lrrmlsa` char(1) character set latin1 default NULL, `lrrmlsp` char(1) character set latin1 default NULL, `bkwi_created` datetime default NULL, `bkwi_uitval` varchar(120) default NULL, PRIMARY KEY (`bkwi_row`), KEY `rawdata_organisaties_marktselect_i1` (`lrrkvk`), KEY `rawdata_organisaties_marktselect_i2` (`lrrorg`), KEY `rawdata_organisaties_marktselect_i3` (`bkwi_lrrkvk_inschrijfnr_kvk`,`bkwi_lrrkvk_volgnr_kvk`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8; -------------- - table II -------------- CREATE TABLE `sleutel` ( `bkwi_id` int(11) NOT NULL default '0', `bkwi_volgnr` smallint(6) NOT NULL default '0', `fictief_finr` int(9) NOT NULL default '0', `inschrijfnr_kvk` int(8) unsigned zerofill default NULL, `volgnr_kvk` smallint(4) unsigned zerofill default NULL, `lrrorg` int(11) default NULL, `created` datetime NOT NULL default '0000-00-00 00:00:00', `created_by` char(1) character set latin1 NOT NULL default 'B', `updated` datetime default NULL, `updated_by` char(1) character set latin1 default NULL, PRIMARY KEY (`bkwi_id`,`bkwi_volgnr`), UNIQUE KEY `ref_sleutel_uk1` (`inschrijfnr_kvk`,`volgnr_kvk`), UNIQUE KEY `ref_sleutel_uk2` (`lrrorg`), KEY `ref_sleutel_i1` (`fictief_finr`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; -------------- - my.cnf -------------- # One can in this file use all long options that the program supports. # If you want to know which options a program support, run the program # with --help option. # The following options will be passed to all MySQL clients [client] port = 3306 socket = /tmp/mysql.sock # Here follows entries for some specific programs # The MySQL server [mysqld] port = 3306 socket = /tmp/mysql.sock skip-locking log-warnings max_allowed_packet = 1M thread_stack = 64K tmpdir=/tmp # changed settings # binlog_cache_size: nieuw dd 19/08/2005, geen idee waar het goed voor is maar stond wel op een erg lage waarde # key_buffer_size: gewijzigd dd 19/08/2005, was 16M ivm adres_googliaans? # join_buffer_size: nieuw dd 19/08/2005 # sort_buffer_size: gewijzigd dd 19/08/2005, was 2M en weer op 2M gezet, te groot niet goed voor performance # (sort buffer wordt per query gealloceerd?) # voor initiele load performance lijkt het niet uit te maken: # - joins spelen geen rol en sorts zijn zo groot gaat toch naar schijf table_cache = 40 binlog_cache_size= 8M key_buffer_size = 64M # join_buffer_size is for every table join when there are no indexes (never?) # this space is allocated per table join per query join_buffer_size = 32M sort_buffer_size = 2M # myisam_sort_buffer_size wordt gebruikt bij ddl myisam_sort_buffer_size = 64M read_buffer_size = 1M read_rnd_buffer_size = 2M net_buffer_length = 8K # new settings # max_error_count heeft betrekking op aantal detail rijen dat 'show warnings' terug geeft # default is 64, max aantal dat getoond wordt lijkt 64k ipv 1.000.000 max_error_count=1000000 # 0 means disabled query_cache_type=0 query_cache_size=256k # CPU's * 2 thread_concurrency=4 long_query_time=5 log-slow-queries=slow-queries.log # dumps to many queries: log-queries-not-using-indexes log-long-format # fulltext index options # gewijzigd dd 19/08/2005, was 2, enige reden is fulltext index op straat ft_min_word_len = 1 # not build-in stopword file, deze functionaliteit is ongewenst ft_stopword_file = '' default-character-set=utf8 # Don't listen on a TCP/IP port at all. This can be a security enhancement, # if all processes that need to connect to mysqld run on the same host. # All interaction with mysqld must be made via Unix sockets or named pipes. # Note that using this option without enabling named pipes on Windows # (using the "enable-named-pipe" option) will render mysqld useless! # #skip-networking server-id = 1 # Uncomment the following if you want to log updates #log-bin # Uncomment the following if you are NOT using BDB tables skip-bdb # Uncomment the following if you are using InnoDB tables innodb_data_home_dir = /usr/local/mysql/data/ innodb_data_file_path = ibdata1:256M:autoextend innodb_log_group_home_dir = /usr/local/mysql/data/ innodb_log_arch_dir = /usr/local/mysql/data/ # You can set .._buffer_pool_size up to 50 - 80 % # of RAM but beware of setting memory usage too high # changed settings innodb_buffer_pool_size = 256M # is for data dictionairy etc. innodb_additional_mem_pool_size = 1M # Set .._log_file_size to 25 % of buffer pool size innodb_log_file_size = 8M innodb_log_buffer_size = 8M # innodb_flush_log_at_trx_commit: 2 --> wacht 1 seconde na commit innodb_flush_log_at_trx_commit = 1 innodb_lock_wait_timeout = 20 [mysqldump] quick max_allowed_packet = 16M [mysql] no-auto-rehash # Remove the next comment character if you are not familiar with SQL #safe-updates default-character-set=utf8 [myisamchk] key_buffer_size = 64M sort_buffer_size = 64M # myisam_sort_buffer_size wordt gebruikt bij ddl myisam_sort_buffer_size = 64M ft_min_word_len = 1 ft_stopword_file = '' [mysqlhotcopy] interactive-timeout Suggested fix: Fix? It could be a MySQL configuration issue. Solaris box has free memory and there seems to be no paging or swapping (i'm not an administrator). The problem query does work when the 2 tables have less rows. Joining tables with many rows especially when the rows have many columns / are bulky seems to be a problem. A query rewrite, see below, is fast and returns in 40s. select count(*) from rawdata_organisaties_marktselect aa where exists ( select '' from sleutel bb where aa.bkwi_lrrkvk_inschrijfnr_kvk = bb.inschrijfnr_kvk and aa.bkwi_lrrkvk_volgnr_kvk = bb.volgnr_kvk and aa.lrrorg != bb.lrrorg );