Bug #42259 | Utilizing a SELECT within a WHERE IN performs poorly (slow) | ||
---|---|---|---|
Submitted: | 22 Jan 2009 6:36 | Modified: | 31 Jul 2012 12:15 |
Reporter: | Van Stokes | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: DML | Severity: | S2 (Serious) |
Version: | 5.5.25 | OS: | Linux (Ubuntu 12.04 x86_64) |
Assigned to: | CPU Architecture: | Any | |
Tags: | IN, SELECT, slow, subquery benchmark, where |
[22 Jan 2009 6:36]
Van Stokes
[22 Jan 2009 8:25]
Sergey Petrunya
Judging from the query text, this case should receive better handling in MySQL 6.0, as it is covered by the new 6.0 subquery optimizations (see http://forge.mysql.com/wiki/Subquery_Works for details). Van, we would appreciate getting the dataset so that we can verify that this case is handled properly in 6.0 (see the link, we're collecting such cases). If you click in o the "Files" tab, it will show instructions how to upload big files to our ftp. Thanks,
[22 Jan 2009 8:53]
Valeriy Kravchuk
Thank you for the problem report. Please, send the results of: EXPLAIN SELECT * FROM inventory WHERE InventoryKey IN ( SELECT DISTINCT inventory.ContInvKey FROM inventory WHERE inventory.OwnerType = 1 AND inventory.OwnerKey = 18 AND inventory.ContInvKey > 0 )\G show create table inventory\G show table status like 'inventory'\G
[22 Jan 2009 9:09]
Van Stokes
I will submit a mysqldump of the database after the next auto backup completes.
[22 Jan 2009 9:19]
Valeriy Kravchuk
Please, upload your dump (compressed) to ftp://ftp.mysql.com/pub/mysql/upload/ and inform about the filename (include bug number, 42259 in it) here. Based on EXPLAIN results, only 6.0.x may help you now, with its new subquery optimization methods.
[22 Jan 2009 11:43]
MySQL Verification Team
Below the output I got on Windows Vista 64-bit and actually there is a great improvement with 6.0 regarding 5.1: g:\share\dbs>g:\share\dbs\6.0\bin\mysql -uroot --port=3600 --prompt="mysql 6.0 > " Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 6.0.10-alpha-nt-log Source distribution Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql 6.0 > use Origins Database changed mysql 6.0 > SELECT DISTINCT inventory.ContInvKey FROM inventory WHERE inventory.OwnerType = 1 AND -> inventory.OwnerKey = 18 AND inventory.ContInvKey > 0\G *************************** 1. row *************************** ContInvKey: 741879 *************************** 2. row *************************** <cut> ContInvKey: 741916 *************************** 6. row *************************** ContInvKey: 741915 6 rows in set (0.67 sec) mysql 6.0 > SELECT * FROM inventory WHERE InventoryKey IN ( SELECT DISTINCT inventory.ContInvKey FROM -> inventory WHERE inventory.OwnerType = 1 AND inventory.OwnerKey = 18 AND -> inventory.ContInvKey > 0 )\G <cut> OrientationY: 0.0000 OrientationZ: 0.0000 DateTimeCreated: 2008-04-23 16:03:17 RecLastModified: 2008-04-23 17:03:17 6 rows in set (0.03 sec) mysql 6.0 > g:\share\dbs>g:\share\dbs\5.1\bin\mysql -uroot --port=3510 --prompt="mysql 5.1 >" Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 Server version: 5.1.31-nt-log Source distribution Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql 5.1 >use origins Database changed mysql 5.1 >SELECT DISTINCT inventory.ContInvKey FROM inventory WHERE inventory.OwnerType = 1 AND -> inventory.OwnerKey = 18 AND inventory.ContInvKey > 0\G *************************** 1. row *************************** ContInvKey: 741879 <cut> *************************** 6. row *************************** ContInvKey: 741915 6 rows in set (0.30 sec) mysql 5.1 >SELECT * FROM inventory WHERE InventoryKey IN ( SELECT DISTINCT inventory.ContInvKey FROM -> inventory WHERE inventory.OwnerType = 1 AND inventory.OwnerKey = 18 AND -> inventory.ContInvKey > 0 )\G <cut> OrientationZ: 0.0000 DateTimeCreated: 2008-04-23 16:03:19 RecLastModified: 2008-04-23 17:03:19 6 rows in set (2 min 15.11 sec) mysql 5.1 >
[22 Jan 2009 12:02]
MySQL Verification Team
Could you please provide the create view statement, to test your actual query?. Thanks in advance.
[22 Jan 2009 12:10]
MySQL Verification Team
Sorry disregard my last comment the error I got is: ERROR 1305 (42000): FUNCTION origins.InventoryGetOwnerTypeByName does not exist to run your actual query. Thanks in advance.
[22 Jan 2009 13:57]
MySQL Verification Team
Thank you for the feedback. With the actual query which uses views and function I have not noticed differences between 5.1 and 6.0.: -----+--------------------+-----------------------+ 21 rows in set (8 min 13.11 sec) mysql 5.1 >show variables like "%version%"; -> // +-------------------------+---------------------+ | Variable_name | Value | +-------------------------+---------------------+ | protocol_version | 10 | | version | 5.1.31-nt-log | | version_comment | Source distribution | | version_compile_machine | unknown | | version_compile_os | Win64 | +-------------------------+---------------------+ 5 rows in set (0.44 sec) ------------------------------------------------- 21 rows in set (8 min 2.73 sec) mysql 6.0 > show variables like "%version%"; +-------------------------+---------------------+ | Variable_name | Value | +-------------------------+---------------------+ | protocol_version | 10 | | version | 6.0.10-alpha-nt-log | | version_comment | Source distribution | | version_compile_machine | unknown | | version_compile_os | Win64 | +-------------------------+---------------------+ 5 rows in set (0.04 sec) mysql 6.0 > I changed the Synopsis to reflect that.
[23 Jan 2009 16:12]
Sergey Petrunya
Confirm, the presence of VIEW disables 6.0 subquery optimizations, that's why there is no speedup when a VIEW is used. There is no really valid behind this - the optimizer is perfectly capable of processing such VIEWs. It seems this case got disabled when we've put a (very rough) fix to disable subquery processing in presence of LEFT JOINs (there are real issues with subqueries + outer joins). This can and should be fixed before the 6.0 release
[6 Mar 2009 23:41]
Van Stokes
We installed 6.0.9-alpha on our beta testing server. WOW! What a performance increase from over 5.1.x!!! It is a very noticeable improvement! All our users noticed it. The SELECT WHERE IN (even with views) is instantaneous! Great job! We ran into another bug - repeated calls to a stored procedure via the same thread cause the thread to hang. We are opening a new bug for this one.
[10 Apr 2009 14:00]
MySQL Verification Team
See bug: http://bugs.mysql.com/bug.php?id=44201.
[5 Jun 2009 14:16]
Van Stokes
We have upgraded to MySQL 6.0.11. There is still a performance problem when using WHERE IN. This statement: SELECT inventory_view.* FROM inventory_view WHERE InventoryKey IN ( SELECT DISTINCT inventory.ContInvKey FROM inventory WHERE inventory.OwnerType = mOwnerType AND inventory.OwnerKey = mOwnerKey AND ContInvKey > 0 ) Still takes over 162 seconds to run and fetch 10 rows. If I break the query up into two separate queries like this: SELECT DISTINCT inventory.ContInvKey FROM inventory WHERE inventory.OwnerType = mOwnerType AND inventory.OwnerKey = mOwnerKey AND ContInvKey > 0; ...returns 10 rows and takes 0.0011 seconds to run. And if follow-up with: SELECT inventory_view.* FROM inventory_view WHERE InventoryKey=Result1 OR InventoryKey=Result2 OR InventoryKey=Result2..... It returns 10 rows in 0.029 seconds. Its painfully obvious the query optimizer still isn't working. This is a very *simple* WHERE IN statement. There is no excuse for such poor performance. ------------- EXPLAIN SELECT DISTINCT inventory.ContInvKey FROM inventory WHERE inventory.OwnerType = InventoryGetOwnerTypeByName( "CHARACTER" ) AND inventory.OwnerKey = CharacterGetKeyByCallsign( "AmenRa" ) AND ContInvKey > 0; 1, 'SIMPLE', 'inventory', 'ref', 'Inventory_Cont_IDX,Inventory_Owner_IDX', 'Inventory_Owner_IDX', '5', 'const,const', 68, 'Using where; Using temporary' ------------- EXPLAIN SELECT inventory_view.* FROM inventory_view WHERE InventoryKey IN ( SELECT DISTINCT inventory.ContInvKey FROM inventory WHERE inventory.OwnerType = InventoryGetOwnerTypeByName( "CHARACTER" ) AND inventory.OwnerKey = CharacterGetKeyByCallsign( "AmenRa" ) AND ContInvKey > 0 ); 1, 'PRIMARY', 'inventory', 'ALL', 'Inventory_Item_IDX', '', '', '', 642970, 'Using where' 1, 'PRIMARY', 'item', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'origins.inventory.ItemKey', 1, '' 1, 'PRIMARY', 'inventorylocation', 'eq_ref', 'PRIMARY,InventoryLocation_InventoryKey_fk', 'PRIMARY', '4', 'func', 1, 'Using where' 2, 'DEPENDENT SUBQUERY', 'inventory', 'ref', 'Inventory_Cont_IDX,Inventory_Owner_IDX', 'Inventory_Owner_IDX', '5', 'const,const', 68, 'Using where; Using temporary'
[3 Jul 2009 14:40]
Van Stokes
We tested MySQL server 5.4-beta and the performance problem still persists.
[27 Sep 2009 11:31]
Van Stokes
We have downgraded to 5.4.2 from 6.0.x The WHERE IN is still a major issue. It's extremely slow. The statement takes 13 seconds to run: DELETE FROM Inventory WHERE InventoryKey IN ( SELECT ObjKey FROM MissionObjective WHERE MissionKey = 5 AND ObjKeyType = 3 ); The INVENTORY table has 1M+ rows. The MISSIONOBJECTIVE table has ONE row! EXPLAIN SELECT ObjKey FROM MissionObjective WHERE MissionKey = 5 AND ObjKeyType = 3; id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE MissionObjective ref PRIMARY,MissionObjective_FK01 PRIMARY 5 const,const 1 Using index My original example that started this bug is still an issue too. We use raw table selects and view selects in our queries.
[30 Sep 2009 20:24]
Van Stokes
Windows XP x64 (dedicated) MySQL x64 5.4.2 SELECT InventoryGetKeyByProperty( 10000, "Thesdia Station", 0, 0, 0 ); Execute Time: 00:00:00:032 SELECT * FROM InventoryLocation WHERE InventoryKey = 1011559; Execute Time: 00:00:00:047 SELECT * FROM InventoryLocation WHERE InventoryKey = InventoryGetKeyByProperty( 10000, "Thesdia Station", 0, 0, 0 ); Execute Time: 00:03:37:155 That's THREE MINUTES! SQL is read RIGHT TO LEFT. It is obvious from this simple statement you are reading LEFT TO RIGHT! SOLVE for InventoryGetKeyByProperty() *FIRST* THEN you perform the SELECT! I hate to be so obstinate but this is getting ridiculous.
[7 Dec 2009 15:15]
Tor Didriksen
Is InventoryGetKeyByProperty() DETERMINISTIC? Only parts of the schema can be found in the bug report, so it's hard to reproduce the problems reported.
[30 Dec 2009 2:06]
Jinhua Xi
Hi, I got exactly the same performance issue. I have the following query: SELECT o.orgid, o.org_name FROM organization o WHERE o.orgid IN (SELECT fol_orgid FROM buz_followed WHERE fol_memid = 1112) This query only returns 10 rows and took 360 seconds. Similar queries to the organization table returns almost instantly. The organization is a large table with over 1 million rows. I am using mySQL 5.0.24
[28 Jun 2012 13:54]
Van Stokes
This problem still persists in 5.5.25 and 5.6.m8. The SQL query profiler fails when a SELECT is used in conjunction with a WHERE IN. This is unacceptable performance and should be immediately addressed and corrected. The SIMPLE query: EXPLAIN SELECT DISTINCT invdrs.drcnum FROM invdrs WHERE invdrs.cmpnum = "03" AND invdrs.trmnum = "61" AND invdrs.invnum = "03675"; id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE invdrs ref PRIMARY PRIMARY 15 const,const,const 2 Using where; Using index; Using temporary The ADVANCE QUERY utilizing the above SIMPLE query **STATEMENT** in conjunction with a WHERE IN: (Results in 5.13 seconds) EXPLAIN SELECT DISTINCT invdrs.cmpnum, invdrs.trmnum, invdrs.invnum, ( invoic.amt01 + invoic.amt02 + invoic.amt03 + invoic.amt04 + invoic.amt05 + invoic.amt06 + invoic.amt07 + invoic.amt08 + invoic.amt09 + invoic.amt10 + invoic.amt11 + invoic.amt12 ) AS `Amt` FROM invdrs, invoic WHERE invdrs.drcnum IN ( SELECT DISTINCT i2.drcnum FROM invdrs i2 WHERE i2.cmpnum = "03" AND i2.trmnum = "61" AND i2.invnum = "03675" ) AND invoic.cmpnum = invdrs.cmpnum AND invoic.trmnum = invdrs.trmnum AND invoic.invnum = invdrs.invnum AND invoic.cntnum = invdrs.cntnum; id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY invoic ALL PRIMARY 83561 Using temporary 1 PRIMARY invdrs ref PRIMARY PRIMARY 20 invoic.cmpnum,invoic.trmnum,invoic.invnum,invoic.cntnum 502 Using where; Using index 2 DEPENDENT SUBQUERY i2 index_subquery PRIMARY,invdrs_idx1 invdrs_idx1 10 func 1 Using where Note that SQL DOES NOT UTILIZE the PRIMARY index of "invoic". Review the next query in where SQL WILL utilize it correctly. The ADVANCE QUERY utilizing the first SIMPLE query **RESULTS** in conjunction with a WHERE IN: (Results in 0.0018 seconds) SELECT DISTINCT invdrs.cmpnum, invdrs.trmnum, invdrs.invnum, ( invoic.amt01 + invoic.amt02 + invoic.amt03 + invoic.amt04 + invoic.amt05 + invoic.amt06 + invoic.amt07 + invoic.amt08 + invoic.amt09 + invoic.amt10 + invoic.amt11 + invoic.amt12 ) AS `Amt` FROM invdrs, invoic WHERE invdrs.drcnum IN ( "13H57179", "13H57180", "13H57181" ) AND invoic.cmpnum = invdrs.cmpnum AND invoic.trmnum = invdrs.trmnum AND invoic.invnum = invdrs.invnum AND invoic.cntnum = invdrs.cntnum; id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE invdrs range PRIMARY,invdrs_idx1 invdrs_idx1 10 4 Using where; Using temporary 1 SIMPLE invoic ref PRIMARY PRIMARY 20 invdrs.cmpnum,invdrs.trmnum,invdrs.invnum,invdrs.cntnum 209 Note that SQL DOES UTILIZE the PRIMARY index of "invoic". There should be NO difference in the two queries.
[30 Jul 2012 17:35]
Paul DuBois
Queries of this type are dealt with by the optimizer changes in 5.6.5.
[31 Jul 2012 9:32]
Roy Lyseng
Please add schema information for the tables invdrs and invoic. Unless these are views, the semi-join transformations are supposed to work in MySQL 5.6.5-m8.
[31 Jul 2012 12:15]
Van Stokes
CREATE TABLE `invdrs` ( `cmpnum` varchar(2) NOT NULL DEFAULT '', `trmnum` varchar(2) NOT NULL DEFAULT '', `invnum` varchar(5) NOT NULL DEFAULT '', `cntnum` varchar(3) NOT NULL DEFAULT '', `drcnum` varchar(8) NOT NULL DEFAULT '', `drtype` varchar(1) DEFAULT NULL, PRIMARY KEY (`cmpnum`,`trmnum`,`invnum`,`cntnum`,`drcnum`), KEY `invdrs_idx1` (`drcnum`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 CREATE TABLE `invoic` ( `cmpnum` varchar(2) NOT NULL DEFAULT '', `trmnum` varchar(2) NOT NULL DEFAULT '', `invnum` varchar(5) NOT NULL DEFAULT '', `cntnum` varchar(3) NOT NULL DEFAULT '', `pagnum` varchar(3) NOT NULL DEFAULT '', `entrby` varchar(8) DEFAULT NULL, `entdat` int(8) DEFAULT NULL, `enttim` varchar(6) DEFAULT NULL, `invdte` int(8) DEFAULT NULL, `billto` varchar(5) DEFAULT NULL, `connum` varchar(5) DEFAULT NULL, `invtyp` varchar(2) DEFAULT NULL, `lin01` varchar(75) DEFAULT NULL, `lin02` varchar(75) DEFAULT NULL, `lin03` varchar(75) DEFAULT NULL, `lin04` varchar(75) DEFAULT NULL, `lin05` varchar(75) DEFAULT NULL, `lin06` varchar(75) DEFAULT NULL, `lin07` varchar(75) DEFAULT NULL, `lin08` varchar(75) DEFAULT NULL, `lin09` varchar(75) DEFAULT NULL, `lin10` varchar(75) DEFAULT NULL, `lin11` varchar(75) DEFAULT NULL, `lin12` varchar(75) DEFAULT NULL, `lin13` varchar(75) DEFAULT NULL, `lin14` varchar(75) DEFAULT NULL, `lin15` varchar(75) DEFAULT NULL, `lin16` varchar(75) DEFAULT NULL, `lin17` varchar(75) DEFAULT NULL, `lin18` varchar(75) DEFAULT NULL, `lin19` varchar(75) DEFAULT NULL, `lin20` varchar(75) DEFAULT NULL, `lin21` varchar(75) DEFAULT NULL, `lin22` varchar(75) DEFAULT NULL, `lin23` varchar(75) DEFAULT NULL, `chg01` varchar(4) DEFAULT NULL, `chg02` varchar(4) DEFAULT NULL, `chg03` varchar(4) DEFAULT NULL, `chg04` varchar(4) DEFAULT NULL, `chg05` varchar(4) DEFAULT NULL, `chg06` varchar(4) DEFAULT NULL, `chg07` varchar(4) DEFAULT NULL, `chg08` varchar(4) DEFAULT NULL, `chg09` varchar(4) DEFAULT NULL, `chg10` varchar(4) DEFAULT NULL, `chg11` varchar(4) DEFAULT NULL, `chg12` varchar(4) DEFAULT NULL, `amt01` decimal(10,2) DEFAULT NULL, `amt02` decimal(10,2) DEFAULT NULL, `amt03` decimal(10,2) DEFAULT NULL, `amt04` decimal(10,2) DEFAULT NULL, `amt05` decimal(10,2) DEFAULT NULL, `amt06` decimal(10,2) DEFAULT NULL, `amt07` decimal(10,2) DEFAULT NULL, `amt08` decimal(10,2) DEFAULT NULL, `amt09` decimal(10,2) DEFAULT NULL, `amt10` decimal(10,2) DEFAULT NULL, `amt11` decimal(10,2) DEFAULT NULL, `amt12` decimal(10,2) DEFAULT NULL, `gln01` varchar(8) DEFAULT NULL, `gln02` varchar(8) DEFAULT NULL, `gln03` varchar(8) DEFAULT NULL, `gln04` varchar(8) DEFAULT NULL, `gln05` varchar(8) DEFAULT NULL, `gln06` varchar(8) DEFAULT NULL, `gln07` varchar(8) DEFAULT NULL, `gln08` varchar(8) DEFAULT NULL, `gln09` varchar(8) DEFAULT NULL, `gln10` varchar(8) DEFAULT NULL, `gln11` varchar(8) DEFAULT NULL, `gln12` varchar(8) DEFAULT NULL, `totpgs` int(3) DEFAULT NULL, `prttot` varchar(1) DEFAULT NULL, `prtnum` varchar(3) DEFAULT NULL, PRIMARY KEY (`cmpnum`,`trmnum`,`invnum`,`cntnum`,`pagnum`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1
[31 Jul 2012 13:08]
Roy Lyseng
Thank you for the schema information. With MySQL 5.6.5-m8 I get this plan: | 1 | SIMPLE | i2 | index | PRIMARY,invdrs_idx1 | invdrs_idx1 | 10 | NULL | 3 | Using where; Using temporary; LooseScan | | 1 | SIMPLE | invdrs | index | PRIMARY,invdrs_idx1 | PRIMARY | 30 | NULL | 3 | Using where; Using index; Using join buffer (Block Nested Loop) | | 1 | SIMPLE | invoic | ref | PRIMARY | PRIMARY | 20 | test.invdrs.cmpnum,test.invdrs.trmnum,test.invdrs.invnum,test.invdrs.cntnum | 1 | NULL | There are only 3 rows per table, so your plan may differ. I do not understand how you get a plan that shows DEPENDENT SUBQUERY. Is semi-join enabled for the database?