Bug #97979 | MySQL Feature Request: hash joins should be able to use an index | ||
---|---|---|---|
Submitted: | 13 Dec 2019 5:15 | Modified: | 12 Feb 2020 0:50 |
Reporter: | 희윤 윤 | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 8.0.18 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | hash, hash join, hashjoin, parallel |
[13 Dec 2019 5:15]
희윤 윤
[16 Dec 2019 11:52]
Kenny Gryp
Please make a separate request for each request/bug you want to bring to our attention. I will change this issue into the 'Hash Join' request. Please create a separate bug for the performance issue/difference you observed with innodb parallel read threads.
[16 Dec 2019 11:54]
Kenny Gryp
Here's a new description of the issue, limiting to the hash join request # Description: In version 8.0.18 we tested the function of hash join. But hash join can only be done when doing full table scan. Nested loop joins with indexes were better than hash joins. # How to repeat: __missing__ # Suggested fix: Add support to Hash Joins to support indexes
[16 Dec 2019 11:56]
Kenny Gryp
Hello 희윤 윤: Can you give the example of your queries where nested loop was faster than hash joins? Please provide all the steps (with it's output) you took to reproduce this behavior.
[18 Dec 2019 0:44]
희윤 윤
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | EXPLAIN | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | -> Inner hash join (i.ITEM_ID = b.ITEM_ID) (cost=329069752927261.88 rows=79082838) -> Table scan on I (cost=4.17 rows=42102214) -> Hash -> Index scan on B using DISP_CTG_ITEM_IX03 (cost=8301335.84 rows=78159647) | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT /*+ HASH_JOIN(I,B) */ COUNT(*) FROM ITEM I IGNORE INDEX(PRIMARY) JOIN DISP_CTG_ITEM B IGNORE INDEX( DISP_CTG_ITEM_IX04, DISP_CTG_ITEM_IX02) ON I.ITEM_ID = B.ITEM_ID; +----------+ | COUNT(*) | +----------+ | 71839152 | +----------+ 1 row in set (8 min 6.89 sec) mysql> explain format=tree SELECT COUNT(*) FROM ITEM I JOIN DISP_CTG_ITEM B ON I.ITEM_ID = B.ITEM_ID; +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | EXPLAIN | +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | -> Aggregate: count(0) -> Nested loop inner join (cost=47335893.88 rows=107230050) -> Index scan on I using PRIMARY (cost=4742342.98 rows=42102214) -> Index lookup on B using DISP_CTG_ITEM_IX04 (ITEM_ID=i.ITEM_ID) (cost=0.76 rows=3) | +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT COUNT(*) FROM ITEM I JOIN DISP_CTG_ITEM B ON I.ITEM_ID = B.ITEM_ID; +----------+ | COUNT(*) | +----------+ | 71839152 | +----------+ 1 row in set (5 min 44.29 sec)
[18 Dec 2019 15:19]
MySQL Verification Team
Hi, Thank you for your bug report. Can you prove your claim about the inefficiency of hash joins by some solid benchmarking ????
[23 Dec 2019 6:15]
희윤 윤
hi Sinisa Milivojevic I left a SQL plan and execution result in the previous comment.
[7 Jan 2020 13:41]
MySQL Verification Team
Hello 희윤 윤: Can you give the example of your queries where nested loop was faster than hash joins? Please provide all the steps (with it's output) you took to reproduce this behavior. We have taken a look at what you have sent us and did not find a significant difference in the speed. Optimiser can not predict small differences in the execution speed, which is why we have optimiser switches, options and hints ..... So far, I do not see any bug here. Please do pay attention to what our Kenny Grip has written in his comments.
[8 Feb 2020 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[11 Feb 2020 0:16]
희윤 윤
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | EXPLAIN | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | -> Inner hash join (i.ITEM_ID = b.ITEM_ID) (cost=329069752927261.88 rows=79082838) -> Table scan on I (cost=4.17 rows=42102214) -> Hash -> Index scan on B using DISP_CTG_ITEM_IX03 (cost=8301335.84 rows=78159647) | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT /*+ HASH_JOIN(I,B) */ COUNT(*) FROM ITEM I IGNORE INDEX(PRIMARY) JOIN DISP_CTG_ITEM B IGNORE INDEX( DISP_CTG_ITEM_IX04, DISP_CTG_ITEM_IX02) ON I.ITEM_ID = B.ITEM_ID; +----------+ | COUNT(*) | +----------+ | 71839152 | +----------+ 1 row in set (8 min 6.89 sec) mysql> explain format=tree SELECT COUNT(*) FROM ITEM I JOIN DISP_CTG_ITEM B ON I.ITEM_ID = B.ITEM_ID; +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | EXPLAIN | +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | -> Aggregate: count(0) -> Nested loop inner join (cost=47335893.88 rows=107230050) -> Index scan on I using PRIMARY (cost=4742342.98 rows=42102214) -> Index lookup on B using DISP_CTG_ITEM_IX04 (ITEM_ID=i.ITEM_ID) (cost=0.76 rows=3) | +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT COUNT(*) FROM ITEM I JOIN DISP_CTG_ITEM B ON I.ITEM_ID = B.ITEM_ID; +----------+ | COUNT(*) | +----------+ | 71839152 | +----------+ 1 row in set (5 min 44.29 sec)
[11 Feb 2020 13:24]
MySQL Verification Team
Hi, Due to a bug in 8.0.18, can you please use only table format in your explain output ??? Thank you very much in advance,
[12 Feb 2020 0:50]
희윤 윤
CREATE TABLE `item` ( `ITEM_ID` varchar(13) NOT NULL , `ITEM_REG_DIV_CD` varchar(2) DEFAULT NULL , `SPL_VEN_ID` varchar(10) DEFAULT NULL , `LRNK_SPL_VEN_ID` varchar(10) DEFAULT NULL , `BRAND_ID` varchar(10) DEFAULT NULL , `EXUS_ITEM_DIV_CD` varchar(2) DEFAULT NULL , `EXUS_ITEM_DTL_CD` varchar(2) DEFAULT NULL , `WHOUT_SHPPCST_ID` varchar(10) DEFAULT NULL , `FRG_SHPP_PSBL_YN` varchar(1) DEFAULT NULL , `FRG_SHPP_ITEMGR_WGT_CRITN_ID` varchar(10) DEFAULT NULL , `ITEM_TOT_WGT` decimal(10,2) DEFAULT NULL , `L_CAPA_YN` varchar(1) DEFAULT NULL , `SHPP_MAIN_CD` varchar(6) DEFAULT NULL , `SHPP_MTHD_CD` varchar(6) DEFAULT NULL , `DELI_PSBL_YN` varchar(1) DEFAULT NULL , `STD_CTG_ID` varchar(10) DEFAULT NULL , `ITEM_SELL_TYPE_CD` varchar(2) DEFAULT NULL , `ITEM_SELL_TYPE_DTL_CD` varchar(2) DEFAULT NULL , `SPE_SALESTR_NO` varchar(4) DEFAULT NULL , `CPN_APL_EXCL_YN` varchar(1) DEFAULT NULL , `UNCPN_APL_EXCL_YN` varchar(1) DEFAULT NULL , `B2E_PRC_MNG_MTHD_CD` varchar(2) DEFAULT NULL , `ITEM_REG_DTL_CD` varchar(2) DEFAULT NULL , `WHOUT_ADDR_ID` varchar(10) DEFAULT NULL , `SNBK_ADDR_ID` varchar(10) DEFAULT NULL , `RET_SHPPCST_ID` varchar(10) DEFAULT NULL , `EXCH_SHPPCST_ID` varchar(10) DEFAULT NULL , `ITEM_CHRCT_DIV_CD` varchar(2) DEFAULT NULL , `ISMTAR_ADD_SHPPCST_ID` varchar(10) DEFAULT NULL , `JEJU_ADD_SHPPCST_ID` varchar(10) DEFAULT NULL , `TXN_DIV_CD` varchar(6) DEFAULT NULL , `PERDC_SHPP_TGT_YN` varchar(1) DEFAULT NULL , PRIMARY KEY (`ITEM_ID`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ; CREATE TABLE `disp_ctg_item` ( `DISP_CTG_ID` varchar(10) NOT NULL , `ITEM_ID` varchar(13) NOT NULL , `DISP_ORDR` decimal(8,0) DEFAULT NULL , `REP_DISP_ORDR` decimal(8,0) DEFAULT NULL , PRIMARY KEY (`DISP_CTG_ID`,`ITEM_ID`), KEY `DISP_CTG_ITEM_IX02` (`ITEM_ID`,`DISP_CTG_ID`), KEY `DISP_CTG_ITEM_IX03` (`DISP_CTG_ID`,`DISP_ORDR`,`ITEM_ID`), KEY `DISP_CTG_ITEM_IX04` (`ITEM_ID`,`REP_DISP_ORDR`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ;
[12 Feb 2020 13:06]
MySQL Verification Team
Thank you. However, what we asked for is to have EXPLAIN in the table format. We did not ask for table definitions. Thanks in advance.