Bug #99244 | bad performance in hash join because join with no join condition between tables | ||
---|---|---|---|
Submitted: | 14 Apr 2020 3:22 | Modified: | 21 Apr 2020 16:24 |
Reporter: | chunyang xu | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S1 (Critical) |
Version: | 8.0.19 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | bad performance, Cartesian Product temp result., hash join |
[14 Apr 2020 3:22]
chunyang xu
[14 Apr 2020 13:15]
MySQL Verification Team
Hi Mr. xu, Thank you for your bug report. I have run your test case, in several different EXPLAIN formats. It is clear to me that you are not actually correct. In the first case, we have a linear relationship T1 -> T2 -> T3. This relationship had to change, since you have added rows to T2, which has no indices. That has changed a relationship from a linear one to the hierarchical one. Hence, what you have now is T2 ->T3 and T2->T1, without any interaction between T1 and T3. Looks something like this: T2 / \ T3 T1 So, everything is fine and this is not a bug.
[17 Apr 2020 13:29]
chunyang xu
hell , if you will, you can do a lab like this . insert into t1 values('a0','name0','addr0'); insert into t1 values('a1','name1','addr1'); insert into t1 values('a2','name2','addr2'); ......... insert into t1 values('a998','name998','addr998'); insert into t1 values('a999','name999','addr999'); T1 table has 1000 records. then insert recort into t2, 50000 records. insert into t2 values('a0','name0','addr0'); insert into t2 values('a1','name1','addr1'); insert into t2 values('a2','name2','addr2'); insert into t2 values('a3','name3','addr3'); insert into t2 values('a4','name4','addr4'); .................................... ...................... insert into t2 values('a49997','name49997','addr49997'); insert into t2 values('a49998','name49998','addr49998'); insert into t2 values('a49999','name49999','addr49999'); and then for T3, total 4000 record. insert into t3 values('a0','name0','addr0'); insert into t3 values('a1','name1','addr1'); insert into t3 values('a2','name2','addr2'); insert into t3 values('a3','name3','addr3'); insert into t3 values('a4','name4','addr4'); insert into t3 values('a5','name5','addr5'); .................................. ............................ insert into t3 values('a3997','name3997','addr3997'); insert into t3 values('a3998','name3998','addr3998'); insert into t3 values('a3999','name3999','addr3999'); then test the following SQL first SQL select count(*) from t1 straight_join t2 straight_join t3 where t1_id=t2_id and t2_addr=t3_addr ; mysql> select count(*) from t1 straight_join t2 straight_join t3 where t1_id=t2_id and t2_addr=t3_addr ; +----------+ | count(*) | +----------+ | 1021 | +----------+ 1 row in set (1.57 sec) secord SQL select count(*) from t1 , t2 ,t3 where t1_id=t2_id and t2_addr=t3_addr ; must wait a long time before the result come out.
[21 Apr 2020 16:24]
MySQL Verification Team
Here is a testcase: -- drop table if exists t1,t2,t3; create table t1 ( t1_id varchar(20),t1_name varchar(20) ,t1_addr varchar(20)); create table t2 ( t2_id varchar(20),t2_name varchar(20) ,t2_addr varchar(20)); create table t3 ( t3_id varchar(20),t3_name varchar(20) ,t3_addr varchar(20)); insert into t1 values('a0','name0','addr0'),('a1','name1','addr1'),('a2','name2','addr2'); set @n:=3; insert into t1 select concat('a',@n:=@n+1),concat('name',@n),concat('addr',@n) from t1 a,t1 b,t1 c,t1 d,t1 e; insert into t1 select concat('a',@n:=@n+1),concat('name',@n),concat('addr',@n) from t1 a,t1 b,t1 c limit 9150; select count(*) from t1; insert into t2 values('a0','name0','addr0'),('a1','name1','addr1'),('a2','name2','addr2'); set @n:=3; insert into t2 select concat('a',@n:=@n+1),concat('name',@n),concat('addr',@n) from t2 a,t2 b,t2 c,t2 d,t2 e; insert into t2 select concat('a',@n:=@n+1),concat('name',@n),concat('addr',@n) from t2 a,t2 b,t2 c limit 497540; select count(*) from t2; insert into t3 values('a0','name0','addr0'),('a1','name1','addr1'),('a2','name2','addr2'); set @n:=3; insert into t3 select concat('a',@n:=@n+1),concat('name',@n),concat('addr',@n) from t3 a,t3 b,t3 c,t3 d,t3 e; insert into t3 select concat('a',@n:=@n+1),concat('name',@n),concat('addr',@n) from t3 a,t3 b,t3 c limit 37540; select count(*) from t3; analyze table t1,t2,t3; explain select count(*) from t1 straight_join t2 straight_join t3 where t1_id=t2_id and t2_addr=t3_addr ; explain analyze select count(*) from t1 straight_join t2 straight_join t3 where t1_id=t2_id and t2_addr=t3_addr ; flush status; select count(*) from t1 straight_join t2 straight_join t3 where t1_id=t2_id and t2_addr=t3_addr ; show status like 'handler%'; explain select count(*) from t1 , t2 ,t3 where t1_id=t2_id and t2_addr=t3_addr ; explain analyze select count(*) from t1 , t2 ,t3 where t1_id=t2_id and t2_addr=t3_addr ; flush status; select count(*) from t1 , t2 ,t3 where t1_id=t2_id and t2_addr=t3_addr ; show status like 'handler%'; --- I will attach the outputs shortly.
[21 Apr 2020 16:46]
MySQL Verification Team
output of testcase on 8.0.19
Attachment: o.txt (text/plain), 8.61 KiB.
[21 Apr 2020 17:23]
MySQL Verification Team
what the "perf" looks like for the explain analyze and select queries
Attachment: perf_profile_8.0.19.txt (text/plain), 18.31 KiB.