Description:
I have an SQL statement. When in and = are combined with a subquery, the result sets returned by them are inconsistent.
1、SQL
SELECT
al.object_id3 AS product_id,
al.object_id1 AS domain_id,
DATE(al.created_dt) AS view_date,
COUNT(al.object_id3) AS count
FROM activitylog AS al
LEFT JOIN domaincompanymap AS dcm
ON dcm.company_id = al.login_company_id
AND dcm.domain_id = al.login_domain_id
WHERE al.activitylogkey_id = (select id from activitylogkey where name = 'PRODUCT_VIEW')
AND dcm.account_type_lookup_id = 201
AND al.created_dt >= '2024-03-17'
AND al.object_id3 IS NOT NULL
GROUP BY al.object_id3, view_date;
2、use = before the subquery
SELECT
al.object_id3 AS product_id,
al.object_id1 AS domain_id,
DATE(al.created_dt) AS view_date,
COUNT(al.object_id3) AS count
FROM activitylog AS al
LEFT JOIN domaincompanymap AS dcm
ON dcm.company_id = al.login_company_id
AND dcm.domain_id = al.login_domain_id
WHERE al.activitylogkey_id = (select id from activitylogkey where name = 'PRODUCT_VIEW')
AND dcm.account_type_lookup_id = 201
AND al.created_dt >= '2024-03-17'
AND al.object_id3 IS NOT NULL
GROUP BY al.object_id3, view_date;
The result set I got is:
| 708911 | 933 | 2024-03-21 | 1 |
| 683087 | 933 | 2024-03-21 | 2 |
| 770104 | 933 | 2024-03-21 | 1 |
+------------+-----------+------------+-------+
148 rows in set (1.35 sec)
explain analyze
-> Table scan on <temporary> (actual time=0.002..0.020 rows=148 loops=1)
-> Aggregate using temporary table (actual time=178.123..178.151 rows=148 loops=1)
-> Nested loop inner join (cost=41739.14 rows=788) (actual time=0.370..177.736 rows=187 loops=1)
-> Filter: ((al.activitylogkey_id = (select #2)) and (al.object_id3 is not null)) (cost=38982.75 rows=7875) (actual time=0.070..148.740 rows=11812 loops=1)
-> Index range scan on al using activitylog_createddt_nix over ('2024-03-17 00:00:00' <= created_dt), with index condition: (al.created_dt >= TIMESTAMP'2024-03-17 00:00:00') (cost=38982.75 rows=86608) (actual time=0.024..142.879 rows=42323 loops=1)
-> Select #2 (subquery in condition; run only once)
-> Filter: (activitylogkey.`name` = 'PRODUCT_VIEW') (cost=0.35 rows=1) (actual time=0.024..0.026 rows=1 loops=1)
-> Index lookup on activitylogkey using activitylogkey_name_nix (name='PRODUCT_VIEW') (cost=0.35 rows=1) (actual time=0.022..0.024 rows=1 loops=1)
-> Filter: (dcm.account_type_lookup_id = 201) (cost=0.25 rows=0) (actual time=0.002..0.002 rows=0 loops=11812)
-> Single-row index lookup on dcm using domaincompanymap_domainid_companyid_uix (domain_id=al.login_domain_id, company_id=al.login_company_id), with index condition: (dcm.domain_id = al.login_domain_id) (cost=0.25 rows=1) (actual time=0.002..0.002 rows=0 loops=11812)
|
3、use in before the subquery
SELECT
al.object_id3 AS product_id,
al.object_id1 AS domain_id,
DATE(al.created_dt) AS view_date,
COUNT(al.object_id3) AS count
FROM activitylog AS al
LEFT JOIN domaincompanymap AS dcm
ON dcm.company_id = al.login_company_id
AND dcm.domain_id = al.login_domain_id
WHERE al.activitylogkey_id in (select id from activitylogkey where name = 'PRODUCT_VIEW')
AND dcm.account_type_lookup_id = 201
AND al.created_dt >= '2024-03-17'
AND al.object_id3 IS NOT NULL
GROUP BY al.object_id3, view_date;
The result set I got is:
| 748985 | 934 | 2024-03-20 | 1 |
| 748984 | 934 | 2024-03-20 | 1 |
| 709166 | 933 | 2024-03-20 | 1 |
+------------+-----------+------------+-------+
148 rows in set (10.54 sec)
explain analyze
| -> Table scan on <temporary> (actual time=0.002..0.020 rows=148 loops=1)
-> Aggregate using temporary table (actual time=9574.275..9574.303 rows=148 loops=1)
-> Nested loop inner join (cost=688.90 rows=0) (actual time=5912.164..9573.831 rows=187 loops=1)
-> Nested loop inner join (cost=687.75 rows=3) (actual time=5744.669..9544.705 rows=11812 loops=1)
-> Filter: (activitylogkey.`name` = 'PRODUCT_VIEW') (cost=0.35 rows=1) (actual time=0.023..0.029 rows=1 loops=1)
-> Index lookup on activitylogkey using activitylogkey_name_nix (name='PRODUCT_VIEW') (cost=0.35 rows=1) (actual time=0.021..0.027 rows=1 loops=1)
-> Filter: ((al.created_dt >= TIMESTAMP'2024-03-17 00:00:00') and (al.object_id3 is not null)) (cost=491.38 rows=3) (actual time=5744.645..9543.856 rows=11812 loops=1)
-> Index lookup on al using activitylog_typelogincompanyloginuser_nix (activitylogkey_id=activitylogkey.id), with index condition: (al.activitylogkey_id = activitylogkey.id) (cost=491.38 rows=1963) (actual time=0.340..8495.810 rows=2617714 loops=1)
-> Filter: (dcm.account_type_lookup_id = 201) (cost=0.25 rows=0) (actual time=0.002..0.002 rows=0 loops=11812)
-> Single-row index lookup on dcm using domaincompanymap_domainid_companyid_uix (domain_id=al.login_domain_id, company_id=al.login_company_id), with index condition: (dcm.domain_id = al.login_domain_id) (cost=0.25 rows=1) (actual time=0.002..0.002 rows=0 loops=11812)
|
4、subquery only 1 row
mysql> select id from activitylogkey where name = 'PRODUCT_VIEW';
+----+
| id |
+----+
| 84 |
+----+
1 row in set (0.22 sec)
5、CTE rewrite
with q1 as
(SELECT al.object_id3 AS product_id,
al.object_id1 AS domain_id,
DATE(al.created_dt) AS view_date,
COUNT(al.object_id3) AS coun,
al.activitylogkey_id as activitylogkey_id
FROM activitylog AS al
LEFT JOIN domaincompanymap AS dcm
ON dcm.company_id = al.login_company_id
AND dcm.domain_id = al.login_domain_id
WHERE
dcm.account_type_lookup_id = 201
AND al.created_dt >= '2024-01-17'
AND al.object_id3 IS NOT NULL
GROUP BY al.object_id3, view_date),
q2 as (select id from activitylogkey where name = 'PRODUCT_VIEW')
select q1.product_id,q1.domain_id,q1.coun from q1 where q1.activitylogkey_id in (select id from q2);
use in
| 708911 | 933 | 1 |
| 683087 | 933 | 2 |
| 770104 | 933 | 1 |
+------------+-----------+------+
148 rows in set (0.41 sec)
use =
| 708911 | 933 | 1 |
| 683087 | 933 | 2 |
| 770104 | 933 | 1 |
+------------+-----------+------+
148 rows in set (0.40 sec)
How to repeat:
There is no way for the time being.
Suggested fix:
There is no way for the time being.
Description: I have an SQL statement. When in and = are combined with a subquery, the result sets returned by them are inconsistent. 1、SQL SELECT al.object_id3 AS product_id, al.object_id1 AS domain_id, DATE(al.created_dt) AS view_date, COUNT(al.object_id3) AS count FROM activitylog AS al LEFT JOIN domaincompanymap AS dcm ON dcm.company_id = al.login_company_id AND dcm.domain_id = al.login_domain_id WHERE al.activitylogkey_id = (select id from activitylogkey where name = 'PRODUCT_VIEW') AND dcm.account_type_lookup_id = 201 AND al.created_dt >= '2024-03-17' AND al.object_id3 IS NOT NULL GROUP BY al.object_id3, view_date; 2、use = before the subquery SELECT al.object_id3 AS product_id, al.object_id1 AS domain_id, DATE(al.created_dt) AS view_date, COUNT(al.object_id3) AS count FROM activitylog AS al LEFT JOIN domaincompanymap AS dcm ON dcm.company_id = al.login_company_id AND dcm.domain_id = al.login_domain_id WHERE al.activitylogkey_id = (select id from activitylogkey where name = 'PRODUCT_VIEW') AND dcm.account_type_lookup_id = 201 AND al.created_dt >= '2024-03-17' AND al.object_id3 IS NOT NULL GROUP BY al.object_id3, view_date; The result set I got is: | 708911 | 933 | 2024-03-21 | 1 | | 683087 | 933 | 2024-03-21 | 2 | | 770104 | 933 | 2024-03-21 | 1 | +------------+-----------+------------+-------+ 148 rows in set (1.35 sec) explain analyze -> Table scan on <temporary> (actual time=0.002..0.020 rows=148 loops=1) -> Aggregate using temporary table (actual time=178.123..178.151 rows=148 loops=1) -> Nested loop inner join (cost=41739.14 rows=788) (actual time=0.370..177.736 rows=187 loops=1) -> Filter: ((al.activitylogkey_id = (select #2)) and (al.object_id3 is not null)) (cost=38982.75 rows=7875) (actual time=0.070..148.740 rows=11812 loops=1) -> Index range scan on al using activitylog_createddt_nix over ('2024-03-17 00:00:00' <= created_dt), with index condition: (al.created_dt >= TIMESTAMP'2024-03-17 00:00:00') (cost=38982.75 rows=86608) (actual time=0.024..142.879 rows=42323 loops=1) -> Select #2 (subquery in condition; run only once) -> Filter: (activitylogkey.`name` = 'PRODUCT_VIEW') (cost=0.35 rows=1) (actual time=0.024..0.026 rows=1 loops=1) -> Index lookup on activitylogkey using activitylogkey_name_nix (name='PRODUCT_VIEW') (cost=0.35 rows=1) (actual time=0.022..0.024 rows=1 loops=1) -> Filter: (dcm.account_type_lookup_id = 201) (cost=0.25 rows=0) (actual time=0.002..0.002 rows=0 loops=11812) -> Single-row index lookup on dcm using domaincompanymap_domainid_companyid_uix (domain_id=al.login_domain_id, company_id=al.login_company_id), with index condition: (dcm.domain_id = al.login_domain_id) (cost=0.25 rows=1) (actual time=0.002..0.002 rows=0 loops=11812) | 3、use in before the subquery SELECT al.object_id3 AS product_id, al.object_id1 AS domain_id, DATE(al.created_dt) AS view_date, COUNT(al.object_id3) AS count FROM activitylog AS al LEFT JOIN domaincompanymap AS dcm ON dcm.company_id = al.login_company_id AND dcm.domain_id = al.login_domain_id WHERE al.activitylogkey_id in (select id from activitylogkey where name = 'PRODUCT_VIEW') AND dcm.account_type_lookup_id = 201 AND al.created_dt >= '2024-03-17' AND al.object_id3 IS NOT NULL GROUP BY al.object_id3, view_date; The result set I got is: | 748985 | 934 | 2024-03-20 | 1 | | 748984 | 934 | 2024-03-20 | 1 | | 709166 | 933 | 2024-03-20 | 1 | +------------+-----------+------------+-------+ 148 rows in set (10.54 sec) explain analyze | -> Table scan on <temporary> (actual time=0.002..0.020 rows=148 loops=1) -> Aggregate using temporary table (actual time=9574.275..9574.303 rows=148 loops=1) -> Nested loop inner join (cost=688.90 rows=0) (actual time=5912.164..9573.831 rows=187 loops=1) -> Nested loop inner join (cost=687.75 rows=3) (actual time=5744.669..9544.705 rows=11812 loops=1) -> Filter: (activitylogkey.`name` = 'PRODUCT_VIEW') (cost=0.35 rows=1) (actual time=0.023..0.029 rows=1 loops=1) -> Index lookup on activitylogkey using activitylogkey_name_nix (name='PRODUCT_VIEW') (cost=0.35 rows=1) (actual time=0.021..0.027 rows=1 loops=1) -> Filter: ((al.created_dt >= TIMESTAMP'2024-03-17 00:00:00') and (al.object_id3 is not null)) (cost=491.38 rows=3) (actual time=5744.645..9543.856 rows=11812 loops=1) -> Index lookup on al using activitylog_typelogincompanyloginuser_nix (activitylogkey_id=activitylogkey.id), with index condition: (al.activitylogkey_id = activitylogkey.id) (cost=491.38 rows=1963) (actual time=0.340..8495.810 rows=2617714 loops=1) -> Filter: (dcm.account_type_lookup_id = 201) (cost=0.25 rows=0) (actual time=0.002..0.002 rows=0 loops=11812) -> Single-row index lookup on dcm using domaincompanymap_domainid_companyid_uix (domain_id=al.login_domain_id, company_id=al.login_company_id), with index condition: (dcm.domain_id = al.login_domain_id) (cost=0.25 rows=1) (actual time=0.002..0.002 rows=0 loops=11812) | 4、subquery only 1 row mysql> select id from activitylogkey where name = 'PRODUCT_VIEW'; +----+ | id | +----+ | 84 | +----+ 1 row in set (0.22 sec) 5、CTE rewrite with q1 as (SELECT al.object_id3 AS product_id, al.object_id1 AS domain_id, DATE(al.created_dt) AS view_date, COUNT(al.object_id3) AS coun, al.activitylogkey_id as activitylogkey_id FROM activitylog AS al LEFT JOIN domaincompanymap AS dcm ON dcm.company_id = al.login_company_id AND dcm.domain_id = al.login_domain_id WHERE dcm.account_type_lookup_id = 201 AND al.created_dt >= '2024-01-17' AND al.object_id3 IS NOT NULL GROUP BY al.object_id3, view_date), q2 as (select id from activitylogkey where name = 'PRODUCT_VIEW') select q1.product_id,q1.domain_id,q1.coun from q1 where q1.activitylogkey_id in (select id from q2); use in | 708911 | 933 | 1 | | 683087 | 933 | 2 | | 770104 | 933 | 1 | +------------+-----------+------+ 148 rows in set (0.41 sec) use = | 708911 | 933 | 1 | | 683087 | 933 | 2 | | 770104 | 933 | 1 | +------------+-----------+------+ 148 rows in set (0.40 sec) How to repeat: There is no way for the time being. Suggested fix: There is no way for the time being.