Bug #100483 engine condition pushdown is not indicated in EXPLAIN
Submitted: 10 Aug 2020 11:49 Modified: 14 Aug 2020 11:07
Reporter: Justin Swanhart Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.21 OS:Any
Assigned to: CPU Architecture:Any

[10 Aug 2020 11:49] Justin Swanhart
Description:
mysql> explain select count(*) from lineorder lo join dim_date dd on d_datekey = lo_orderdatekey  where d_year = 1993 and LO_discount between 1 and 3 and LO_quantity < 25\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: dd
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2556
     filtered: 10.00
        Extra: NULL
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: lo
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 6001215
     filtered: 0.37
        Extra: Using where; Using join buffer (hash join)
2 rows in set, 1 warning (0.01 sec)

This query on my storage engine (WARP) uses ECP for both tables.  It should be noted in EXPLAIN for dd:
Using engine condition

and for lo:
Using engine condition; Using join buffer (hash join)

note there are no conditions pushed up to the storage engine for lo except the condition for the join (which is using hash join) so maybe it should say "Using where" since it evaluates the join condition in the hash join?

ECP can result in unpushed conditions.  For example for query:
select * from table where x = 1 and lower(y) = 'abc' my engine will use ECP for x = 1 but push lower(y) to the SE layer, so it should say "Using engine condition; Using where" when conditions are not pushed.

How to repeat:
Use a storage engine with ECP and execute a query that utilizes it and note that usage of ECP is not noted in the EXPLAIN plan.

Suggested fix:
Note ECP usage in EXPLAIN plan.
[10 Aug 2020 11:51] Justin Swanhart
should have said lower(y) is pushed to SQL layer not SE layer.
[10 Aug 2020 12:27] Justin Swanhart
ECP isn't indicated in EXPLAIN ANALYZE either.  Note that actual rows != estimated rows which means ECP worked.
mysql> explain analyze select sum(lo_extendedprice * lo_discount) as revenue from lineorder where lo_orderdatekey between 19930101 and 19931231 and LO_discount between 1 and 3 and LO_quantity < 25\G
*************************** 1. row ***************************
EXPLAIN: -> Aggregate: sum((lineorder.LO_ExtendedPrice * lineorder.LO_Discount))  (actual time=1129.869..1129.869 rows=1 loops=1)
    -> Table scan on lineorder  (cost=600121.50 rows=6001215) (actual time=771.840..1104.603 rows=118598 loops=1)
1 row in set (1.13 sec)

-- no ECP
mysql> explain analyze select sum(lo_extendedprice * lo_discount) as revenue from lineorder where lo_orderdatekey between 19930101 and 19931231 and LO_discount between 1 and 3 and LO_quantity < 25\G
*************************** 1. row ***************************
EXPLAIN: -> Aggregate: sum((lineorder.LO_ExtendedPrice * lineorder.LO_Discount))  (actual time=12232.671..12232.671 rows=1 loops=1)
    -> Filter: ((lineorder.LO_OrderDateKey between 19930101 and 19931231) and (lineorder.LO_Discount between 1 and 3) and (lineorder.LO_Quantity < 25))  (cost=600121.50 rows=24689) (actual time=553.962..12199.761 rows=118598 loops=1)
        -> Table scan on lineorder  (cost=600121.50 rows=6001215) (actual time=553.940..11116.409 rows=6001215 loops=1)
1 row in set (12.25 sec)
[10 Aug 2020 12:29] Justin Swanhart
Well, I guess the lack of ->filter indicates that ECP was used...
[10 Aug 2020 12:45] MySQL Verification Team
Hi Justin,

Thank you for your bug report.

Let us see if we understand you correctly. The only problem that you see is that SE condition pushed down is used, but the EXPLAIN does not show it in its output. Are we correct ????

We also have another question. As you probably know, we need to repeat the behaviour in order to verify the bug. So, it is our understanding that this report of yours serves mainly a purpose of discovering whether you have missed to call some SE plugin API functionality.

Are we correct ?????
[10 Aug 2020 13:03] Justin Swanhart
No call is missing in the SE layer.  If you use NDB I don't think it will indicate ECP is used.

Note that when InnoDB uses ICP, it notes it in EXPLAIN ANALYZE
-- with ICP
mysql> explain analyze select sum(lo_extendedprice * lo_discount) as revenue from lineorderi force index (lo_orderdatekey_2) where lo_orderdatekey between 19930101 and 19931231 and LO_discount between 1 and 3 and LO_quantity < 25\G
*************************** 1. row ***************************
EXPLAIN: -> Aggregate: sum((lineorderi.LO_ExtendedPrice * lineorderi.LO_Discount))  (actual time=595.407..595.408 rows=1 loops=1)
    -> Index range scan on lineorderi using LO_OrderDateKey_2, with index condition: ((lineorderi.LO_OrderDateKey between 19930101 and 19931231) and (lineorderi.LO_Discount between 1 and 3) and (lineorderi.LO_Quantity < 25))  (cost=831924.26 rows=1848720) (actual time=0.027..563.768 rows=118598 loops=1)
1 row in set (0.60 sec)

note that with ICP, the Index range scan includes the info:
with index condition: ((lineorderi.LO_OrderDateKey between 19930101 and 19931231) and (lineorderi.LO_Discount between 1 and 3) and (lineorderi.LO_Quantity < 25)) 

but explain analyze for a query that does ECP doesn't attach such an note to the table scan:
mysql> explain analyze select sum(lo_extendedprice * lo_discount) as revenue from lineorder where lo_orderdatekey between 19930101 and 19931231 and LO_discount between 1 and 3 and LO_quantity < 25\G
*************************** 1. row ***************************
EXPLAIN: -> Aggregate: sum((lineorder.LO_ExtendedPrice * lineorder.LO_Discount))  (actual time=443.650..443.650 rows=1 loops=1)
    -> Table scan on lineorder  (cost=600121.50 rows=6001215) (actual time=89.659..418.449 rows=118598 loops=1)
1 row in set (0.45 sec)

I would like to see
-> Table scan on lineorder, with engine condition: ((lineorderi.LO_OrderDateKey between 19930101 and 19931231) and (lineorderi.LO_Discount between 1 and 3) and (lineorderi.LO_Quantity < 25))
[10 Aug 2020 13:04] Justin Swanhart
I copy/pasted the condition information from ICP.  it should say "lineorder." not "lineorderi."  :)
[10 Aug 2020 13:07] Justin Swanhart
Note that unpushed conditions still show up as a filter, so ECP is working correctly and is being handled by the SE properly:
mysql> explain analyze select sum(lo_extendedprice * lo_discount) as revenue from lineorder where lo_orderdatekey between 19930101 and 19931231 and LO_discount between 1 and 3 and LO_quantity < 25 and lower(LO_Shipmode) = '0'\G
*************************** 1. row ***************************
EXPLAIN: -> Aggregate: sum((lineorder.LO_ExtendedPrice * lineorder.LO_Discount))  (actual time=792.526..792.526 rows=1 loops=1)
    -> Filter: (lower(lineorder.LO_ShipMode) = '0')  (cost=600121.50 rows=24689) (actual time=792.523..792.523 rows=0 loops=1)
        -> Table scan on lineorder  (cost=600121.50 rows=6001215) (actual time=230.478..760.516 rows=118598 loops=1)
1 row in set (0.80 sec)

The unpushed condition is attached to the filter, but the pushed conditions are not attached to the table scan .  I would like pushed conditions attached to table scans.
[10 Aug 2020 13:18] MySQL Verification Team
Hi Justin,

We shall need more time to analyse this behaviour properly.
[10 Aug 2020 14:52] Steinar Gunderson
Hi,

You can try the following in your handler's explain_extra():

          if (pushed_cond != nullptr) {
                  return ", with pushed condition: " + ItemToString(pushed_cond);
          }
[10 Aug 2020 15:09] MySQL Verification Team
Let us know how it goes .....
[10 Aug 2020 15:12] Justin Swanhart
Oh interesting.  I will try that.   There is a related problem to this issue though that I filed, because join order is determined before ECP can estimate cost.
https://bugs.mysql.com/bug.php?id=100485
[14 Aug 2020 11:07] Justin Swanhart
This worked.  Thank you very much for the assistance.

mysql> explain select count(*) from t1 where c1 > 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1048576
     filtered: 33.33
        Extra: Using pushed condition (`test`.`t1`.`c1` > 1)
1 row in set, 1 warning (0.02 sec)

mysql> explain analyze select count(*) from t1 where c1 > 1\G
*************************** 1. row ***************************
EXPLAIN: -> Aggregate: count(0)  (actual time=0.896..0.897 rows=1 loops=1)
    -> Table scan on t1, with pushed condition: (t1.c1 > 1)  (cost=104857.60 rows=1048576) (actual time=0.892..0.892 rows=0 loops=1)

1 row in set (0.02 sec)
[14 Aug 2020 13:19] MySQL Verification Team
Thank you, Justin, for checking out our recommendation.

Also, please keep us posted about the development and application of your storage engine.