Bug #113570 composite index with the indexed expression "is not null" not being used
Submitted: 5 Jan 2024 23:35 Modified: 8 Jan 2024 11:26
Reporter: Felix Livni Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:8.0.33 OS:Ubuntu
Assigned to: CPU Architecture:Any
Tags: composite-index, INDEX, indexed-expression

[5 Jan 2024 23:35] Felix Livni
Description:
I created a composite index where one of the parts is the expression "column is not null". 

e.g. INDEX(column1, ("column2 is not null"))

analyze SELECT column1, column2 from table where column1 = '1' and column2 is not null.

The analyze shows that the second part of the index is not being used.

If I create the index in the other order, i.e.
e.g. INDEX(("column2 is not null"), column1)

Then analyze shows that the index is not used at all.

There is a discussion here:
https://stackoverflow.com/questions/77744482/mysql-8-composite-index-with-the-indexed-expr...

I have a dbfiddle that you might find helpful; and if it sticks around long enough:
https://www.db-fiddle.com/f/xs4isS79x1pBQg8Noc82ZZ/1

How to repeat:
```
CREATE TABLE appointments 
(
  id INT AUTO_INCREMENT PRIMARY KEY,
  user_id INT NOT NULL,
  recurrence_rule VARCHAR(255)
);

CREATE INDEX idx1
ON appointments ((recurrence_rule IS NOT NULL), user_id);

explain analyze select * from appointments where user_id = 7 and recurrence_rule is not null;
```

-> Filter: ((appointments.user_id = 7) and (appointments.recurrence_rule is not null)) (cost=150.25 rows=135) (actual time=0.356..0.625 rows=4 loops=1) -> Table scan on appointments (cost=150.25 rows=1500) (actual time=0.143..0.528 rows=1500 loops=1)

You can generate sample data using this:

```
CREATE PROCEDURE GenerateData()
BEGIN
    -- Declare variables
    DECLARE i INT DEFAULT 1; -- Loop counter
    DECLARE user_id INT;
    DECLARE recurrence_rule VARCHAR(255);
    
    -- every 25 appointments is a recurring appointment
    -- each user gets 100 appointments
    WHILE i <= 1500 DO
        SET user_id = CEIL(i / 100.0); -- Calculate user_id, ensure it's rounded up
        
        -- Check if the appointment is recurring
        IF i % 25 = 0 THEN
            SET recurrence_rule = CONCAT('FREQ=', ELT(FLOOR(1 + RAND() * 4), 'DAILY', 'WEEKLY', 'MONTHLY', 'YEARLY'), ';INTERVAL=', FLOOR(1 + RAND() * 10));
        ELSE
            -- Set recurrence_rule to NULL for non-recurring appointments
            SET recurrence_rule = NULL;
        END IF;
        
        -- Insert data into appointments table
        INSERT INTO appointments (user_id, recurrence_rule)
        VALUES (user_id, recurrence_rule);
        
        -- Increment the loop counter
        SET i = i + 1;
    END WHILE;
END
```

Suggested fix:
Using the sample data, I should see analyze use the full index and scan only 4 records.
[7 Jan 2024 20:01] Knut Anders Hatlen
I believe this is a documented limitation.

https://dev.mysql.com/doc/refman/8.0/en/create-index.html#create-index-functional-key-part... says:

> Functional indexes are implemented as hidden virtual generated columns, which has these implications:
> (...)
> Functional key parts inherit all restrictions that apply to generated columns.

https://dev.mysql.com/doc/refman/8.0/en/generated-column-index-optimizations.html says:

> MySQL supports indexes on generated columns
>(...)
> The optimization applies to these operators: =, <, <=, >, >=, BETWEEN, and IN().

The "column2 is not null" predicate in the query does not use any of those operators, so the optimization does not apply.

A workaround is to add "= true" to the predicate to transform it to one where the optimization does apply:

mysql> explain analyze select * from appointments where user_id = 7 and recurrence_rule is not
null = true \G
*************************** 1. row ***************************
EXPLAIN: -> Index lookup on appointments using idx1 ((recurrence_rule is not null)=true, user_id=7)  (cost=0.35 rows=1) (actual time=0.0563..0.0563 rows=0 loops=1)

1 row in set (0.00 sec)
[8 Jan 2024 11:26] MySQL Verification Team
Hi Mr. Livni,

Thank you for your bug report.

We managed to repeat the behaviour that you reported.

However, we have to agree with our colleague that this behaviour is intended and fully documented here:

https://dev.mysql.com/doc/refman/8.0/en/create-index.html#create-index-functional-key-part...

https://dev.mysql.com/doc/refman/8.0/en/generated-column-index-optimizations.html

Not a bug.