Description:
I have issues understanding the limitation of the is NULL optimization from the documentation at https://dev.mysql.com/doc/refman/5.7/en/is-null-optimization.html .
One minor point on the following part. Is it two separate indexes on the two columns, or a composite index? I'll assume composite index.
> Some examples of queries that are optimized, assuming that there is an index on columns a and b of table t2:
My main issue is with the following part at the end on limitations of the optimization:
> The optimization can handle only one IS NULL level. In the following query, MySQL uses key lookups only on the expression (t1.a=t2.a AND t2.a IS NULL) and is not able to use the key part on b:
> SELECT * FROM t1, t2
> WHERE (t1.a=t2.a AND t2.a IS NULL)
> OR (t1.b=t2.b AND t2.b IS NULL);
I don't understand how this example is relevant, specifically the `OR (t1.b=t2.b AND t2.b IS NULL)` part. From the doc of composite indexes, "MySQL cannot use the index to perform lookups if the columns do not form a leftmost prefix of the index.". I consider `OR (t1.b=t2.b AND t2.b IS NULL)` unable to make use of the index because that subclause doesn't constrain `t2.a`. I don't understand why it's used to explain a limitation of the is NULL optimization. What does "level" refer to in "The optimization can handle only one IS NULL level.".
A query that would make more sense to me for illustrating the limitation is this one:
SELECT * FROM t1, t2
WHERE (t1.a=t2.a OR t2.a IS NULL)
AND (t1.b=t2.b OR t2.b IS NULL);
I've tried so, but the `ref_or_null` join type is still reported by `EXPLAIN`. Therefore I don't understand the limitation documented. A complete example is available in the following section.
I've been searching a bit on the topic but couldn't find much except at
https://stackoverflow.com/questions/289001/does-mysql-index-null-values . But that post doesn't help much on the is NULL optimization.
Am I missing something important from the doc? If not, maybe the doc could elaborate a bit on the limitation.
How to repeat:
CREATE TABLE t1 (
id INT AUTO_INCREMENT NOT NULL,
a CHAR(30),
b CHAR(30),
PRIMARY KEY (id)
);
CREATE TABLE t2 (
id INT AUTO_INCREMENT NOT NULL,
a CHAR(30),
b CHAR(30),
PRIMARY KEY (id),
INDEX name (a,b)
);
insert into t1 (a, b) values ('Calamity', 'Jane'), ('The Kid', 'Billy'), ('James', 'Jesse'), ('Jack', NULL)
insert into t2 (a, b) values ('Calamity', 'Jane'), ('The Kid', 'Billy'), ('James', 'Jesse'), ('Jack', NULL)
explain SELECT * FROM t1, t2
WHERE (t1.a=t2.a OR t2.a IS NULL)
AND (t1.b=t2.b OR t2.b IS NULL);