MySQL Bugs: #8877: simple 'ref'-type join reads through all NULL values when looking for =

Bug #8877	simple 'ref'-type join reads through all NULL values when looking for =
Submitted:	2 Mar 2005 0:26	Modified:	20 Nov 2010 23:26
Reporter:	Timothy Smith	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S2 (Serious)
Version:	any	OS:	Any (any)
Assigned to:		CPU Architecture:	Any

Description:
When joining two tables, both with many NULL values in the indexed field, an = or <> condition compares all of the rows.

It seems that it should skip all but the first NULL row in the second table of the join.

How to repeat:
create table a (x int, key (x)); create table b (x int, key (x));

insert into a values (NULL);

insert into a select NULL from a;

-- Hit <Up><Enter> a few times to get 1024 rows in 'a'

insert into b select NULL from a limit 100;

flush status;

select count(*) from a, b where a.x = b.x;

show status like 'Handler_read_%';
+-----------------------+--------+
| Variable_name         | Value  |
+-----------------------+--------+
| Handler_read_first    | 1      |
| Handler_read_key      | 100    |
| Handler_read_next     | 102500 |
| Handler_read_prev     | 0      |
| Handler_read_rnd      | 0      |
| Handler_read_rnd_next | 0      |
+-----------------------+--------+

If there are several thousand rows with NULL values in both tables, this query can take a very long time to return.

By the way, I tested this with both MyISAM and InnoDB, using MySQL 4.1.10 on Linux.  The customer tested it on several versions of 4.0 as well as on 4.1.10.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/23386

The patch was rejected, will work on requested changes

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/23575

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/23637

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/23757

Fixed in 4.1.12 tree (and pushed to 5.0.4)

Notes for the docs team:
The section 7.2.1 of the manual says 
 "ref can be used for indexed columns that are compared using the = operator".
Maybe it's worth to mention that null-aware '<=>' can be used here too.

The fix for this bug does the following:
If the ref access method uses "tbl.key = othertbl.field"  MySQL is able to figure that this also implies that "othertbl.field IS NOT NULL" and adds this to table condition of table othertbl.
The added "othertbl.field IS NOT NULL" predicate can be only applied after a record has been retrieved from the table othertbl.

I am experiencing a very similar problem with:

CREATE TABLE a (
id int(10) unsigned default NULL
);

CREATE TABLE b (
id int(10) unsigned default NULL,
UNIQUE KEY id_key (id)
)

INSERT INTO a VALUES (1), (2), (3), (4), (5), (6), (NULL), (NULL), (NULL);

INSERT INTO b VALUES (1), (2), (3), (4), (5), (6), (7), (8), (NULL), (NULL), (NULL), (NULL), (NULL), ..., (NULL);

Launching this query:

SELECT * FROM a, b WHERE a.id = b.id;

will take a long long time (directly proportional to the number of a.id NULL values until I delete them, the time is then near 0s).

Launching this query:

SELECT * FROM a, b WHERE a.id = b.id WHERE a.id IS NOT NULL;

Is very fast... But in my case, I have to retrieve ALL the rows from table a, so, I use the following left join query:

SELECT * FROM a LEFT JOIN b ON a.id = b.id;

In this case, does the patch optimize correctly the query AND returns NULL values of a.id ?

Patrick Allaert

Noted in 4.1.12, 5.0.4 changelogs.

Pushed into 6.0.14-alpha (revid:alik@sun.com-20091211055901-yp18b3c7xuhl87rf) (version source revid:alik@sun.com-20091211055401-43rjwq7gjed6ds83) (merge vers: 6.0.14-alpha) (pib:13)

Noted in 6.0.14 changelog.

Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)

Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)