Bug #19375 ref-Index scans loops through NULL records even though later discarded
Submitted: 26 Apr 2006 15:07 Modified: 18 Sep 2006 7:52
Reporter: Jochen Riehm Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.0.23-BK, 5.0.19 OS:Linux (Linux)
Assigned to: Assigned Account CPU Architecture:Any

[26 Apr 2006 15:07] Jochen Riehm
Description:
When LEFT JOIN-ing a table to an already LEFT JOIN-ed table by a secondary key that has lots of NULL values the JOIIN-Engine seems to loop through all of the NULL-entries in the table when the first LEFT JOIN has found no entry and the reference value is thus NULL, even though the records will not be delivered to the result  since NOT(NULL=NULL).  See "How to repeat" for details. The EXPLAIN does not  show this behaviour and looks "good".

This might  have the same reason as bug 19346. I file it as another bug because the description varies significantly: here "ref" index usage, there "range", here reference with IN (...) here reference to LEFT JOINed table.

How to repeat:
Load tables test1, test2, test3 from attached dump.

Execute:

SELECT VERSION();
ANALYZE TABLE test1,test2,test3;
EXPLAIN
SELECT  
COUNT(*)
 FROM test2 a 
 LEFT JOIN test3 ap ON ( a.id = ap.auto_id AND ( ap.bis > NOW() OR ap.bis IS NULL ) )
 LEFT JOIN test1 abp 
 ON (ap.auto_id = abp.auto_id
  AND (ap.von < abp.bis OR ISNULL(abp.bis) )
  AND (ISNULL(ap.bis) OR ap.bis > abp.von))
 WHERE  a.verkaufdat IS NOT NULL AND LENGTH(a.verkaufdat) > 0  AND a.firma_id IN (3,3000) 
;
SELECT  
COUNT(*)
 FROM test2 a 
 LEFT JOIN test3 ap ON ( a.id = ap.auto_id AND ( ap.bis > NOW() OR ap.bis IS NULL ) )
 LEFT JOIN test1 abp 
 ON (ap.auto_id = abp.auto_id
  AND (ap.von < abp.bis OR ISNULL(abp.bis) )
  AND (ISNULL(ap.bis) OR ap.bis > abp.von))
 WHERE  a.verkaufdat IS NOT NULL AND LENGTH(a.verkaufdat) > 0  AND a.firma_id IN (3,3000) 

with result:

+------------------------+---------+----------+-----------------------------+
| Table                  | Op      | Msg_type | Msg_text                    |
+------------------------+---------+----------+-----------------------------+
| dbcarsharing_tmp.test1 | analyze | status   | Table is already up to date |
| dbcarsharing_tmp.test2 | analyze | status   | Table is already up to date |
| dbcarsharing_tmp.test3 | analyze | status   | OK                          |
+------------------------+---------+----------+-----------------------------+
3 rows in set (0.04 sec)

+----+-------------+-------+-------+--------------------------------+-------------+---------+-----------------------------+------+-------------+
| id | select_type | table | type  | possible_keys                  | key         | key_len | ref                         | rows | Extra  |
+----+-------------+-------+-------+--------------------------------+-------------+---------+-----------------------------+------+-------------+
|  1 | SIMPLE      | a     | range | altautofind,firma_id           | altautofind | 4       | NULL                        | 1440 | Using where |
|  1 | SIMPLE      | ap    | ref   | auto_id,doppelt_aber_gebraucht | auto_id     | 4       | dbcarsharing_tmp.a.id       |    2 |  |
|  1 | SIMPLE      | abp   | ref   | auto_id                        | auto_id     | 5       | dbcarsharing_tmp.ap.auto_id |    1 |  |
+----+-------------+-------+-------+--------------------------------+-------------+---------+-----------------------------+------+-------------+
3 rows in set (0.02 sec)

+----------+
| COUNT(*) |
+----------+
|      793 |
+----------+
1 row in set (7.53 sec)

Slow log shows query time and examined rows not  consistent with explain.

Suggested fix:
If JOIN condition is "=" and not "<=>" do not perform index lookups for NULL values.
[26 Apr 2006 15:08] Jochen Riehm
FYI: the value of myisam_stats_method has no effect on described behaviour. Just EXPLAIN looks a little different.
[26 Apr 2006 15:16] Jochen Riehm
To verify that this is actually a problem with NULL-values do the following:

ALTER TABLE `test1` CHANGE `auto_id` `auto_id` INT( 11 ) NOT NULL DEFAULT '0'

and run statements again. Result:

+---------------------+
| VERSION()           |
+---------------------+
| 5.0.19-standard-log |
+---------------------+
1 row in set (0.00 sec)

+------------------------+---------+----------+-----------------------------+
| Table                  | Op      | Msg_type | Msg_text                    |
+------------------------+---------+----------+-----------------------------+
| dbcarsharing_tmp.test1 | analyze | status   | Table is already up to date |
| dbcarsharing_tmp.test2 | analyze | status   | Table is already up to date |
| dbcarsharing_tmp.test3 | analyze | status   | Table is already up to date |
+------------------------+---------+----------+-----------------------------+
3 rows in set (0.00 sec)

+----+-------------+-------+-------+--------------------------------+-------------+---------+-----------------------------+------+-------------+
| id | select_type | table | type  | possible_keys                  | key         | key_len | ref                         | rows | Extra  |
+----+-------------+-------+-------+--------------------------------+-------------+---------+-----------------------------+------+-------------+
|  1 | SIMPLE      | a     | range | altautofind,firma_id           | altautofind | 4       | NULL                        | 1440 | Using where |
|  1 | SIMPLE      | ap    | ref   | auto_id,doppelt_aber_gebraucht | auto_id     | 4       | dbcarsharing_tmp.a.id       |    2 |  |
|  1 | SIMPLE      | abp   | ref   | auto_id                        | auto_id     | 4       | dbcarsharing_tmp.ap.auto_id |    3 |  |
+----+-------------+-------+-------+--------------------------------+-------------+---------+-----------------------------+------+-------------+
3 rows in set (0.00 sec)

+----------+
| COUNT(*) |
+----------+
|      793 |
+----------+
1 row in set (0.03 sec)
[11 Jun 2006 14:36] Valeriy Kravchuk
Thank you for a bug report. Verified just as described, with your test data and statements, on 5.0.23-BK. Query worked for 11+ seconds in my case. In slow log I've got:

# Time: 060611 13:09:00
# User@Host: root[root] @ localhost []
# Query_time: 11  Lock_time: 0  Rows_sent: 1  Rows_examined: 2594467
SET timestamp=1150020540;
SELECT
COUNT(*)
 FROM test2 a
 LEFT JOIN test3 ap ON ( a.id = ap.auto_id AND ( ap.bis > NOW() OR ap.bis IS
NULL ) )
 LEFT JOIN test1 abp
 ON (ap.auto_id = abp.auto_id
  AND (ap.von < abp.bis OR ISNULL(abp.bis) )
  AND (ISNULL(ap.bis) OR ap.bis > abp.von))
 WHERE  a.verkaufdat IS NOT NULL AND LENGTH(a.verkaufdat) > 0  AND a.firma_id IN
(3,3000);

Note rows examined! Same results for both 
myisam_stats_method=nulls_equal and myisam_stats_method=nulls_unequal.
[14 Sep 2006 18:16] Sergey Petrunya
This problem was fixed by fix for BUG#21390, in the same way BUG#19649 (see BIG#1649, comment dated 14 Sep 19:39)
[18 Sep 2006 7:52] Jochen Riehm
As I do not have access to duplicate bug 21390, could you please copy in here the MySQL version in which this bug has been fixed?
[18 Sep 2006 9:12] Valeriy Kravchuk
Bug #21390 is fixed in 5.0.25 and 5.1.12.