Bug #2416 COUNT() of a field that contains NULLs as a result of an outer join
Submitted: 15 Jan 2004 15:32 Modified: 15 Jan 2004 15:52
Reporter: Bob Van Zant Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: ISAM storage engine Severity:S2 (Serious)
Version:4.0.16 OS:Linux (Redhat Linux)
Assigned to: Dean Ellis CPU Architecture:Any

[15 Jan 2004 15:32] Bob Van Zant
Description:
When performing an outer join such that the outer join generates NULL values in specific columns, 
count()ing on those columns actually counts them instead of treating them as the NULLs that they 
are.

How to repeat:
The relevant table descriptions and data contained within:

mysql> desc TrackableURL;
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| id         | bigint(20)   |      | PRI | NULL    | auto_increment |
| url        | varchar(255) |      |     |         |                |
| campaignId | bigint(20)   |      | MUL | 0       |                |
| html       | tinyint(1)   |      |     | 0       |                |
+------------+--------------+------+-----+---------+----------------+
4 rows in set (0.00 sec)

mysql> desc EventEmailTrackUrl
    -> ;
+-------------+------------+------+-----+---------------------+----------------+
| Field       | Type       | Null | Key | Default             | Extra          |
+-------------+------------+------+-----+---------------------+----------------+
| id          | bigint(20) |      | PRI | NULL                | auto_increment |
| tstamp      | datetime   |      |     | 0000-00-00 00:00:00 |                |
| jobId       | bigint(20) |      | MUL | 0                   |                |
| campaignId  | bigint(20) | YES  | MUL | NULL                |                |
| memberId    | bigint(20) |      | MUL | 0                   |                |
| sendEventId | bigint(20) |      | MUL | 0                   |                |
| urlId       | bigint(20) |      | MUL | 0                   |                |
+-------------+------------+------+-----+---------------------+----------------+
7 rows in set (0.00 sec)

mysql> select * from TrackableURL where campaignId =2;
+----+-----------------------+------------+------+
| id | url                   | campaignId | html |
+----+-----------------------+------------+------+
|  2 | http://www.moveon.org |          2 |    0 |
|  3 | http://www.moveon.org |          2 |    1 |
+----+-----------------------+------------+------+
2 rows in set (0.00 sec)

mysql> select * from EventEmailTrackUrl e WHERE e.campaignId = 2;
+----+---------------------+-------+------------+----------+-------------+-------
+
| id | tstamp              | jobId | campaignId | memberId | sendEventId | urlId |
+----+---------------------+-------+------------+----------+-------------+-------
+
|  4 | 2004-01-15 14:17:45 |     2 |          2 |     2493 |           3 |     3 |
+----+---------------------+-------+------------+----------+-------------+-------
+
1 row in set (0.03 sec)

This is the query that matters:

mysql> SELECT t.url, count(memberId), count(distinct memberId), (count(distinct memberId) / 
3)*100 as successRate, t.id FROM TrackableURL t LEFT OUTER JOIN EventEmailTrackUrl e ON t.id = 
e.urlId WHERE t.campaignId = 2 AND t.html = 0 GROUP BY t.url;
+-----------------------+-----------------+--------------------------
+-------------+----+
| url                   | count(memberId) | count(distinct memberId) | successRate | id |
+-----------------------+-----------------+--------------------------
+-------------+----+
| http://www.moveon.org |               1 |                        0 |        0.00 |  2 |
+-----------------------+-----------------+--------------------------
+-------------+----+
1 row in set (0.00 sec)

count(memberId) should be 0, not 1.

This query shows what we're counting:
mysql> SELECT t.url, memberId  FROM TrackableURL t LEFT OUTER JOIN EventEmailTrackUrl e ON 
t.id = e.urlId WHERE t.campaignId = 2 AND t.html = 0;+-----------------------+----------+
| url                   | memberId |
+-----------------------+----------+
| http://www.moveon.org |     NULL |
+-----------------------+----------+
1 row in set (0.00 sec)

Notice, count() is clearly counting a NULL value even though it shouldn't.

Suggested fix:
Correctly implement count().
[15 Jan 2004 15:52] Dean Ellis
I cannot repeat this with 4.0.16 or 4.0.18, nor with any storage engine (you have ISAM listed, but do not have your actual CREATE TABLE statements, so I tried several).

In all cases it returns the COUNT() correctly.

If you still see this issue, then perhaps you could create a small, complete test case including the SQL to create, populate and query the tables, which demonstrates the behavior.
[15 Jan 2004 16:04] Bob Van Zant
I tried to reproduce the bug within a different database (containing only the two relevant tables) 
and was unable to. The counts were being returned as they should be. I'll keep digging. Thanks for 
the prompt response.
--Bob Van Zant