Description:
When performing an outer join such that the outer join generates NULL values in specific columns,
count()ing on those columns actually counts them instead of treating them as the NULLs that they
are.
How to repeat:
The relevant table descriptions and data contained within:
mysql> desc TrackableURL;
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| id | bigint(20) | | PRI | NULL | auto_increment |
| url | varchar(255) | | | | |
| campaignId | bigint(20) | | MUL | 0 | |
| html | tinyint(1) | | | 0 | |
+------------+--------------+------+-----+---------+----------------+
4 rows in set (0.00 sec)
mysql> desc EventEmailTrackUrl
-> ;
+-------------+------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------+------+-----+---------------------+----------------+
| id | bigint(20) | | PRI | NULL | auto_increment |
| tstamp | datetime | | | 0000-00-00 00:00:00 | |
| jobId | bigint(20) | | MUL | 0 | |
| campaignId | bigint(20) | YES | MUL | NULL | |
| memberId | bigint(20) | | MUL | 0 | |
| sendEventId | bigint(20) | | MUL | 0 | |
| urlId | bigint(20) | | MUL | 0 | |
+-------------+------------+------+-----+---------------------+----------------+
7 rows in set (0.00 sec)
mysql> select * from TrackableURL where campaignId =2;
+----+-----------------------+------------+------+
| id | url | campaignId | html |
+----+-----------------------+------------+------+
| 2 | http://www.moveon.org | 2 | 0 |
| 3 | http://www.moveon.org | 2 | 1 |
+----+-----------------------+------------+------+
2 rows in set (0.00 sec)
mysql> select * from EventEmailTrackUrl e WHERE e.campaignId = 2;
+----+---------------------+-------+------------+----------+-------------+-------
+
| id | tstamp | jobId | campaignId | memberId | sendEventId | urlId |
+----+---------------------+-------+------------+----------+-------------+-------
+
| 4 | 2004-01-15 14:17:45 | 2 | 2 | 2493 | 3 | 3 |
+----+---------------------+-------+------------+----------+-------------+-------
+
1 row in set (0.03 sec)
This is the query that matters:
mysql> SELECT t.url, count(memberId), count(distinct memberId), (count(distinct memberId) /
3)*100 as successRate, t.id FROM TrackableURL t LEFT OUTER JOIN EventEmailTrackUrl e ON t.id =
e.urlId WHERE t.campaignId = 2 AND t.html = 0 GROUP BY t.url;
+-----------------------+-----------------+--------------------------
+-------------+----+
| url | count(memberId) | count(distinct memberId) | successRate | id |
+-----------------------+-----------------+--------------------------
+-------------+----+
| http://www.moveon.org | 1 | 0 | 0.00 | 2 |
+-----------------------+-----------------+--------------------------
+-------------+----+
1 row in set (0.00 sec)
count(memberId) should be 0, not 1.
This query shows what we're counting:
mysql> SELECT t.url, memberId FROM TrackableURL t LEFT OUTER JOIN EventEmailTrackUrl e ON
t.id = e.urlId WHERE t.campaignId = 2 AND t.html = 0;+-----------------------+----------+
| url | memberId |
+-----------------------+----------+
| http://www.moveon.org | NULL |
+-----------------------+----------+
1 row in set (0.00 sec)
Notice, count() is clearly counting a NULL value even though it shouldn't.
Suggested fix:
Correctly implement count().