Bug #120187 Bug: COUNT(*) vs SUM(CASE WHEN ...) returns inconsistent results with NULL-sensitive joins
Submitted: 31 Mar 11:46
Reporter: Jasper Andrew Email Updates:
Status: Open Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:8.0.43 (MySQL Community Server - GPL) OS:Any (Docker container)
Assigned to: CPU Architecture:Any (x86_64)
Tags: join, Logic bug, null

[31 Mar 11:46] Jasper Andrew
Description:
When executing the following two logically equivalent queries in the same database environment, the results are inconsistent:

SQL-SOURCE uses COUNT(*) aggregation and returns 0.
SQL-TARGET uses SUM(CASE WHEN EXISTS(...) THEN 1 ELSE 0 END) and returns 5.

Both queries share the same main logic:

They involve users, orders, and posts tables with LEFT JOIN and INNER JOIN.
Both contain an EXISTS subquery checking conditions on orders and posts.
SQL-TARGET wraps the EXISTS subquery in a CASE WHEN statement, while SQL-SOURCE directly uses COUNT(*).

Observed Behavior:

SQL-SOURCE’s COUNT(*) does not count any matching rows, resulting in 0.
SQL-TARGET’s SUM(CASE WHEN EXISTS...) counts 5 rows where the EXISTS condition evaluates to true, resulting in 5.
Logically, both queries should return the same row count. The discrepancy suggests that the query optimizer may handle EXISTS or JOIN conditions differently, leading to inconsistent results.

How to repeat:
```SQL
-- SCHEMA
CREATE TABLE users (
    id           INT,
    username     VARCHAR(100),
    email        VARCHAR(255),
    age          INT,
    status       VARCHAR(20),
    created_at   TIMESTAMP NULL,
    score        DOUBLE
);

CREATE TABLE posts (
    id          INT,
    user_id     INT,
    title       VARCHAR(255),
    content     VARCHAR(1000),
    views       INT,
    likes       INT,
    created_at  TIMESTAMP NULL,
    rating      DOUBLE
);

CREATE TABLE orders (
    id          INT,
    user_id     INT,
    amount      DOUBLE,
    status      VARCHAR(20),
    created_at  TIMESTAMP NULL
);

INSERT INTO users VALUES
(1, 'alice', 'alice@test.com', 20, 'active',  '2022-01-01 10:00:00', 88.5),
(2, 'bob',   'bob@test.com',   30, 'active',  '2022-01-02 11:00:00', 92.3),
(3, 'carol', NULL,             NULL, 'banned','2022-01-03 12:00:00', NULL),
(4, 'dave',  'dave@test.com',  45, 'active',  '2022-01-04 13:00:00', 65.2),
(5, NULL,    'null@test.com',  18, 'inactive','2022-01-05 14:00:00', 70.0);

INSERT INTO posts VALUES
(1, 1, 'Hello World', 'First post', 100, 10, '2022-01-10 10:00:00', 4.5),
(2, 1, 'Another Post', NULL,        150, 20, '2022-01-11 11:00:00', 3.0),
(3, 2, 'Bob Post',     'Content',   NULL,  5, '2022-01-12 12:00:00', NULL),
(4, 3, NULL,           'Empty',     50,   2, '2022-01-13 13:00:00', 5.0),
(5, 4, 'Last Post',    'Last',      300,  30,'2022-01-14 14:00:00', 4.9);

INSERT INTO orders VALUES
(1, 1, 100.00, 'paid',    '2022-02-01 09:00:00'),
(2, 1, 200.50, 'shipped', '2022-02-02 10:00:00'),
(3, 2, NULL,   'failed',  '2022-02-03 11:00:00'),
(4, 3, 50.00,  'paid',    '2022-02-04 12:00:00'),
(5, 5, 999.99, 'paid',    '2022-02-05 13:00:00');

-- TRIGGER SQL:
SELECT COUNT(*)
FROM (
    SELECT 
        ref_0.email AS c0
    FROM 
        users AS ref_0
    WHERE 
        TRUE 
) AS subq_0
LEFT JOIN (
    SELECT  
        ref_1.score AS c0, 
        ref_1.age AS c1
    FROM 
        users AS ref_1
    WHERE 
          TRUE
) AS subq_1
INNER JOIN orders AS ref_2
    ON EXISTS (
        SELECT 1
        FROM users AS ref_5
        WHERE TRUE
    )
ON subq_0.c0 = ref_2.status
WHERE EXISTS (
    SELECT 1
    FROM orders AS ref_6
    RIGHT JOIN orders AS ref_7
        LEFT JOIN posts AS ref_8
            ON ('3oq' < '6668ey')
        LEFT JOIN orders AS ref_9
            ON (23.69 < (SELECT STDDEV_SAMP(id) FROM orders))
    ON (ref_6.status IS NULL)
    WHERE TRUE
);

-- result: {0}

SELECT SUM(
    CASE 
        WHEN EXISTS (
            SELECT 1
            FROM orders AS ref_6
            RIGHT JOIN orders AS ref_7
                LEFT JOIN posts AS ref_8
                    ON ('3oq' < '6668ey')
                LEFT JOIN orders AS ref_9
                    ON (23.69 < (SELECT STDDEV_SAMP(id) FROM orders))
            ON (ref_6.status IS NULL)
            WHERE TRUE
        ) 
        THEN 1 
        ELSE 0 
    END + 0
)
FROM (
    SELECT ref_0.email AS c0
    FROM users AS ref_0
    WHERE TRUE
) AS subq_0
LEFT JOIN (
    SELECT ref_1.score AS c0, 
           ref_1.age AS c1
    FROM users AS ref_1
    WHERE TRUE
) AS subq_1
INNER JOIN orders AS ref_2
    ON EXISTS (
        SELECT 1
        FROM users AS ref_5
        WHERE TRUE
    )
ON subq_0.c0 = ref_2.status;

-- result: {5}
```