Description:
This query computes a COUNT(*) over a cross join between orders and a LATERAL subquery derived from a Cartesian product of orders and posts, filtered by a scalar value taken from comments.
Logically, the scalar subquery:
SELECT id FROM comments LIMIT 1 OFFSET 3
returns the value 4 based on the provided dataset.
Therefore the inner filter becomes:
ref_1.user_id != 4
Since none of the orders rows have user_id = 4, all 5 rows in orders are retained.
The inner FROM clause forms a Cartesian product between orders and posts, producing 5 × 5 = 25 rows. However, the CASE expression:
CASE WHEN ref_0.id <= ref_0.id THEN ref_2.content ELSE ref_2.title END
is always true, so it always evaluates to ref_2.content.
Applying DISTINCT over posts.content yields the set:
"First post", NULL, "Content", "Empty", "Last"
After removing NULL, 4 distinct values remain.
Under correct LATERAL semantics (as implemented by PostgreSQL and DuckDB), the subquery is executed once per row of orders (5 rows), producing 5 × 4 = 20 rows in total. Therefore the expected result is COUNT(*) = 20.
However, MySQL returns COUNT(*) = 25. This indicates that MySQL likely performs an incorrect optimization or rewrite of the LATERAL subquery when combined with DISTINCT and a constant-foldable CASE expression. The optimizer appears to partially decorrelate or reorder joins in a way that violates LATERAL semantics, leading to an inflated row count.
Expected result: 20
Actual result (MySQL): 25
How to repeat:
```SQL
-- Schema
CREATE TABLE users (
id INT,
username VARCHAR(100),
email VARCHAR(255),
age INT,
status VARCHAR(20),
created_at TIMESTAMP NULL,
score DOUBLE
);
CREATE TABLE posts (
id INT,
user_id INT,
title VARCHAR(255),
content VARCHAR(1000),
views INT,
likes INT,
created_at TIMESTAMP NULL,
rating DOUBLE
);
CREATE TABLE comments (
id INT,
post_id INT,
user_id INT,
content VARCHAR(1000),
is_spam INT,
created_at TIMESTAMP NULL
);
CREATE TABLE orders (
id INT,
user_id INT,
amount DOUBLE,
status VARCHAR(20),
created_at TIMESTAMP NULL
);
INSERT INTO users VALUES
(1, 'alice', 'alice@test.com', 20, 'active', '2022-01-01 10:00:00', 88.5),
(2, 'bob', 'bob@test.com', 30, 'active', '2022-01-02 11:00:00', 92.3),
(3, 'carol', NULL, NULL, 'banned','2022-01-03 12:00:00', NULL),
(4, 'dave', 'dave@test.com', 45, 'active', '2022-01-04 13:00:00', 65.2),
(5, NULL, 'null@test.com', 18, 'inactive','2022-01-05 14:00:00', 70.0);
INSERT INTO posts VALUES
(1, 1, 'Hello World', 'First post', 100, 10, '2022-01-10 10:00:00', 4.5),
(2, 1, 'Another Post', NULL, 150, 20, '2022-01-11 11:00:00', 3.0),
(3, 2, 'Bob Post', 'Content', NULL, 5, '2022-01-12 12:00:00', NULL),
(4, 3, NULL, 'Empty', 50, 2, '2022-01-13 13:00:00', 5.0),
(5, 4, 'Last Post', 'Last', 300, 30,'2022-01-14 14:00:00', 4.9);
INSERT INTO comments VALUES
(1, 1, 2, 'Nice post', 0, '2022-01-20 10:00:00'),
(2, 1, 3, 'Spam here', 1, '2022-01-21 11:00:00'),
(3, 2, 1, 'Thanks', 0, '2022-01-22 12:00:00'),
(4, 4, 5, NULL, 0, '2022-01-23 13:00:00');
INSERT INTO orders VALUES
(1, 1, 100.00, 'paid', '2022-02-01 09:00:00'),
(2, 1, 200.50, 'shipped', '2022-02-02 10:00:00'),
(3, 2, NULL, 'failed', '2022-02-03 11:00:00'),
(4, 3, 50.00, 'paid', '2022-02-04 12:00:00'),
(5, 5, 999.99, 'paid', '2022-02-05 13:00:00');
-- Trigger SQL
SELECT COUNT(*)
FROM orders AS ref_0,
LATERAL (
SELECT DISTINCT
CASE
WHEN ref_0.id <= ref_0.id
THEN ref_2.content
ELSE ref_2.title
END AS c0
FROM orders AS ref_1, posts AS ref_2
WHERE ref_1.user_id != (
SELECT id
FROM comments
LIMIT 1 OFFSET 3
)
) AS subq_0
WHERE subq_0.c0 IS NOT NULL;
-- Result: {25} wrong!
```
Description: This query computes a COUNT(*) over a cross join between orders and a LATERAL subquery derived from a Cartesian product of orders and posts, filtered by a scalar value taken from comments. Logically, the scalar subquery: SELECT id FROM comments LIMIT 1 OFFSET 3 returns the value 4 based on the provided dataset. Therefore the inner filter becomes: ref_1.user_id != 4 Since none of the orders rows have user_id = 4, all 5 rows in orders are retained. The inner FROM clause forms a Cartesian product between orders and posts, producing 5 × 5 = 25 rows. However, the CASE expression: CASE WHEN ref_0.id <= ref_0.id THEN ref_2.content ELSE ref_2.title END is always true, so it always evaluates to ref_2.content. Applying DISTINCT over posts.content yields the set: "First post", NULL, "Content", "Empty", "Last" After removing NULL, 4 distinct values remain. Under correct LATERAL semantics (as implemented by PostgreSQL and DuckDB), the subquery is executed once per row of orders (5 rows), producing 5 × 4 = 20 rows in total. Therefore the expected result is COUNT(*) = 20. However, MySQL returns COUNT(*) = 25. This indicates that MySQL likely performs an incorrect optimization or rewrite of the LATERAL subquery when combined with DISTINCT and a constant-foldable CASE expression. The optimizer appears to partially decorrelate or reorder joins in a way that violates LATERAL semantics, leading to an inflated row count. Expected result: 20 Actual result (MySQL): 25 How to repeat: ```SQL -- Schema CREATE TABLE users ( id INT, username VARCHAR(100), email VARCHAR(255), age INT, status VARCHAR(20), created_at TIMESTAMP NULL, score DOUBLE ); CREATE TABLE posts ( id INT, user_id INT, title VARCHAR(255), content VARCHAR(1000), views INT, likes INT, created_at TIMESTAMP NULL, rating DOUBLE ); CREATE TABLE comments ( id INT, post_id INT, user_id INT, content VARCHAR(1000), is_spam INT, created_at TIMESTAMP NULL ); CREATE TABLE orders ( id INT, user_id INT, amount DOUBLE, status VARCHAR(20), created_at TIMESTAMP NULL ); INSERT INTO users VALUES (1, 'alice', 'alice@test.com', 20, 'active', '2022-01-01 10:00:00', 88.5), (2, 'bob', 'bob@test.com', 30, 'active', '2022-01-02 11:00:00', 92.3), (3, 'carol', NULL, NULL, 'banned','2022-01-03 12:00:00', NULL), (4, 'dave', 'dave@test.com', 45, 'active', '2022-01-04 13:00:00', 65.2), (5, NULL, 'null@test.com', 18, 'inactive','2022-01-05 14:00:00', 70.0); INSERT INTO posts VALUES (1, 1, 'Hello World', 'First post', 100, 10, '2022-01-10 10:00:00', 4.5), (2, 1, 'Another Post', NULL, 150, 20, '2022-01-11 11:00:00', 3.0), (3, 2, 'Bob Post', 'Content', NULL, 5, '2022-01-12 12:00:00', NULL), (4, 3, NULL, 'Empty', 50, 2, '2022-01-13 13:00:00', 5.0), (5, 4, 'Last Post', 'Last', 300, 30,'2022-01-14 14:00:00', 4.9); INSERT INTO comments VALUES (1, 1, 2, 'Nice post', 0, '2022-01-20 10:00:00'), (2, 1, 3, 'Spam here', 1, '2022-01-21 11:00:00'), (3, 2, 1, 'Thanks', 0, '2022-01-22 12:00:00'), (4, 4, 5, NULL, 0, '2022-01-23 13:00:00'); INSERT INTO orders VALUES (1, 1, 100.00, 'paid', '2022-02-01 09:00:00'), (2, 1, 200.50, 'shipped', '2022-02-02 10:00:00'), (3, 2, NULL, 'failed', '2022-02-03 11:00:00'), (4, 3, 50.00, 'paid', '2022-02-04 12:00:00'), (5, 5, 999.99, 'paid', '2022-02-05 13:00:00'); -- Trigger SQL SELECT COUNT(*) FROM orders AS ref_0, LATERAL ( SELECT DISTINCT CASE WHEN ref_0.id <= ref_0.id THEN ref_2.content ELSE ref_2.title END AS c0 FROM orders AS ref_1, posts AS ref_2 WHERE ref_1.user_id != ( SELECT id FROM comments LIMIT 1 OFFSET 3 ) ) AS subq_0 WHERE subq_0.c0 IS NOT NULL; -- Result: {25} wrong! ```