Bug #109726 Incorrect results with aggregated functions on correlated lateral joins
Submitted: 20 Jan 2023 21:35 Modified: 23 Jan 2023 13:52
Reporter: Anthony Green Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.0.31 OS:Ubuntu
Assigned to: CPU Architecture:x86
Tags: Contribution, join, LATERAL, regression

[20 Jan 2023 21:35] Anthony Green
Description:
LATERAL correlated joins produce unexpected, incorrect results when combined with aggregation functions.

How to repeat:
Run these queries:

CREATE TABLE label (
  id INT
);
INSERT INTO label VALUES 
  (0), 
  (1);

CREATE TABLE record (
  label_id INT,
  amount BIGINT,
  deleted BOOLEAN
);
INSERT INTO record VALUES
  (0, 1, 0),
  (1, -2, 0),
  (1, 3, 0),

  (0, -10, 1),
  (1, -10, 1);

SELECT
  L.id AS label_id,
  AMOUNTS.type,
  AMOUNTS.amount
FROM label AS L
CROSS JOIN LATERAL (
  SELECT
  CASE
    WHEN R.deleted = TRUE THEN NULL
    WHEN R.amount < 0 THEN 'NEGATIVE'
    WHEN R.amount >= 0 THEN 'POSITIVE'
    END AS type,
    SUM(R.amount) AS amount
  FROM record AS R
  WHERE R.label_id = L.id
  GROUP BY type
  HAVING type IS NOT NULL) AS AMOUNTS
ORDER BY label_id ASC;

This should produce:

label_id    type        amount
--------    --------    ------
0           POSITIVE    1
1           NEGATIVE    -2
1           POSITIVE    3

It actually produces:

label_id    type        amount
--------    --------    ------
0           POSITIVE    1
1           POSITIVE    -9

A second error can be seen by changing the above query to end:
ORDER BY label_id DESC;

It should produce:

label_id    type        amount
--------    --------    ------
1           NEGATIVE    -2
1           POSITIVE    3
0           POSITIVE    1

It actually produces:

label_id    type        amount
--------    --------    ------
1           NEGATIVE    -2
1           POSITIVE    3
0           POSITIVE    9

But instead it produces
[20 Jan 2023 21:36] Anthony Green
Description:
LATERAL correlated joins produce unexpected, incorrect results when combined with aggregation functions.

How to repeat:
Run these queries:

CREATE TABLE label (
  id INT
);
INSERT INTO label VALUES 
  (0), 
  (1);

CREATE TABLE record (
  label_id INT,
  amount BIGINT,
  deleted BOOLEAN
);
INSERT INTO record VALUES
  (0, 1, 0),
  (1, -2, 0),
  (1, 3, 0),

  (0, -10, 1),
  (1, -10, 1);

SELECT
  L.id AS label_id,
  AMOUNTS.type,
  AMOUNTS.amount
FROM label AS L
CROSS JOIN LATERAL (
  SELECT
  CASE
    WHEN R.deleted = TRUE THEN NULL
    WHEN R.amount < 0 THEN 'NEGATIVE'
    WHEN R.amount >= 0 THEN 'POSITIVE'
    END AS type,
    SUM(R.amount) AS amount
  FROM record AS R
  WHERE R.label_id = L.id
  GROUP BY type
  HAVING type IS NOT NULL) AS AMOUNTS
ORDER BY label_id ASC;

This should produce:

label_id    type        amount
--------    --------    ------
0           POSITIVE    1
1           NEGATIVE    -2
1           POSITIVE    3

It actually produces:

label_id    type        amount
--------    --------    ------
0           POSITIVE    1
1           POSITIVE    -9

A second error can be seen by changing the above query to end:
ORDER BY label_id DESC;

It should produce:

label_id    type        amount
--------    --------    ------
1           NEGATIVE    -2
1           POSITIVE    3
0           POSITIVE    1

It actually produces:

label_id    type        amount
--------    --------    ------
1           NEGATIVE    -2
1           POSITIVE    3
0           POSITIVE    9
[23 Jan 2023 13:52] MySQL Verification Team
Hi,

Thanks for the report and the test case. Reproduced as described.
[22 May 2023 5:03] quan chen
In Read of TemptableAggregateIterator, the current_ref_item_slice may be change to  m_ref_slice if m_ref_slice is valid, there is no problem with this without lateral. But if the aggregation is in a lateral derived, the Init of TemptableAggregateIterator may be called more than once, after the first call, the item of group will point to m_ref_slice(an in correct location), and then read the wrong record, ultimately resulting in incorrect query results.

Suggest fix:
```
--- a/sql/iterators/composite_iterators.cc
+++ b/sql/iterators/composite_iterators.cc
@@ -1703,6 +1703,7 @@ bool TemptableAggregateIterator<Profiler>::Init() {
       create_scope_guard([&] { table()->file->ha_index_end(); });
 
   PFSBatchMode pfs_batch_mode(m_subquery_iterator.get());
+  m_join->set_ref_item_slice(REF_SLICE_SAVED_BASE);
   for (;;) {
     int read_error = m_subquery_iterator->Read();
     if (read_error > 0 || thd()->is_error())  // Fatal error
```

Reset current_ref_item_slice of m_join to REF_SLICE_SAVED_BASE before do temptable aggregate.
[22 May 2023 5:03] quan chen
In Read of TemptableAggregateIterator, the current_ref_item_slice may be change to  m_ref_slice if m_ref_slice is valid, there is no problem with this without lateral. But if the aggregation is in a lateral derived, the Init of TemptableAggregateIterator may be called more than once, after the first call, the item of group will point to m_ref_slice(an in correct location), and then read the wrong record, ultimately resulting in incorrect query results.

Suggest fix:
```
--- a/sql/iterators/composite_iterators.cc
+++ b/sql/iterators/composite_iterators.cc
@@ -1703,6 +1703,7 @@ bool TemptableAggregateIterator<Profiler>::Init() {
       create_scope_guard([&] { table()->file->ha_index_end(); });
 
   PFSBatchMode pfs_batch_mode(m_subquery_iterator.get());
+  m_join->set_ref_item_slice(REF_SLICE_SAVED_BASE);
   for (;;) {
     int read_error = m_subquery_iterator->Read();
     if (read_error > 0 || thd()->is_error())  // Fatal error
```

Reset current_ref_item_slice of m_join to REF_SLICE_SAVED_BASE before do temptable aggregate.
[22 May 2023 6:22] MySQL Verification Team
Hello quan chen,

Thank you for your contribution.
We would like to consider your contribution but observed that you haven't signed OCA. Please note that in order to submit contributions you must first sign the Oracle Contribution Agreement (OCA). For additional information please check https://oca.opensource.oracle.com.
If you have any questions, please contact the MySQL community team https://dev.mysql.com/community/

regards,
Umesh
[29 May 2023 11:57] quan chen
Hello Umesh,

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: fix.patch (application/octet-stream, text), 595 bytes.

[29 May 2023 11:57] quan chen
Hello Umesh,

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: fix.patch (application/octet-stream, text), 595 bytes.

[29 May 2023 11:57] quan chen
Hello Umesh,

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: fix.patch (application/octet-stream, text), 595 bytes.

[29 May 2023 11:57] quan chen
Hello Umesh,

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: fix.patch (application/octet-stream, text), 595 bytes.

[29 May 2023 11:57] quan chen
Hello Umesh,

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: fix.patch (application/octet-stream, text), 595 bytes.

[29 May 2023 12:01] MySQL Verification Team
Thank you, quan chen.

regards,
Umesh