Bug #99592 GROUP BY includes wrong data
Submitted: 15 May 2020 15:23 Modified: 18 May 2020 4:37
Reporter: Billy Sullivan Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:8.0.20 OS:Ubuntu
Assigned to: CPU Architecture:Any

[15 May 2020 15:23] Billy Sullivan
Description:
GROUP BY in MySQL 8 returns wrong values in some cases. Consider this query:

SELECT
	o.order_id,
	insurance.value AS insurance,
	insurance.order_id,
	max(insurance.value),
	group_concat(insurance.order_id)
FROM `oc_order` o
	LEFT JOIN oc_order_total insurance ON insurance.order_id = o.order_id AND insurance.code = 'misc'
WHERE o.order_status_id > 0
AND o.order_id > 277252000 
AND o.order_id < 277255000
GROUP BY o.order_id;

I am getting these results:

+-----------+-----------+-----------+----------------------+----------------------------------+
| order_id  | insurance | order_id  | max(insurance.value) | group_concat(insurance.order_id) |
+-----------+-----------+-----------+----------------------+----------------------------------+
| 100053000 |      NULL |      NULL |               9.5398 | 100053000                        |
| 100053001 |    9.5398 | 100053000 |                 NULL | NULL                             |
| 100054000 |    2.9051 | 100054000 |               2.9051 | 100054000                        |
+-----------+-----------+-----------+----------------------+----------------------------------+

I understand that the values for the insurance table "contains nonaggregated column", but should it at least choose *an arbitrary value* from the possible rows in question?  In this case there are only three rows to begin with and running the same query without GROUP BY gives me this:

+-----------+-----------+-----------+
| order_id  | insurance | order_id  |
+-----------+-----------+-----------+
| 100053000 |    9.5398 | 100053000 |
| 100053001 |      NULL |      NULL |
| 100054000 |    2.9051 | 100054000 |
+-----------+-----------+-----------+

Why would GROUP BY shift the values like this? Shouldn't it at least choose an arbitrary value from the rows in the group?

https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=a81e2abb67d491c984b435cee06160bb

How to repeat:
CREATE TABLE `oc_order` (
  `order_id` int(11) NOT NULL AUTO_INCREMENT,
  `order_status_id` int(11),
  PRIMARY KEY (`order_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `oc_order_total` (
  `order_total_id` int(10) NOT NULL AUTO_INCREMENT,
  `order_id` int(11) NOT NULL,
  `code` varchar(32) NOT NULL,
  `title` varchar(255) NOT NULL,
  `text` varchar(255) NOT NULL,
  `value` decimal(15,4) NOT NULL DEFAULT '0.0000',
  `sort_order` int(3) NOT NULL,
  PRIMARY KEY (`order_total_id`),
  KEY `order_id` (`order_id`),
  KEY `code` (`code`),
  KEY `value` (`value`),
  KEY `order_id_code` (`order_id`,`code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `oc_order` (`order_id`, `order_status_id`)
VALUES
	(100053000, 1),
	(100053001, 1),
	(100054000, 1);

INSERT INTO `oc_order_total` (`order_total_id`, `order_id`, `code`, `title`, `text`, `value`, `sort_order`)
VALUES
	(1, 100053000, 'misc', 'Shipping Insurance', '$9.54', 9.5398, 4),
	(2, 100054000, 'misc', 'Shipping Insurance', '$2.91', 2.9051, 4);

SET SQL_MODE = '';
SELECT
	o.order_id,
	insurance.value AS insurance,
	insurance.order_id
	,max(insurance.value),
	group_concat(insurance.order_id)
FROM `oc_order` o
	LEFT JOIN oc_order_total insurance ON insurance.order_id = o.order_id AND insurance.code = 'misc'
WHERE order_status_id > 0
GROUP BY order_id;
[15 May 2020 15:25] Billy Sullivan
sorry - a couple of the details in the original report are wrong - the correct schema and queries to reproduce are in the dbfiddle.uk link and are as follows:

CREATE TABLE `oc_order` (
  `order_id` int(11) NOT NULL AUTO_INCREMENT,
  `order_status_id` int(11),
  PRIMARY KEY (`order_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `oc_order_total` (
  `order_total_id` int(10) NOT NULL AUTO_INCREMENT,
  `order_id` int(11) NOT NULL,
  `code` varchar(32) NOT NULL,
  `title` varchar(255) NOT NULL,
  `text` varchar(255) NOT NULL,
  `value` decimal(15,4) NOT NULL DEFAULT '0.0000',
  `sort_order` int(3) NOT NULL,
  PRIMARY KEY (`order_total_id`),
  KEY `order_id` (`order_id`),
  KEY `code` (`code`),
  KEY `value` (`value`),
  KEY `order_id_code` (`order_id`,`code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `oc_order` (`order_id`, `order_status_id`)
VALUES
	(100053000, 1),
	(100053001, 1),
	(100054000, 1);

INSERT INTO `oc_order_total` (`order_total_id`, `order_id`, `code`, `title`, `text`, `value`, `sort_order`)
VALUES
	(1, 100053000, 'misc', 'Shipping Insurance', '$9.54', 9.5398, 4),
	(2, 100054000, 'misc', 'Shipping Insurance', '$2.91', 2.9051, 4);

SET SQL_MODE = '';
SELECT
	o.order_id,
	insurance.value AS insurance,
	insurance.order_id
	,max(insurance.value),
	group_concat(insurance.order_id)
FROM `oc_order` o
	LEFT JOIN oc_order_total insurance ON insurance.order_id = o.order_id AND insurance.code = 'misc'
WHERE order_status_id > 0
GROUP BY o.order_id;
[15 May 2020 19:00] MySQL Verification Team
Thank you for the bug report.
[18 May 2020 4:37] Erlend Dahl
Duplicate of

Bug#99398 Data in wrong row on left join

which has been fixed in the upcoming 8.0.21.