Bug #101228 date group and sort groups similar dates in different groups
Submitted: 19 Oct 2020 15:55 Modified: 3 Nov 2020 23:40
Reporter: Freddy Andersen Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: DML Severity:S2 (Serious)
Version:8.0.22 OS:Any
Assigned to: CPU Architecture:Any

[19 Oct 2020 15:55] Freddy Andersen
Description:
following query started creating multiple groups from a date group in version 8.0.20 was working fine in 8.0.19.

mysql> SELECT DATE(timestamp) d, COUNT(distinct code) c
    -> FROM tt
    -> WHERE timestamp >= curdate() - interval 30 day
    -> GROUP BY d
    -> ORDER BY d desc;
+------------+----+
| d          | c  |
+------------+----+
| 2020-10-18 |  1 |
| 2020-10-19 |  3 |
| 2020-10-18 |  2 |
| 2020-10-19 |  4 |
| 2020-10-18 |  1 |
| 2020-10-19 |  3 |
| 2020-10-18 |  5 |
| 2020-10-17 |  1 |
| 2020-10-18 |  1 |
| 2020-10-17 |  1 |
| 2020-10-18 |  5 |
| 2020-10-17 |  2 |
| 2020-10-18 |  5 |
| 2020-10-17 |  1 |
| 2020-10-18 |  4 |
| 2020-10-17 |  1 |
...

How to repeat:
CREATE SCHEMA AA;
USE AA;

DROP TABLE IF EXISTS `tt`;
CREATE TABLE `tt` (
  `code` smallint NOT NULL,
  `timestamp` timestamp NOT NULL,
  `id` bigint NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`),
  KEY `updated` (`timestamp`),
  KEY `code` (`code`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;

DROP PROCEDURE IF EXISTS `prepare_data`;
DELIMITER $$
CREATE PROCEDURE prepare_data()
BEGIN
  DECLARE i INT DEFAULT 1;
  SET @MIN = NOW() - interval 30 day;
  SET @MAX = NOW();
  WHILE i < 1000 DO
    INSERT INTO tt(code,timestamp) VALUES (FLOOR(RAND()*1000), TIMESTAMPADD(SECOND, FLOOR(RAND() * TIMESTAMPDIFF(SECOND, @MIN, @MAX)), @MIN));
    SET i = i + 1;
  END WHILE;
END$$
DELIMITER ;

CALL prepare_data();

SELECT DATE(timestamp) d, COUNT(distinct code) c
FROM tt
WHERE timestamp >= curdate() - interval 30 day
GROUP BY d
ORDER BY d desc;
[19 Oct 2020 16:39] MySQL Verification Team
Thank you for the bug report. The below result it isn't the expected;?

Enter password: **********
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 8.0.22 MySQL Community Server - GPL

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE SCHEMA AA;
Query OK, 1 row affected (0.01 sec)

mysql> USE AA;
Database changed
mysql>
mysql> DROP TABLE IF EXISTS `tt`;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> CREATE TABLE `tt` (
    ->   `code` smallint NOT NULL,
    ->   `timestamp` timestamp NOT NULL,
    ->   `id` bigint NOT NULL AUTO_INCREMENT,
    ->   PRIMARY KEY (`id`),
    ->   KEY `updated` (`timestamp`),
    ->   KEY `code` (`code`)
    -> ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.12 sec)

mysql> DROP PROCEDURE IF EXISTS `prepare_data`;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> DELIMITER $$
mysql> CREATE PROCEDURE prepare_data()
    -> BEGIN
    ->   DECLARE i INT DEFAULT 1;
    ->   SET @MIN = NOW() - interval 30 day;
    ->   SET @MAX = NOW();
    ->   WHILE i < 1000 DO
    ->     INSERT INTO tt(code,timestamp) VALUES (FLOOR(RAND()*1000), TIMESTAMPADD(SECOND, FLOOR(RAND() * TIMESTAMPDIFF(SECOND, @MIN, @MAX)), @MIN));
    ->     SET i = i + 1;
    ->   END WHILE;
    -> END$$
Query OK, 0 rows affected (0.01 sec)

mysql> DELIMITER ;
mysql> CALL prepare_data();
Query OK, 1 row affected (3.36 sec)

mysql> SELECT DATE(timestamp) d, COUNT(distinct code) c
    -> FROM tt
    -> WHERE timestamp >= curdate() - interval 30 day
    -> GROUP BY d
    -> ORDER BY d desc;
+------------+----+
| d          | c  |
+------------+----+
| 2020-10-19 | 17 |
| 2020-10-18 | 31 |
| 2020-10-17 | 40 |
| 2020-10-16 | 31 |
| 2020-10-15 | 43 |
| 2020-10-14 | 30 |
| 2020-10-13 | 32 |
| 2020-10-12 | 36 |
| 2020-10-11 | 29 |
| 2020-10-10 | 23 |
| 2020-10-09 | 31 |
| 2020-10-08 | 41 |
| 2020-10-07 | 23 |
| 2020-10-06 | 29 |
| 2020-10-05 | 23 |
| 2020-10-04 | 40 |
| 2020-10-03 | 32 |
| 2020-10-02 | 25 |
| 2020-10-01 | 28 |
| 2020-09-30 | 29 |
| 2020-09-29 | 28 |
| 2020-09-28 | 33 |
| 2020-09-27 | 36 |
| 2020-09-26 | 47 |
| 2020-09-25 | 28 |
| 2020-09-24 | 32 |
| 2020-09-23 | 42 |
| 2020-09-22 | 36 |
| 2020-09-21 | 35 |
| 2020-09-20 | 37 |
| 2020-09-19 | 16 |
+------------+----+
31 rows in set (0.00 sec)

mysql>
[19 Oct 2020 16:46] Freddy Andersen
After some testing this only happens on our Percona builds of MySQL server so there must be something that they have added that causes the grouping issue. I will go open a bug with Percona.
[19 Oct 2020 16:48] MySQL Verification Team
Thank you for the feedback.
[3 Nov 2020 23:15] Paweł Kubrak
Can confirm this bug on 8.0.21 on gentoo system native mysql package.
Related SO question: https://stackoverflow.com/questions/64671446/grouped-rows-are-duplicated-if-using-count-di...

Updating to 8.0.22 solved (probably?) the issue.
[3 Nov 2020 23:17] Paweł Kubrak
This however worked properly on docker 8.0.21.
Maybe hand-compiled versions are affected only? Maybe it depends on compiler flags / cpu extensions used?

@Freddy Andersen - did you finally managed to figure out what caused this problem?
[3 Nov 2020 23:40] Freddy Andersen
I have not found anything to resolve the issue yet. Have contacted Percona to see if they would be able to find the issue but its promising that someone else is also able to reproduce the issue outside of a Percona build.