Bug #101926 select count(distinct(case ...)) sometimes returns 0
Submitted: 9 Dec 2020 10:35 Modified: 9 Dec 2020 10:48
Reporter: Jan Andersen Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:5.6.10 OS:Linux (AWS Aurora RDS)
Assigned to: CPU Architecture:Any

[9 Dec 2020 10:35] Jan Andersen
Description:
This query returns a 0 where it should have returned a value > 0:

select 
  count(game_action_id) ga_count,
  count(distinct(game_action_id)) count_distinct,
  count(case when type='WAGER' then game_action_id end) wager_count,
  count(distinct(case when type='WAGER' then game_action_id end)) wager_count_distinct 
from ga_tst 
where created_timestamp >= '2020-11-01 00:00:00' and created_timestamp <= '2020-11-01 23:59:59';

It works correctly for dates outside November, and doesn't work when the interval is in November. The table is defined as:

CREATE TABLE `game_action` (
  `game_action_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `game_id` int(11) NOT NULL,
  `user_id` int(11) NOT NULL,
  `game_instance_id` bigint(20) unsigned DEFAULT NULL,
  `type` varchar(15) NOT NULL,
  `amount` decimal(18,2) NOT NULL,
  `currency` varchar(15) NOT NULL,
  `created_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`game_action_id`,`created_timestamp`),
  KEY `GA_IX01` (`game_id`),
  KEY `GA_IX02` (`user_id`),
  KEY `GA_IX03` (`game_instance_id`)
) ENGINE=InnoDB AUTO_INCREMENT=447579828 DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(created_timestamp))
(PARTITION pMIN VALUES LESS THAN (1388534400) ENGINE = InnoDB,
 PARTITION p2014_01 VALUES LESS THAN (1391212800) ENGINE = InnoDB,
 PARTITION p2014_02 VALUES LESS THAN (1393632000) ENGINE = InnoDB,
 PARTITION p2014_03 VALUES LESS THAN (1396310400) ENGINE = InnoDB,
 PARTITION p2014_04 VALUES LESS THAN (1398902400) ENGINE = InnoDB,
...
 PARTITION p2020_08 VALUES LESS THAN (1598918400) ENGINE = InnoDB,
 PARTITION p2020_09 VALUES LESS THAN (1601510400) ENGINE = InnoDB,
 PARTITION p2020_10 VALUES LESS THAN (1604188800) ENGINE = InnoDB,
 PARTITION p2020_11 VALUES LESS THAN (1606780800) ENGINE = InnoDB,
 PARTITION p2020_12 VALUES LESS THAN (1609459200) ENGINE = InnoDB,
 PARTITION pMAX VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */;

=======

If I change the query to the following, it works correctly:

select 
  count(ga1.game_action_id) ga_count,
  count(distinct(ga1.game_action_id)) count_distinct,
  count(case when ga1.type='WAGER' then ga1.game_action_id end) wager_count,
  count(distinct(case when ga1.type='WAGER' then ga1.game_action_id end)) wager_count_distinct 
from (
  select * from game_action ga2
  where ga2.created_timestamp >= '2020-11-01 00:00:00' 
    and ga2.created_timestamp <= '2020-11-01 23:59:59'
) ga1;

How to repeat:
It is not easy to provide sample dataset; the table contains 449043720 rows, and I in trying to find a minimal dataset, I copied just November to a separate table, create with identical indexes and partitions - and the query works correctly on this.
[9 Dec 2020 10:48] MySQL Verification Team
Thank you for the bug report. The version 5.6.10 is unsupported and you not provided a repeatable test case, the current version released is 5.6.50 please try it.

https://dev.mysql.com/downloads/mysql/