| Bug #113060 | The result of avg(distinct xx)is different between from primary and secondary | ||
|---|---|---|---|
| Submitted: | 13 Nov 2023 10:23 | Modified: | 13 Nov 2023 12:07 |
| Reporter: | XIAOJING LI | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: DML | Severity: | S3 (Non-critical) |
| Version: | 8.0 | OS: | Any |
| Assigned to: | Assigned Account | CPU Architecture: | Any |
| Tags: | avg distinct, sum | ||
[13 Nov 2023 12:07]
MySQL Verification Team
Hi Mr. LI, Thank you for your bug report. We have managed to repeat it with 8.0.35: avg(distinct b) 3.5 avg(distinct b) 3.7 sum(distinct b) 14 sum(distinct b) 37 Verified as reported.
[20 Nov 2023 11:40]
huahua xu
Hi XIAOJING LI,
For your query statement `select avg(distinct b) from t1`, the optimizer use covering index skip scan for deduplication on t1.idx_b_l, and aggregate them, then average them.
For your query statement `select avg(distinct b) from t1 force index(primary)`, the optimizer collects all the data ('0XS2E55maqch', 'G6UHU26b8dkvwpYiM' and '0XS2E55maqch' all have been converted to 0.0) into an unique for removing of duplicates, then aggregate and average them.

Description: The result of avg(distinct xx)is different between from primary and secondary indexes. coloum xx is varchar type How to repeat: CREATE TABLE `t1` ( `a` int(11) NOT NULL AUTO_INCREMENT, `b` varchar(40) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL, `c` varchar(40) DEFAULT NULL, PRIMARY KEY (`a`), KEY `idx_b_l` (`b`), KEY `idx_c_l` (`c`) ); insert into t1(b,c) values ('6RXmcRyA48c0yvZ2I' ,'6RXmcRyA48c0yvZ2I'); insert into t1(b,c) values ('G6UHU26b8dkvwpYiM' ,'G6UHU26b8dkvwpYiM'); insert into t1(b,c) values ('0XS2E55maqch' ,'0XS2E55maqch'); insert into t1(b,c) values ('0qwCMwobKYOcmLyfRXlagA8ukENJv' ,'0qwCMwobKYOcmLyfRXlagA8ukENJv'); insert into t1(b,c) values ('3avxIDL4YPrZVHie4rOiPrK8ZIJwEPqZaioHK' ,'3avxIDL4YPrZVHie4rOiPrK8ZIJwEPqZaioHK'); insert into t1(b,c) values ('5DHC2QScT6P6lXgRtHajXOfztB7ddjYH1LLUC3' ,'5DHC2QScT6P6lXgRtHajXOfztB7ddjYH1LLUC3'); insert into t1(b,c) values ('5pdGQ2NTM3c2uRgCg9NspSE' ,'5pdGQ2NTM3c2uRgCg9NspSE'); insert into t1(b,c) values ('6Bk06GVtwZaKqg01' ,'6Bk06GVtwZaKqg01'); insert into t1(b,c) values ('6VFO4xJPwvxGzReYzVj6dwTSIechnSSCyz9iY' ,'6VFO4xJPwvxGzReYzVj6dwTSIechnSSCyz9iY'); insert into t1(b,c) values ('6hLBs3Rnd5elLLVv1ip3A2U6G1dkIApKDkiCy' ,'6hLBs3Rnd5elLLVv1ip3A2U6G1dkIApKDkiCy'); mysql> select avg(distinct b) from t1 force index(primary); +--------------------+ | avg(distinct b) | +--------------------+ | 3.5| +--------------------+ 1 row in set (0.01 sec) mysql> select avg(distinct b) from t1; +--------------------+ | avg(distinct b) | +--------------------+ | 3.7 | +--------------------+ 1 row in set, 10 warnings (0.01 sec) mysql> select sum(distinct b) from t1 force index(primary); +-----------------+ | sum(distinct b) | +-----------------+ | 14 | +-----------------+ 1 row in set (0.00 sec) mysql> select sum(distinct b) from t1; +-----------------+ | sum(distinct b) | +-----------------+ | 37 | +-----------------+ 1 row in set, 10 warnings (0.00 sec)