Bug #110092 unique key has duplicate data.
Submitted: 16 Feb 2023 6:43 Modified: 23 Feb 2023 3:25
Reporter: nycle chen Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:5.7.34-log OS:CentOS (CentOS Linux release 7.6.1810 (Core))
Assigned to: CPU Architecture:Any
Tags: unique key has duplicate data.

[16 Feb 2023 6:43] nycle chen
Description:
I has a table, it likes:
create table t1
(
id bigint not null auto_increment,
code bigint,
name varchar(64),
unique key(code),
primary key(id)
)engine=innodb;

1.When I use the sql(group by) to query.
select code,count(*) cnt from t1
group by code having cnt > 1;
It return many rows. It means  the same code(code=1) has many rows.But, the code column is unique key. The result like this:
code cnt
---- ----
1    2
2    2

2.When I use the sql(no group by) to query.
select code from t1 where code=1;
It returns on row, no returns many rows.

3.When I use mysqldump the t2 table, the same code has many rows too.

4.When I use the innobackupex to backup a full instance, and restore a new instance. Then use group by sql, the same code ,it doesn't return many rows.
select code,count(*) cnt from t1
group by code having cnt > 1;

So I think if restart the instance, unique key has duplicate data, may not repeat. And It is a critical bug.

Thanks.

How to repeat:
Can not repeat.
[16 Feb 2023 13:54] MySQL Verification Team
Hi Mr. chen,

Thank you for your report.

However, it is not a bug ......

When you run the aggregated query on a single column you will always get a single row for each value of that column. Other columns, that are not aggregated are randomly chosen. This is according to SQL standards.

You have to use other features of MySQL to get what you need. They are all described in our Reference Manual that is found on http://dev.mysql.com.

Not a bug.
[23 Feb 2023 3:25] nycle chen
Thank you very much. Please reopen this report.

I think it's not just a development SQL standards issue, it's a data consistency issue.

This is the caseļ¼š
1.when I use sql to query for condition code column , then retuen one row, but use mysqldump the full table, then retuen the same code for two rows, and the code column has a nuique index. It like this:
  select * from t1 where code=1; 
  Then it return one row the code=1.

  mysqldump -hxxx -Pxxx  -uxxx -pxxx  --single-transaction --set-gtid-purged=off --databases db1  --tables t1 > ./t1.sql   
  The the t1.sql file has two rows for code=1.

2.When I use the sql(group by) to query. The  code has two rows, but I restart the instance the same code has one row. 

select code,count(*) cnt from t1
group by code having cnt > 1;
It return many rows. It means  the same code(code=1) has many rows. But the code column has unique key. It like this:
code cnt
---- ----
1    2
2    2

For the same data, when I restart the instance, it returns one row for the same code. It like this:
select code,count(*) cnt from t1
group by code having cnt > 1;
code cnt
---- ----
1    1
2    1
[23 Feb 2023 13:17] MySQL Verification Team
Hi Mr. chen,

There is no need what so ever to reopen this report.

If you have to restart the instance to get the result that you like, then you are probably using some ancient MySQL release , like 5.7.34, which is not maintained any more. Try 8.0.32.

Next, make sure that you use our binary and not binary from some third source, nor the self built one.

Not a bug.