Bug #105316 sql produces different resultsets when different internel storage engines r used
Submitted: 26 Oct 2021 3:50 Modified: 1 Dec 2021 23:53
Reporter: Brian Yue (OCA) Email Updates:
Status: Closed Impact on me:
Category:MySQL Server: Options Severity:S1 (Critical)
Version:8.0.25, 8.0.27, 8.0.11 OS:Any (rhel-7.4)
Assigned to: CPU Architecture:Any (x86-64)
Tags: Contribution, incorrect resultset, temptable

[26 Oct 2021 3:50] Brian Yue
  Here we find a case that different resultsets are produced when internal_tmp_mem_storage_engines are used (Memory and Temptable). Please reference to `How to repeat`.

How to repeat:
mysql> create database bug;
Query OK, 1 row affected (0.00 sec)

mysql> use bug
Database changed
mysql> create table t1 (c1 varchar(10) collate utf8_bin) engine = innodb;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> insert into t1 values (''), (' ');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select distinct(c1) from t1;
| c1   |
|      |
|      |
2 rows in set (0.00 sec)

mysql> show variables like '%inte%';
| Variable_name                   | Value     |
| innodb_api_bk_commit_interval   | 5         |
| interactive_timeout             | 28800     |
| internal_tmp_mem_storage_engine | TempTable |
| myisam_data_pointer_size        | 6         |
| password_reuse_interval         | 0         |
5 rows in set (0.01 sec)

mysql> set @@session. internal_tmp_mem_storage_engine= memory;
Query OK, 0 rows affected (0.00 sec)

mysql> select distinct(c1) from t1;
| c1   |
|      |
1 row in set (0.00 sec)


Suggested fix:
Later I'll upload my patch
[26 Oct 2021 6:35] MySQL Verification Team
Hello Brian Yue,

Thank you for the report and feedback.
Observed that 8.0.27 release build affected.

[27 Oct 2021 15:21] Brian Yue
fixed basing on version MySQL8.0.25

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug_fix.txt (text/plain), 1.09 KiB.

[28 Oct 2021 6:25] MySQL Verification Team
Thank you for the contribution.

[1 Dec 2021 23:53] Jon Stephens
Documented fix as follows in the MySQL 8.0.29 changelog:

    SELECT DISTINCT on a VARCHAR column returned 2 distinct values
    when the coloumn contained an empty string and a string
    consisting of a single whitespace character when the TempTable
    storage engine was used (internal_tmp_mem_storage_engine =
    TempTable). This behavior was incorrect when using a utf8_bin
    collation with PAD SPACE; trailing spaces should have been
    ignored when selecting distinct values.

    For temporary tables created using the TempTable storage engine,
    the two strings were hashed to different values, since the
    TempTable hashing function always hashed zero-length data to 0.
    (In the case described, the empty string was hashed to 0 while
    the space character from the other string was used to generate a
    nonzero hash value.

    This is fixed by removing any special handling for zero-length
    data in the TempTable engine's hashing function.

    Our thanks to Brian Yue for the contribution.