Bug #115908 Poor performance in scanning the entire table in non indexed field order
Submitted: 23 Aug 2024 5:55 Modified: 26 Aug 2024 7:43
Reporter: chong zhang Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.7.21,8.0.25 OS:Any
Assigned to: CPU Architecture:Any

[23 Aug 2024 5:55] chong zhang
Description:
After upgrading from 5.7.21 to 8.0.25, the performance of full table scanning with order by non indexed fields deteriorates significantly

How to repeat:
# 1、 create table 
create table t1 (
  id int auto_increment primary key,
  col2 int,
  col3 char(255),
  col4 varchar(8192),
  paytime int unsigned DEFAULT NULL
) ENGINE=INNODB;

# 2、Produce numbers
DELIMITER //
CREATE PROCEDURE insert_data()
BEGIN
    DECLARE i INT DEFAULT 0;

    WHILE i < 300000 DO
        INSERT INTO t1 VALUES (null, i, repeat('a', 255), repeat('a', 8192), FLOOR(RAND() * 300000));
        SET i = i + 1;
    END WHILE;
END //

DELIMITER ;

CALL insert_data;

#3、repeat 
SELECT * FROM t1 ORDER BY paytime LIMIT 1;

-- Cache the data in memory , MySQL 5.7.21 takes 0.11 seconds to execute, MySQL 8.0.25 takes 3.06 seconds to execute

mysql> select version();
+------------+
| version()  |
+------------+
| 5.7.21-log |
+------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM t1 ORDER BY paytime LIMIT 1;
1 row in set (0.11 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.25    |
+-----------+
1 row in set (0.00 sec)

mysql> SELECT * FROM t1 ORDER BY paytime LIMIT 1;
1 row in set (3.06 sec)

Suggested fix:
1、Is it a bug ?

2、What are the reasons for the significant differences in performance ?
[23 Aug 2024 15:28] chong zhang
mysql> explain SELECT * FROM t1 ORDER BY paytime LIMIT 1;
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+
|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 293904 |   100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+
1 row in set, 1 warning (0.00 sec)
[24 Aug 2024 21:04] chong zhang
Why choose full field sorting instead of rowid sorting in 8.0 ?
[25 Aug 2024 0:14] Marc Reilly
Hey Chong, fyi there are a few very similar bugs to this open which may have some useful info:
https://bugs.mysql.com/bug.php?id=115555
https://bugs.mysql.com/bug.php?id=110998
https://bugs.mysql.com/bug.php?id=108531

From what I can see, the regressions have been introduced in 8.0.20 as a result of this change
- Important Change: Previously, including any column of a blob type larger than TINYBLOB or BLOB as the payload in an ordering operation caused the server to revert to sorting row IDs only, rather than complete rows; this resulted in a second pass to fetch the rows themselves from disk after the sort was completed. Since JSON and GEOMETRY columns are implemented internally as LONGBLOB, this caused the same behavior with these types of columns even though they are almost always much shorter than the 4GB maximum for LONGBLOB (or even the 16 MB maximum for MEDIUMBLOB). The server now converts columns of these types into packed addons in such cases, just as it does TINYBLOB and BLOB columns, which in testing showed a significant performance increase. The handling of MEDIUMBLOB and LONGBLOB columns in this regard remains unchanged.
- One effect of this enhancement is that it is now possible for Out of memory errors to occur when trying to sort rows containing very large (multi-megabtye) JSON or GEOMETRY column values if the sort buffers are of insufficient size; this can be compensated for in the usual fashion by increasing the value of the sort_buffer_size system variable. (Bug #30400985, Bug #30804356)

https://github.com/mysql/mysql-server/commit/e0e137031d3246a51bf1ef551bfd0eafa25efcf6
https://github.com/mysql/mysql-server/commit/37e372ae3afd1188de24311ef9f4e96e009e911b (Referenced in Bug #110998 )
[26 Aug 2024 7:43] MySQL Verification Team
Hello chong zhang,

Thank you for the report and feedback.
This is most likely duplicate of Bug #115555, please see Bug #115555.

regards,
Umesh
[26 Aug 2024 7:44] MySQL Verification Team
5.7.44 & 8.0.39 test results

Attachment: 115908_5.7.44_8.0.39.txt (text/plain), 21.89 KiB.

[27 Aug 2024 2:58] Huaxiong Song
Hi, Chong Zhang.

The parameter max_length_for_sort_data was removed in 8020. This parameter controls this behavior.

Later, this parameter was removed. The purpose is to avoid going back to the table. This is beneficial for limiting multiple rows of data. However, in this case, limit 1, the cost of the second table lookup is actually very low, but the cost of sorting is higher, so performance degradation occurs.