Bug #104897 | Multi-valued index performance is too slow (Access too many rows than required) | ||
---|---|---|---|
Submitted: | 10 Sep 2021 6:29 | Modified: | 6 Aug 13:18 |
Reporter: | Seunguck Lee | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 8.0.26, 8.0.17 | OS: | Any |
Assigned to: | CPU Architecture: | x86 | |
Tags: | multi-valued index |
[10 Sep 2021 6:29]
Seunguck Lee
[10 Sep 2021 6:39]
Seunguck Lee
This is part of full test data. I have uploaded full test data via SFTP
Attachment: test_mvi-partial.sql.tar.gz (application/x-gzip, text), 264.30 KiB.
[10 Sep 2021 9:52]
MySQL Verification Team
Hello Seunguck Lee, Thank you for the report and test case. regards, Umesh
[28 Sep 2022 9:37]
Gyeongnam Kim
I could reproduce this issue in 8.0.28, and checked there is issue with index ordering. I added the index with changed ordering and just it makes count different. I loaded in the SQL file and double it some times. mysql> insert into test_mvi(type, p_at,hids) SELECT type, p_at,hids FROM test_mvi; Query OK, 8380 rows affected (0.32 sec) Records: 8380 Duplicates: 0 Warnings: 0 mysql> insert into test_mvi(type, p_at,hids) SELECT type, p_at,hids FROM test_mvi; Query OK, 16760 rows affected (0.42 sec) Records: 16760 Duplicates: 0 Warnings: 0 mysql> insert into test_mvi(type, p_at,hids) SELECT type, p_at,hids FROM test_mvi; Query OK, 33520 rows affected (0.96 sec) Records: 33520 Duplicates: 0 Warnings: 0 mysql> insert into test_mvi(type, p_at,hids) SELECT type, p_at,hids FROM test_mvi; Query OK, 67040 rows affected (1.95 sec) Records: 67040 Duplicates: 0 Warnings: 0 mysql> insert into test_mvi(type, p_at,hids) SELECT type, p_at,hids FROM test_mvi; Query OK, 134080 rows affected (5.67 sec) Records: 134080 Duplicates: 0 Warnings: 0 I added index with different ordering with exist index.. mysql> ALTER TABLE test_mvi DROP KEY ix_type_hid_pat_2, ADD KEY ix_type_hid_pat_2(`type`, `p_at`, (cast(json_extract(`hids`,_utf8mb4'$') as unsigned array))); Query OK, 268160 rows affected (10.94 sec) Records: 268160 Duplicates: 0 Warnings: 0 mysql> SHOW CREATE TABLE test_mvi\G *************************** 1. row *************************** Table: test_mvi Create Table: CREATE TABLE `test_mvi` ( `id` bigint NOT NULL AUTO_INCREMENT, `type` enum('T1','T2','T3') NOT NULL, `p_at` datetime(6) NOT NULL, `hids` json DEFAULT (json_array()), PRIMARY KEY (`id`), KEY `ix_type_hid_pat` (`type`,(cast(json_extract(`hids`,_utf8mb4'$') as unsigned array)),`p_at`), KEY `ix_type_hid_pat_2` (`type`,`p_at`,(cast(json_extract(`hids`,_utf8mb4'$') as unsigned array))) ) ENGINE=InnoDB AUTO_INCREMENT=450741 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci 1 row in set (0.08 sec) And I ran same query but using other indexes, and checked there are differences. Although it's same query without hint, counts are different. -- with the new index order, it returns more counts. mysql> SELECT COUNT(*) FROM test_mvi USE INDEX (ix_type_hid_pat_2) WHERE type IN ('T1') AND JSON_OVERLAPS(hids->'$', CAST('[613349414265683967, 617853013892792319, 613349377578106879, 617852977203904511]' AS JSON)) ; +----------+ | COUNT(*) | +----------+ | 896 | +----------+ 1 row in set (4.49 sec) -- with other index - it returns row normally. mysql> SELECT COUNT(*) FROM test_mvi USE INDEX (ix_type_hid_pat) WHERE type IN ('T1') AND JSON_OVERLAPS(hids->'$', CAST('[613349414265683967, 617853013892792319, 613349377578106879, 617852977203904511]' AS JSON)) ; +----------+ | COUNT(*) | +----------+ | 224 | +----------+ 1 row in set (0.06 sec)
[28 Sep 2022 12:39]
MySQL Verification Team
Hi Mr. Kim, Thank you for your contribution. It will be very useful when it comes to improving the performance of queries that use many tuples in an index. However, there is no RDBMS that will automatically create the best possible index for its future usage. That is why this job is left to DBA's, schema normalisers, query tuners etc. Simply a RDBMS can not predict what will the index be used for. It is , however, possible to make the analysis like yours. Thank you for your contribution.
[6 Aug 13:18]
Jon Stephens
Documented fix as follows in the MySQL 8.0.40, 8.4.3, and 9.1.0 changelogs: A query using a greater-than (>) or less-than (<) comparison with a multi-valued index executed much more slowly than the same query using an equality (=) comparison with the same index. Closed.
[6 Aug 13:20]
MySQL Verification Team
Thank you Jon.