| Bug #108620 | Wrong count when choosing json index | ||
|---|---|---|---|
| Submitted: | 27 Sep 2022 2:39 | Modified: | 27 Sep 2022 6:06 |
| Reporter: | Shaohua Wang (OCA) | Email Updates: | |
| Status: | Can't repeat | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | 8.0.28 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[27 Sep 2022 6:06]
MySQL Verification Team
Hello Shaohua,
Thank you for the report and test case.
Observed that 8.0.28 build is affected but this issue is no longer seen after 8.0.29+. Could you please confirm this at your end? Thank you.
== 8.0.28 affected
bin/mysql -uroot -S /tmp/mysql.sock
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 8.0.28 MySQL Community Server - GPL
Copyright (c) 2000, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> create database test;
Query OK, 1 row affected (0.01 sec)
mysql> use test
Database changed
mysql> create table t1(
move_time datetime(0),
node_tree json,
sale_count bigint,
index idx_move_time(move_time),
index idx_move_time_sale_count_tree(move_time, (cast(json_extract(node_tree, _utf8mb4'$.node') as char(32) array)), sale_count));
insert into t1 values('2022-08-22 00:00:00', '{"node": ["393459011", "industrial", "12900351", "12899801", "office-products", "1069242"]}', 549);
insert into t1 values('2022-08-22 00:00:00', '{"node": ["166099011", "166092011", "toys-and-games"]}', 11978);
insert into t1 values('2022-08-22 00:00:00', '{"node": ["1069462", "1069454", "1069242", "office-products", "490790011", "12899801"]}', 2972);
-> move_time datetime(0),
-> node_tree json,
-> sale_count bigint,
-> index idx_move_time(move_time),
-> index idx_move_time_sale_count_tree(move_time, (cast(json_extract(node_tree, _utf8mb4'$.node') as char(32) array)), sale_count));
Query OK, 0 rows affected (0.03 sec)
mysql>
mysql> insert into t1 values('2022-08-22 00:00:00', '{"node": ["393459011", "industrial", "12900351", "12899801", "office-products", "1069242"]}', 549);
Query OK, 1 row affected (0.01 sec)
mysql>
mysql> insert into t1 values('2022-08-22 00:00:00', '{"node": ["166099011", "166092011", "toys-and-games"]}', 11978);
Query OK, 1 row affected (0.00 sec)
mysql>
mysql> insert into t1 values('2022-08-22 00:00:00', '{"node": ["1069462", "1069454", "1069242", "office-products", "490790011", "12899801"]}', 2972);
Query OK, 1 row affected (0.01 sec)
mysql> select count(1) from t1 force index(idx_move_time_sale_count_tree) where move_time = '2022-8-22';
+----------+
| count(1) |
+----------+
| 15 |
+----------+
1 row in set (0.00 sec)
mysql> select distinct count(1) from t1 force index(idx_move_time_sale_count_tree) where move_time = '2022-8-22';
+----------+
| count(1) |
+----------+
| 15 |
+----------+
1 row in set (0.00 sec)
mysql> select count(1) from t1 force index(idx_move_time) where move_time = '2022-8-22';
+----------+
| count(1) |
+----------+
| 3 |
+----------+
1 row in set (0.00 sec)
mysql> select count(1) from t1 where move_time = '2022-8-22';
+----------+
| count(1) |
+----------+
| 3 |
+----------+
== 8.0.29+ not affected
bin/mysql -uroot -S /tmp/mysql.sock
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 8.0.29 MySQL Community Server - GPL
Copyright (c) 2000, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> create database test;
Query OK, 1 row affected (0.01 sec)
mysql> use test
Database changed
mysql> create table t1(
move_time datetime(0),
node_tree json,
sale_count bigint,
index idx_move_time(move_time),
index idx_move_time_sale_count_tree(move_time, (cast(json_extract(node_tree, _utf8mb4'$.node') as char(32) array)), sale_count));
insert into t1 values('2022-08-22 00:00:00', '{"node": ["393459011", "industrial", "12900351", "12899801", "office-products", "1069242"]}', 549);
insert into t1 values('2022-08-22 00:00:00', '{"node": ["166099011", "166092011", "toys-and-games"]}', 11978);
insert into t1 values('2022-08-22 00:00:00', '{"node": ["1069462", "1069454", "1069242", "office-products", "490790011", "12899801"]}', 2972);
-> move_time datetime(0),
-> node_tree json,
-> sale_count bigint,
-> index idx_move_time(move_time),
-> index idx_move_time_sale_count_tree(move_time, (cast(json_extract(node_tree, _utf8mb4'$.node') as char(32) array)), sale_count));
Query OK, 0 rows affected (0.14 sec)
mysql>
mysql> insert into t1 values('2022-08-22 00:00:00', '{"node": ["393459011", "industrial", "12900351", "12899801", "office-products", "1069242"]}', 549);
Query OK, 1 row affected (0.01 sec)
mysql>
mysql> insert into t1 values('2022-08-22 00:00:00', '{"node": ["166099011", "166092011", "toys-and-games"]}', 11978);
Query OK, 1 row affected (0.00 sec)
mysql>
mysql> insert into t1 values('2022-08-22 00:00:00', '{"node": ["1069462", "1069454", "1069242", "office-products", "490790011", "12899801"]}', 2972);
Query OK, 1 row affected (0.01 sec)
mysql> select count(1) from t1 force index(idx_move_time_sale_count_tree) where move_time = '2022-8-22';
+----------+
| count(1) |
+----------+
| 3 |
+----------+
1 row in set (0.00 sec)
mysql> select distinct count(1) from t1 force index(idx_move_time_sale_count_tree) where move_time = '2022-8-22';
+----------+
| count(1) |
+----------+
| 3 |
+----------+
1 row in set (0.00 sec)
mysql> select count(1) from t1 force index(idx_move_time) where move_time = '2022-8-22';
+----------+
| count(1) |
+----------+
| 3 |
+----------+
1 row in set (0.00 sec)
mysql> select count(1) from t1 where move_time = '2022-8-22';
+----------+
| count(1) |
+----------+
| 3 |
+----------+
Most likely fixed in 8.0.29 after - SELECT COUNT(*) using a multi-valued index reported the wrong number of rows. (Bug #104898, Bug #33334928)
More https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-29.html
regards,
Umesh

Description: when we do a SELECT COUNT(*) in a table with json index, we get wrong result if the json index is chosen in query plan. How to repeat: create table t1( move_time datetime(0), node_tree json, sale_count bigint, index idx_move_time(move_time), index idx_move_time_sale_count_tree(move_time, (cast(json_extract(node_tree, _utf8mb4'$.node') as char(32) array)), sale_count)); insert into t1 values('2022-08-22 00:00:00', '{"node": ["393459011", "industrial", "12900351", "12899801", "office-products", "1069242"]}', 549); insert into t1 values('2022-08-22 00:00:00', '{"node": ["166099011", "166092011", "toys-and-games"]}', 11978); insert into t1 values('2022-08-22 00:00:00', '{"node": ["1069462", "1069454", "1069242", "office-products", "490790011", "12899801"]}', 2972); #we got 15 here, which is wrong. select count(1) from t1 force index(idx_move_time_sale_count_tree) where move_time = '2022-8-22'; #we got 15 here, which is wrong. select distinct count(1) from t1 force index(idx_move_time_sale_count_tree) where move_time = '2022-8-22'; select count(1) from t1 force index(idx_move_time) where move_time = '2022-8-22'; select count(1) from t1 where move_time = '2022-8-22'; drop table t1; Suggested fix: Avoid to use json index in this cases, or show distinct in count.