| Bug #100885 | generated column index against json column is not used with different encoding | ||
|---|---|---|---|
| Submitted: | 18 Sep 2020 3:43 | Modified: | 18 Sep 2020 7:28 |
| Reporter: | Nikolai Ikhalainen | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: JSON | Severity: | S5 (Performance) |
| Version: | 5.7.31 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | character set, generated columns, json, Optimizer | ||
[18 Sep 2020 7:28]
MySQL Verification Team
Hello Nikolai, Thank you for the report and feedback. Thanks, Umesh
[9 Nov 2020 11:38]
Eduardo Ubide
Expanding on Nickolay's case, I think it's something that affects the index's integrity due to the session's charset at insert time.
Regardless of the definition of the table. If we add data in a table from different "set names" the index on a generated column containing JSON extract expression is unusable.
I checked it on 5.7.29 and 5.7.31.
# Example:
set names utf8mb4;
mysql5731> INSERT INTO `json_test` (`id`, `j`) VALUES (3,'{\"a\": \"1\"}'),(4,'{\"a\": 1}');
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql5731> explain select id,JSON_TYPE(j->"$.a") from json_test where `j`->'$."a"' = '1';
+----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------+
| 1 | SIMPLE | json_test | NULL | ref | a | a | 3075 | const | 2 | 100.00 | NULL |
+----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
mysql5731> set names utf8;
Query OK, 0 rows affected (0.00 sec)
mysql5731> explain select id,JSON_TYPE(j->"$.a") from json_test where `j`->'$."a"' = '1';
+----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | json_test | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | Using where |
+----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql5731> explain select id,JSON_TYPE(j->"$.a") from json_test where json_extract(`j`,_utf8mb4'$."a"') = '1';
+----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------+
| 1 | SIMPLE | json_test | NULL | ref | a | a | 3075 | const | 2 | 100.00 | NULL |
+----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
mysql5731> INSERT INTO `json_test` (`id`, `j`) VALUES (5,'{\"a\": \"1\"}'),(6,'{\"a\": 1}');
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql5731> explain select id,JSON_TYPE(j->"$.a") from json_test where json_extract(`j`,_utf8mb4'$."a"') = '1';
+----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | json_test | NULL | ALL | a | NULL | NULL | NULL | 4 | 100.00 | Using where |
+----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.01 sec)
mysql5731> set names utf8mb4 ;
Query OK, 0 rows affected (0.00 sec)
mysql5731> explain select id,JSON_TYPE(j->"$.a") from json_test where `j`->'$."a"' = '1';
+----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | json_test | NULL | ALL | a | NULL | NULL | NULL | 4 | 100.00 | Using where |
+----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql5731> explain select id,JSON_TYPE(j->"$.a") from json_test where json_extract(`j`,_utf8mb4'$."a"') = '1';
+----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | json_test | NULL | ALL | a | NULL | NULL | NULL | 4 | 100.00 | Using where |
+----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
# How to repeat:
docker run --name m5731 -d -e MYSQL_ROOT_PASSWORD=secret mysql/mysql-server:5.7.31
docker exec -it m5731 mysql --prompt='mysql5731> ' -psecret
set names utf8mb4;
create database test default character set utf8mb4;
use test
DROP TABLE IF EXISTS `json_test`;
CREATE TABLE `json_test` (
`id` int unsigned NOT NULL AUTO_INCREMENT,
`j` json DEFAULT NULL,
`c` varchar(768) GENERATED ALWAYS AS (json_unquote(json_extract(`j`,'$."a"'))) VIRTUAL,
PRIMARY KEY (`id`),
KEY `a` (`c`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
-- utf8mb4
INSERT INTO `json_test` (`id`, `j`) VALUES (3,'{\"a\": \"1\"}'),(4,'{\"a\": 1}');
explain select id,JSON_TYPE(j->"$.a") from json_test where `j`->'$."a"' = '1';
set names utf8;
explain select id,JSON_TYPE(j->"$.a") from json_test where `j`->'$."a"' = '1';
explain select id,JSON_TYPE(j->"$.a") from json_test where json_extract(`j`,_utf8mb4'$."a"') = '1';
-- utf8
INSERT INTO `json_test` (`id`, `j`) VALUES (5,'{\"a\": \"1\"}'),(6,'{\"a\": 1}');
explain select id,JSON_TYPE(j->"$.a") from json_test where json_extract(`j`,_utf8mb4'$."a"') = '1';
set names utf8mb4;
explain select id,JSON_TYPE(j->"$.a") from json_test where `j`->'$."a"' = '1';
explain select id,JSON_TYPE(j->"$.a") from json_test where json_extract(`j`,_utf8mb4'$."a"') = '1';

Description: The index on generated column containing json extract expression (json_unquote(json_extract(`j`,'$."a"'))) "remembers" character set for '$."a"' string. After changing client encoding with "set names" the index is not used. Starting from MySQL 8.0.17, the optimizer is smart enough to handle both cases and use indexes. mysql5731> explain select id,JSON_TYPE(j->"$.a") from json_test where `j`->'$."a"' = '1'; +----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------+ | 1 | SIMPLE | json_test | NULL | ref | a | a | 3075 | const | 2 | 100.00 | NULL | +----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------+ 1 row in set, 1 warning (0.00 sec) mysql5731> set names utf8; Query OK, 0 rows affected (0.00 sec) mysql5731> explain select id,JSON_TYPE(j->"$.a") from json_test where `j`->'$."a"' = '1'; +----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+ | 1 | SIMPLE | json_test | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | Using where | +----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+ 1 row in set, 1 warning (0.00 sec) mysql5731> explain select id,JSON_TYPE(j->"$.a") from json_test where json_extract(`j`,_utf8mb4'$."a"') = '1'; +----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------+ | 1 | SIMPLE | json_test | NULL | ref | a | a | 3075 | const | 2 | 100.00 | NULL | +----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------+ 1 row in set, 1 warning (0.00 sec) How to repeat: docker run --name m5731 -d -e MYSQL_ROOT_PASSWORD=secret mysql/mysql-server:5.7.31 docker exec -it m5731 mysql --prompt='mysql5731> ' -psecret set names utf8mb4; create database test default character set utf8mb4; use test DROP TABLE IF EXISTS `json_test`; CREATE TABLE `json_test` ( `id` int unsigned NOT NULL AUTO_INCREMENT, `j` json DEFAULT NULL, `c` varchar(768) GENERATED ALWAYS AS (json_unquote(json_extract(`j`,'$."a"'))) VIRTUAL, PRIMARY KEY (`id`), KEY `a` (`c`) ) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; INSERT INTO `json_test` (`id`, `j`) VALUES (3,'{\"a\": \"1\"}'),(4,'{\"a\": 1}'); explain select id,JSON_TYPE(j->"$.a") from json_test where `j`->'$."a"' = '1'; set names utf8; explain select id,JSON_TYPE(j->"$.a") from json_test where `j`->'$."a"' = '1'; explain select id,JSON_TYPE(j->"$.a") from json_test where json_extract(`j`,_utf8mb4'$."a"') = '1'; Suggested fix: Make the behavior consistent between 5.7 and 8.0 (e.g. use the index regardless to encoding) or document requirement to keep character set at https://dev.mysql.com/doc/refman/5.7/en/create-table-secondary-indexes.html