Bug #76834 | Different result when using index for jsn_extract | ||
---|---|---|---|
Submitted: | 25 Apr 2015 11:07 | Modified: | 7 Oct 2015 9:41 |
Reporter: | Daniël van Eeden (OCA) | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: JSON | Severity: | S3 (Non-critical) |
Version: | 5.7.7-labs-json, 5.7.9 | OS: | Any |
Assigned to: | Assigned Account | CPU Architecture: | Any |
Tags: | json |
[25 Apr 2015 11:07]
Daniël van Eeden
[25 Apr 2015 11:10]
Daniël van Eeden
And the behaviour with quotes: mysql [localhost] {msandbox} (test) > select * from t1 where jsn_extract(j, '$.foo')='"test"'; +----+-----------------+--------+ | id | j | foo | +----+-----------------+--------+ | 2 | {"foo": "test"} | "test" | +----+-----------------+--------+ 1 row in set (0.00 sec) mysql [localhost] {msandbox} (test) > select * from t1 ignore index(foo) where jsn_extract(j, '$.foo')='"test"'; Empty set (0.00 sec)
[25 Apr 2015 13:39]
MySQL Verification Team
Hello Daniël, Thank you for the report. Thanks, Umesh
[25 Apr 2015 13:40]
MySQL Verification Team
// mysql> show variables like "%version%"; +-------------------------+------------------------------+ | Variable_name | Value | +-------------------------+------------------------------+ | innodb_version | 5.7.7 | | protocol_version | 10 | | slave_type_conversions | | | version | 5.7.7-labs-json | | version_comment | MySQL Community Server (GPL) | | version_compile_machine | x86_64 | | version_compile_os | linux-el6 | +-------------------------+------------------------------+ 7 rows in set (0.01 sec) mysql> create database test; use test; Query OK, 1 row affected (0.00 sec) Database changed mysql> CREATE TABLE `t1` ( -> `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, -> `j` json DEFAULT NULL, -> `foo` varchar(30) GENERATED ALWAYS AS (jsn_extract(j, '$.foo')) VIRTUAL, -> UNIQUE KEY `id` (`id`), -> KEY `foo` (`foo`) -> ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1; Query OK, 0 rows affected (0.06 sec) mysql> insert into t1(id,j) values(NULL,'{"foo": "test"}'); Query OK, 1 row affected (0.01 sec) mysql> select * from t1; +----+-----------------+--------+ | id | j | foo | +----+-----------------+--------+ | 4 | {"foo": "test"} | "test" | +----+-----------------+--------+ 1 row in set (0.00 sec) mysql> select * from t1 ignore index(foo) where jsn_extract(j, '$.foo')='test'; +----+-----------------+--------+ | id | j | foo | +----+-----------------+--------+ | 4 | {"foo": "test"} | "test" | +----+-----------------+--------+ 1 row in set (0.00 sec) mysql> select * from t1 where jsn_extract(j, '$.foo')='test'; Empty set (0.02 sec) mysql> select * from t1 where jsn_extract(j, '$.foo')='"test"'; +----+-----------------+--------+ | id | j | foo | +----+-----------------+--------+ | 4 | {"foo": "test"} | "test" | +----+-----------------+--------+ 1 row in set (0.00 sec) mysql> select * from t1 ignore index(foo) where jsn_extract(j, '$.foo')='"test"'; Empty set (0.00 sec)
[21 May 2015 4:46]
Erlend Dahl
[19 May 2015 0:48] Evgeny Potemkin Not a bug, but a feature. In JSON strings have special treatment, as they could represent a scalar string and a JSON string. Server can't tell one from another, so user have to quote scalar strings. For consistency reasons JSON valued jsn_* funcs also return quoted strings. When no matching index is available, unquoting is handled by JSON comparator internally and invisible to user. When index is used, JSON comparator can't be used and unquoting have to be handled by user. All this leads to reported behavior. In short - user has to create string indexes with jsn_unquote function, e.g ... gc varchar(...) as (jsn_unquote(jsn_extract(...))) ...
[6 Oct 2015 13:38]
Daniël van Eeden
Also happens on 5.7.9 mysql [localhost] {msandbox} (test) > SHOW CREATE TABLE t1\G *************************** 1. row *************************** Table: t1 Create Table: CREATE TABLE `t1` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `j` json DEFAULT NULL, `foo` varchar(30) GENERATED ALWAYS AS (json_extract(j, '$.foo')) VIRTUAL, UNIQUE KEY `id` (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql [localhost] {msandbox} (test) > SELECT * FROM t1; +----+-----------------+--------+ | id | j | foo | +----+-----------------+--------+ | 2 | {"foo": "test"} | "test" | +----+-----------------+--------+ 1 row in set (0.00 sec) mysql [localhost] {msandbox} (test) > select * from t1 where json_extract(j, '$.foo')='test'; +----+-----------------+--------+ | id | j | foo | +----+-----------------+--------+ | 2 | {"foo": "test"} | "test" | +----+-----------------+--------+ 1 row in set (0.00 sec) mysql [localhost] {msandbox} (test) > select * from t1 where json_extract(j, '$.foo')='"test"'; Empty set (0.00 sec) mysql [localhost] {msandbox} (test) > ALTER TABLE t1 ADD INDEX(foo); Query OK, 0 rows affected (0.04 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql [localhost] {msandbox} (test) > select * from t1 where json_extract(j, '$.foo')='test'; Empty set (0.00 sec) mysql [localhost] {msandbox} (test) > select * from t1 where json_extract(j, '$.foo')='"test"'; +----+-----------------+--------+ | id | j | foo | +----+-----------------+--------+ | 2 | {"foo": "test"} | "test" | +----+-----------------+--------+ 1 row in set (0.00 sec)
[6 Oct 2015 14:37]
Daniël van Eeden
mysql [localhost] {msandbox} (test) > SHOW CREATE TABLE t1\G *************************** 1. row *************************** Table: t1 Create Table: CREATE TABLE `t1` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `j` json DEFAULT NULL, `foo` varchar(30) GENERATED ALWAYS AS (json_extract(j, '$.foo')) VIRTUAL, UNIQUE KEY `id` (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql [localhost] {msandbox} (test) > select *, json_extract(j, '$.foo')='test' as j2 from t1; +----+----------------------------+-------------------+------+ | id | j | foo | j2 | +----+----------------------------+-------------------+------+ | 2 | {"foo": "test"} | "test" | 1 | | 3 | {"foo": 5} | 5 | 0 | | 4 | {"foo": {"bar": "barfoo"}} | {"bar": "barfoo"} | 0 | | 5 | {"foo": true} | true | 0 | | 7 | {"bar": 7.2} | NULL | NULL | | 8 | {"foo": 7.2} | 7.2 | 0 | | 9 | {"foo": null} | null | 0 | +----+----------------------------+-------------------+------+ 7 rows in set (0.00 sec) mysql [localhost] {msandbox} (test) > select *, json_extract(j, '$.foo')='true' as j2 from t1; +----+----------------------------+-------------------+------+ | id | j | foo | j2 | +----+----------------------------+-------------------+------+ | 2 | {"foo": "test"} | "test" | 0 | | 3 | {"foo": 5} | 5 | 0 | | 4 | {"foo": {"bar": "barfoo"}} | {"bar": "barfoo"} | 0 | | 5 | {"foo": true} | true | 0 | | 7 | {"bar": 7.2} | NULL | NULL | | 8 | {"foo": 7.2} | 7.2 | 0 | | 9 | {"foo": null} | null | 0 | +----+----------------------------+-------------------+------+ 7 rows in set (0.00 sec) mysql [localhost] {msandbox} (test) > ALTER TABLE t1 ADD INDEX(foo); Query OK, 0 rows affected (0.04 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql [localhost] {msandbox} (test) > select *, json_extract(j, '$.foo')='test' as j2 from t1; +----+----------------------------+-------------------+------+ | id | j | foo | j2 | +----+----------------------------+-------------------+------+ | 2 | {"foo": "test"} | "test" | 1 | | 3 | {"foo": 5} | 5 | 0 | | 4 | {"foo": {"bar": "barfoo"}} | {"bar": "barfoo"} | 0 | | 5 | {"foo": true} | true | 0 | | 7 | {"bar": 7.2} | NULL | NULL | | 8 | {"foo": 7.2} | 7.2 | 0 | | 9 | {"foo": null} | null | 0 | +----+----------------------------+-------------------+------+ 7 rows in set (0.00 sec) mysql [localhost] {msandbox} (test) > select *, json_extract(j, '$.foo')='true' as j2 from t1; +----+----------------------------+-------------------+------+ | id | j | foo | j2 | +----+----------------------------+-------------------+------+ | 2 | {"foo": "test"} | "test" | 0 | | 3 | {"foo": 5} | 5 | 0 | | 4 | {"foo": {"bar": "barfoo"}} | {"bar": "barfoo"} | 0 | | 5 | {"foo": true} | true | 0 | | 7 | {"bar": 7.2} | NULL | NULL | | 8 | {"foo": 7.2} | 7.2 | 0 | | 9 | {"foo": null} | null | 0 | +----+----------------------------+-------------------+------+ 7 rows in set (0.00 sec) mysql [localhost] {msandbox} (test) > SELECT * FROM t1 WHERE json_extract(j, '$.foo')='test'; Empty set (0.00 sec) mysql [localhost] {msandbox} (test) > SELECT * FROM t1 WHERE json_extract(j, '$.foo')='true'; +----+---------------+------+ | id | j | foo | +----+---------------+------+ | 5 | {"foo": true} | true | +----+---------------+------+ 1 row in set (0.00 sec) mysql [localhost] {msandbox} (test) > ALTER TABLE t1 DROP INDEX foo; Query OK, 0 rows affected (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql [localhost] {msandbox} (test) > SELECT * FROM t1 WHERE json_extract(j, '$.foo')='true'; Empty set (0.00 sec) mysql [localhost] {msandbox} (test) > SELECT * FROM t1 WHERE json_extract(j, '$.foo')='test'; +----+-----------------+--------+ | id | j | foo | +----+-----------------+--------+ | 2 | {"foo": "test"} | "test" | +----+-----------------+--------+ 1 row in set (0.00 sec)
[7 Oct 2015 7:21]
Daniël van Eeden
This is how PostgreSQL 9.4 behaves with a similar table/index: postgres=# SELECT VERSION(); version ----------------------------------------------------------------------------------------------- PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit (1 row) postgres=# CREATE SCHEMA jtest; CREATE SCHEMA postgres=# CREATE TABLE jtest.t1 ( postgres(# id SERIAL, postgres(# j json DEFAULT NULL postgres(# ); CREATE TABLE postgres=# INSERT INTO jtest.t1(j) VALUES('{"foo": "test"}'); INSERT 0 1 postgres=# SELECT *, json_extract_path(j, 'foo') FROM jtest.t1; id | j | json_extract_path ----+-----------------+------------------- 1 | {"foo": "test"} | "test" (1 row) postgres=# SELECT *, j->>'foo' FROM jtest.t1; id | j | ?column? ----+-----------------+---------- 1 | {"foo": "test"} | test (1 row) postgres=# SELECT *, j->>'foo' FROM jtest.t1 WHERE j->>'foo' = 'test'; id | j | ?column? ----+-----------------+---------- 1 | {"foo": "test"} | test (1 row) postgres=# EXPLAIN SELECT *, j->>'foo' FROM jtest.t1 WHERE j->>'foo' = 'test'; QUERY PLAN ---------------------------------------------------- Seq Scan on t1 (cost=0.00..28.46 rows=6 width=36) Filter: ((j ->> 'foo'::text) = 'test'::text) (2 rows) postgres=# CREATE INDEX idxjson1 ON jtest.t1 ((j->>'foo')); CREATE INDEX postgres=# SET enable_seqscan = OFF; SET postgres=# EXPLAIN SELECT *, j->>'foo' FROM jtest.t1 WHERE j->>'foo' = 'test'; QUERY PLAN -------------------------------------------------------------------- Index Scan using idxjson1 on t1 (cost=0.13..8.15 rows=1 width=36) Index Cond: ((j ->> 'foo'::text) = 'test'::text) (2 rows) postgres=# SELECT *, j->>'foo' FROM jtest.t1 WHERE j->>'foo' = 'test'; id | j | ?column? ----+-----------------+---------- 1 | {"foo": "test"} | test (1 row) postgres=# postgres=# CREATE TABLE jtest.t2 ( postgres(# id SERIAL, postgres(# j jsonb DEFAULT NULL postgres(# ); CREATE TABLE postgres=# INSERT INTO jtest.t2(j) SELECT j::jsonb FROM jtest.t1; INSERT 0 1 postgres=# SELECT *, jsonb_extract_path(j, 'foo') FROM jtest.t2; id | j | jsonb_extract_path ----+-----------------+-------------------- 1 | {"foo": "test"} | "test" (1 row) postgres=# SELECT *, j->>'foo' FROM jtest.t2; id | j | ?column? ----+-----------------+---------- 1 | {"foo": "test"} | test (1 row) postgres=# SELECT *, j->>'foo' FROM jtest.t2 WHERE j->>'foo' = 'test'; id | j | ?column? ----+-----------------+---------- 1 | {"foo": "test"} | test (1 row) postgres=# SELECT *, j->>'foo' FROM jtest.t2 WHERE j @> '{"foo": "test"}'; id | j | ?column? ----+-----------------+---------- 1 | {"foo": "test"} | test (1 row) postgres=# EXPLAIN SELECT *, j->>'foo' FROM jtest.t2 WHERE j @> '{"foo": "test"}'; QUERY PLAN ----------------------------------------------------------------------- Seq Scan on t2 (cost=10000000000.00..10000000025.38 rows=1 width=36) Filter: (j @> '{"foo": "test"}'::jsonb) (2 rows) postgres=# CREATE INDEX idxjsonops ON jtest.t2 USING gin(j jsonb_path_ops); CREATE INDEX postgres=# EXPLAIN SELECT *, j->>'foo' FROM jtest.t2 WHERE j @> '{"foo": "test"}'; QUERY PLAN ------------------------------------------------------------------------- Bitmap Heap Scan on t2 (cost=8.00..12.02 rows=1 width=36) Recheck Cond: (j @> '{"foo": "test"}'::jsonb) -> Bitmap Index Scan on idxjsonops (cost=0.00..8.00 rows=1 width=0) Index Cond: (j @> '{"foo": "test"}'::jsonb) (4 rows) postgres=# SELECT *, j->>'foo' FROM jtest.t2 WHERE j @> '{"foo": "test"}'; id | j | ?column? ----+-----------------+---------- 1 | {"foo": "test"} | test (1 row) postgres=# SET enable_seqscan = ON; SET postgres=# DROP SCHEMA jtest CASCADE; NOTICE: drop cascades to 2 other objects DETAIL: drop cascades to table jtest.t1 drop cascades to table jtest.t2 DROP SCHEMA
[7 Oct 2015 9:41]
Knut Anders Hatlen
Looks like we were a bit too quick to close this bug. More or less the same problem was later reported in Bug #78464, and that bug has a patch in the pipeline, which should also fix the problem reported here. The planned fix will disable the use of the index in the query, so that it returns the same results regardless of the presence of the index. If using an index is desired, one could use JSON_UNQUOTE in the generated column, as suggested in an earlier comment.