Bug #80712 | Invalid data for ST_GeomFromGeoJSON() is hard to diagnose | ||
---|---|---|---|
Submitted: | 12 Mar 2016 11:17 | Modified: | 31 May 2016 15:46 |
Reporter: | Daniël van Eeden (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: GIS | Severity: | S3 (Non-critical) |
Version: | 5.7.11 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | geo, geojson, gis, json, spatial |
[12 Mar 2016 11:17]
Daniël van Eeden
[12 Mar 2016 11:20]
Daniël van Eeden
Related: https://gist.github.com/dveeden/c5b9fbf56e4559d9f400 https://www.percona.com/blog/2016/03/07/json-document-fast-lookup-with-mysql-5-7/ http://databaseblog.myname.nl/2016/03/re-json-document-fast-lookup-with-mysql.html
[12 Mar 2016 11:57]
Daniël van Eeden
This works: SELECT ST_geomFromGeoJSON(feature->'$.geometry',2) FROM test_features WHERE json_type(feature->'$.geometry')<>'NULL' The invalid data is for rows like this where $.geometry is null: mysql> SELECT feature, feature->'$.geometry' FROM test_features WHERE id=45504\G *************************** 1. row *************************** feature: {"type": "Feature", "geometry": null, "properties": {"TO_ST": null, "BLKLOT": "1245061", "STREET": null, "FROM_ST": null, "LOT_NUM": "061", "ST_TYPE": null, "ODD_EVEN": null, "BLOCK_NUM": "1245", "MAPBLKLOT": "1245061"}} feature->'$.geometry': null 1 row in set (0.00 sec)
[12 Mar 2016 11:58]
Daniël van Eeden
Adding a ST_IsValidGeoJSON() function would be nice. Also it would be nice to handle null/NULL values differently to make this work for virtual columns
[15 Mar 2016 6:56]
MySQL Verification Team
Hello Daniël, Thank you for the report. Thanks, Umesh
[15 Mar 2016 6:57]
MySQL Verification Team
-- -- 5.7.11 Database changed mysql> CREATE TABLE `test_features` ( -> `id` int(11) NOT NULL AUTO_INCREMENT, -> `feature` json NOT NULL, -> PRIMARY KEY (`id`) -> ) ENGINE=InnoDB; ERROR 1050 (42S01): Table 'test_features' already exists mysql> drop table test_features; Query OK, 0 rows affected (0.01 sec) mysql> CREATE TABLE `test_features` ( -> `id` int(11) NOT NULL AUTO_INCREMENT, -> `feature` json NOT NULL, -> PRIMARY KEY (`id`) -> ) ENGINE=InnoDB; Query OK, 0 rows affected (0.01 sec) - Insert mysql> INSERT INTO test.test_features(feature) VALUES('{ "type": "Feature", "geometry": { "type": "Polygon", "coordinates": [ [ [ -122.41983177253881, 37.80720512387136, 0 ], [ -122.41983177253881, 37.80720512387136, 0 ] ] ] }, "properties": { "TO_ST": "600", "BLKLOT": "0010001", "STREET": "BEACH", "FROM_ST": "600", "LOT_NUM": "001", "ST_TYPE": "ST", "ODD_EVEN": "E", "BLOCK_NUM": "0010", "MAPBLKLOT": "0010001" } }') mysql> ALTER TABLE test_features ADD COLUMN `geometry` geometry GENERATED ALWAYS AS (ST_GeomFromGeoJSON(feature->'$.geometry',2)) VIRTUAL; Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> ALTER TABLE test_features ADD COLUMN `street` varchar(30) GENERATED ALWAYS AS (json_unquote(json_extract(`feature`,'$.properties.STREET'))) VIRTUAL; Query OK, 0 rows affected (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> show create table test_features\G *************************** 1. row *************************** Table: test_features Create Table: CREATE TABLE `test_features` ( `id` int(11) NOT NULL AUTO_INCREMENT, `feature` json NOT NULL, `geometry` geometry GENERATED ALWAYS AS (st_geomfromgeojson(json_extract(`feature`,'$.geometry'),2)) VIRTUAL, `street` varchar(30) GENERATED ALWAYS AS (json_unquote(json_extract(`feature`,'$.properties.STREET'))) VIRTUAL, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql> SELECT COUNT(`geometry`) FROM test_features; ERROR 3072 (HY000): Invalid GeoJSON data provided to function st_geomfromgeojson mysql> SELECT LENGTH(`geometry`) FROM test_features; ERROR 3072 (HY000): Invalid GeoJSON data provided to function st_geomfromgeojson mysql>
[22 Mar 2016 9:12]
Erik Frøseth
Posted by developer: Here is a small analysis of the problem: The GeoJSON standard specifies the following: "A complete GeoJSON data structure is always an object (in JSON terms).". That means that if the input to ST_GeomFromGeoJSON is NOT a JSON object, it is considered illegal. And as with almost all other MySQL functions, if any of the input arguments to ST_GeomFromGeoJSON is SQL NULL, we return SQL NULL. The use case reported can be boiled down to the following statements for simplicity: CREATE TABLE t1 (col1 JSON); INSERT INTO t1 VALUES ('{ "type": "Feature", "geometry": null, "properties": {} }'); ALTER TABLE t1 ADD COLUMN `col2` geometry GENERATED ALWAYS AS (ST_GeomFromGeoJSON(col1->'$.geometry')) VIRTUAL; SELECT * FROM t1; The last SELECT fails, because when the virtual column 'col2' tries to extract the property 'geometry' from the JSON field, it gets a JSON NULL and not a SQL NULL. Hence, it is considered as an illegal/invalid GeoJSON input. The fix for this is rather simple; Treat JSON NULL as a SQL NULL in ST_GeomFromGeoJSON. The consequence of this fix is that the following SQL becomes valid: SELECT ST_GeomFromGeoJSON(CAST('null' AS JSON)); I do not see any issues with allowing the above statement, as the use case given by the reported seems like a rather common use case. So I suggest that we handle JSON NULL as a SQL NULL in ST_GeomFromGeoJSON. I'm not sure though whether this should be considered as a bug or feature request...
[31 May 2016 15:46]
Paul DuBois
Posted by developer: Noted in 5.7.14 changelog. ST_GeomFromGeoJSON() treated JSON NULL as invalid input. Now it treats JSON NULL as SQL NULL and thus returns SQL NULL for JSON NULL input.