Bug #69798 OGC Well-Known-Binary values have an invalid prefix of four zero bytes
Submitted: 19 Jul 2013 22:02 Modified: 17 Sep 2013 13:35
Reporter: Craig Bruce Email Updates:
Status: Not a Bug Impact on me:
Category:MySQL Server: GIS Severity:S3 (Non-critical)
Version:5.5.23, 5.6.13 OS:Any
Assigned to: CPU Architecture:Any
Tags: gis, ogc, wkb
Triage: Needs Triage: D4 (Minor)

[19 Jul 2013 22:02] Craig Bruce
Using MySQL Ver 14.14 Distrib 5.5.23, for Linux (x86_64), if I retrieve a properly functioning polygon geometry in hexadecimal with:

mysql> select hex(footprint) from IMAGES limit 1;

I get:


This is *not* a valid OGC Well-Known-Binary value since it is preceded by four zero bytes.  The first byte of the proper WKB value is the 01 byte, indicating little-endian, which is followed by uint32 value 3, indicating a Polygon.  I get the same behaviour if I retrieve WKB values through the C Connector interface and MySQL insists that I add these four leading zeros to valid WKB values when inserting new records.  WKB is defined in OGC Simple Features document 06-103r3:


Why are these extra four bytes here?  Is this some bug relating to the stored blob starting with a 32-bit length value?

How to repeat:
select hex(geometry_column), or retrieve WKB value through C Connector.

Suggested fix:
Remove the invalid four-byte prefix from reported WKB values.  Or, if this behaviour is intended, then it should be explained that the WKB value always has a four-byte prefix in MySQL.
[23 Jul 2013 19:33] Sveta Smirnova
Thank you for the report.

At the very first page of the standard you quote there is a text:

This standard establishes a common architecture and defines terms to use within the architecture.
This standard does not attempt to standardize and does not depend upon any part of the mechanism by which
Types are added and maintained, including the following:
a) syntax and functionality provided for defining types;
b) syntax and functionality provided for defining functions;
c) physical storage of type instances in the database;

Why do you think that MySQL breaks the standard while the standard does not say it must follow storage guidelines, described there?
[23 Jul 2013 22:22] Craig Bruce
"Why do you think that MySQL breaks the standard while the standard does not say it must follow storage guidelines, described there?"

It's an arbitrary deviation from a byte-for-byte definition in the OGC specification for no apparent reason.  Also, the four superfluous bytes contradict the description given directly in the MySQL documentation:


which says, "For example, a WKB value that corresponds to POINT(1 1) consists of this sequence of 21 bytes (each represented here by two hex digits):


In reality, MySQL will return this:


This is arbitrarily and confusingly different from both the OGC WKB definition and the MySQL documentation.  Sure sounds like a bug to me!  You need to fix either the documentation or the implementation.
[24 Jul 2013 18:19] Sveta Smirnova
Thank you for the feedback.

Makes sense. Verified as described:

mysql> select hex( POINT(1, 1));
| hex( POINT(1, 1))                                  |
| 000000000101000000000000000000F03F000000000000F03F |
1 row in set (0.02 sec)
[17 Sep 2013 13:35] Erlend Dahl
[9 Sep 2013 21:05] David Zhao

This is not a bug, GeometryFromText/WKB is designed to return a byte string
of this format: 4 bytes(SRID) prepending the WKB encoded byte string of the
geometry object, other parts of MySQL expects such a geometry object
encoding. Consequently, select hex(ST_GeomFromText('POINT(1 1)')); will
return this result:

mysql> select hex(ST_GeomFromText('POINT(1 1)'));
| hex(ST_GeomFromText('POINT(1 1)'))                 |
| 000000000101000000000000000000F03F000000000000F03F |
1 row in set (0.00 sec)

If user wants the wkb encoded bytes of a geometry object, use AsWKB or
ST_AsWKB function instead, e.g. select hex(st_aswkb(ST_GeomFromText('POINT(1
1)'))); will return the expected WKB encoding of the point:

mysql> select hex(st_aswkb(ST_GeomFromText('POINT(1 1)')));
| hex(st_aswkb(ST_GeomFromText('POINT(1 1)'))) |
| 0101000000000000000000F03F000000000000F03F   |
1 row in set (0.00 sec)