| Bug #80556 | inserting large json data takes an inordinate amount of time | | |
|---|---|---|---|
| Submitted: | 29 Feb 2016 12:05 | Modified: | 13 Jun 2016 16:39 |
| Reporter: | Shane Bester (Platinum Quality Contributor) | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
| Version: | 5.7.12 | OS: | Any |
| Assigned to: | | CPU Architecture: | Any |
[29 Feb 2016 12:27]
MySQL Verification Team
5.7.12 perf report for the slow testcase
Attachment: bug80556_5.7.12_perf_report.zip (application/x-zip-compressed, text), 197.08 KiB.
[13 Jun 2016 16:39]
Jon Stephens
General optimizer bug, not limited to JSON.
Documented in the MySQL 5.7.14 changelog as follows:
A flaw in the allocation of memory for appending string
values caused some operations to be performed slowly, including
creation of tables having generated columns and inserts of large
values into JSON columns.
Closed.
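
The changelog entry above attributes the slowdown to how memory was allocated while appending string values. The exact server-side flaw is not spelled out in this report, but the general failure mode is easy to illustrate: if the append buffer grows by a fixed increment instead of geometrically, every growth step re-copies everything written so far, and building a multi-megabyte value out of millions of tiny appends becomes quadratic. The sketch below is a hypothetical, minimal illustration (it is not the server's String class; the type and field names are invented):

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical append buffer (not MySQL's String class). Capacity grows
// either geometrically (doubling) or by a small fixed increment; every
// growth allocates a new block and copies the bytes written so far.
struct AppendBuffer {
  char *data = nullptr;
  size_t len = 0, cap = 0;
  bool doubling;

  explicit AppendBuffer(bool use_doubling) : doubling(use_doubling) {}
  ~AppendBuffer() { free(data); }

  void append(const char *src, size_t n) {
    if (len + n > cap) {
      size_t new_cap = doubling ? (cap ? cap * 2 : 4096) : len + n + 4096;
      while (new_cap < len + n) new_cap *= 2;
      char *bigger = static_cast<char *>(malloc(new_cap));
      if (data != nullptr) memcpy(bigger, data, len);  // re-copy of all prior bytes
      free(data);
      data = bigger;
      cap = new_cap;
    }
    memcpy(data + len, src, n);
    len += n;
  }
};

int main() {
  // Roughly mirrors the test case: ~1.5 million tiny appends (~6 MB total).
  AppendBuffer fixed_growth(false), geometric(true);
  for (int i = 0; i < 1500000; ++i) {
    fixed_growth.append("\"a\",", 4);  // slow: full re-copy about every 1024 appends
    geometric.append("\"a\",", 4);     // fast: amortized constant-time append
  }
  printf("%zu bytes in each buffer\n", fixed_growth.len);
  return 0;
}
```

Both buffers end up holding the same ~6 MB of data; only the fixed-growth one pays for repeated full-buffer copies along the way.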

Description:
Inserting large data into a json field takes an excessive amount of time.

mysql> select json_type(@a),length(@a),length(@a)*3;
+---------------+------------+--------------+
| json_type(@a) | length(@a) | length(@a)*3 |
+---------------+------------+--------------+
| ARRAY         |    5600005 |     16800015 |
+---------------+------------+--------------+
1 row in set (0.65 sec)

mysql> insert into t(a) values(@a);
Query OK, 1 row affected (7 min 20.87 sec)

----------
If running "set names latin1" before insert, it takes < 1 second.

How to repeat:
#compare each of these inserts.
set global max_allowed_packet=1024*1024*16;
\r
set names utf8;
drop table if exists t;
create table t(a json)engine=innodb;
set @a:=concat('[',repeat('"a",',1500000),'"a"]');
select json_type(@a),length(@a),length(@a)*3;
insert into t(a) values(@a);

\r
set names latin1;
drop table if exists t;
create table t(a json)engine=innodb;
set @a:=concat('[',repeat('"a",',1500000),'"a"]');
select json_type(@a),length(@a),length(@a)*3;
insert into t(a) values(@a);

Suggested fix:
The code in json_binary.cc already has comments hinting at the fix?

  /*
    If the array was too large to fit in the small storage format, reset the
    destination buffer and retry with the large storage format.

    Possible future optimization: Analyze size up front and pick the correct
    format on the first attempt, so that we don't have to redo parts of the
    serialization.
  */
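
The json_binary.cc comment quoted above already describes the possible fix: analyze the size up front and choose the small or large storage format before serializing, instead of serializing in the small format and retrying when it overflows. Below is a minimal, hedged sketch of that idea; the function names, header layout, and size formula are assumptions made up for illustration and are not taken from the server source:

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical sketch of the "analyze size up front" optimization; none of
// these names exist in the actual server source. Instead of serializing in
// the small storage format and restarting in the large format when an
// offset overflows 16 bits, the size is estimated first so serialization
// runs only once.

enum class Format { SMALL, LARGE };  // 16-bit vs 32-bit offsets and sizes

// Worst-case serialized size of one scalar element (assumed layout:
// a type byte, a length field, then the value bytes).
static size_t estimated_element_size(const std::string &scalar) {
  return 1 + 4 + scalar.size();
}

// Choose the storage format from an upper bound on the total size, so the
// small format is only attempted when it is guaranteed to fit.
static Format pick_format(const std::vector<std::string> &elements) {
  size_t total = 2 * sizeof(uint16_t)                 // element count + total size
               + elements.size() * sizeof(uint16_t);  // offset table
  for (const std::string &e : elements) total += estimated_element_size(e);
  return total <= UINT16_MAX ? Format::SMALL : Format::LARGE;
}

int main() {
  // The array from the test case: 1,500,001 copies of the string "a".
  std::vector<std::string> elements(1500001, "a");
  Format fmt = pick_format(elements);
  printf("chosen format: %s\n", fmt == Format::SMALL ? "small" : "large");
  // A real serializer would now write the binary value in a single pass
  // using the chosen offset width, instead of writing and then retrying.
  return 0;
}
```

With an up-front over-estimate like this, the small format is only attempted when it is guaranteed to fit, so the serializer never has to discard partially written output and start over.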