Description:
JSON partial updates using (JSON_SET, JSON_REPLACE, JSON_REMOVE) and other similar functions prevent the free space from being released back to disk.
Direct assignments needs to be do like below in order to release the space after optimize command.
Direct assignments needs to be do like below in order to release the space after optimize command.
How to repeat:
1. Creating a JSON table and pushing some data. The same data can be verified via the hex dump as well.
mysql> CREATE TABLE test01 (a JSON);
Query OK, 0 rows affected (0.01 sec)
mysql> insert into test01 (a) values ( json_object('a',repeat('a',1*1024*1024)) );
Query OK, 1 row affected (0.07 sec)
mysql> insert into test01 (a) values ( json_object('a',repeat('a',1*1024*1024)) );
Query OK, 1 row affected (0.04 sec)
hexdump -C test01.ibd
0028bff0 61 61 61 61 61 61 61 61 38 1b ed 99 20 91 fa 20 |aaaaaaaa8... .. |
0028c000 98 73 6d 50 00 00 00 a3 00 00 00 00 ff ff ff ff |.smP............|
0028c010 00 00 00 09 20 91 fa 20 00 17 00 00 00 00 00 00 |.... .. ........|
0028c020 00 00 00 00 02 30 00 00 00 10 df 00 00 00 01 0b |.....0..........|
0028c030 0e 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 |.aaaaaaaaaaaaaaa|
0028c040 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 |aaaaaaaaaaaaaaaa|
*
0028d110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0028fff0 00 00 00 00 00 00 00 00 98 73 6d 50 20 91 fa 20 |.........smP .. |
00290000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
mysql> select json_storage_size(a), json_storage_free(a) from test01;
+----------------------+----------------------+
| json_storage_size(a) | json_storage_free(a) |
+----------------------+----------------------+
| 1048600 | 0 |
| 1048600 | 0 |
+----------------------+----------------------+
2 rows in set (0.01 sec)
2. Now, we will do a partial update using the JSON_REPLACE function. In the hex dump, we can see the previous details with the pattern “a“ still appearing and not cleaned yet.
mysql> UPDATE test01 SET a = JSON_REPLACE(a, '$.a', 'b');
Query OK, 2 rows affected (0.03 sec)
Rows matched: 2 Changed: 2 Warnings: 0
mysql> select * from test01;
+------------+
| a |
+------------+
| {"a": "b"} |
| {"a": "b"} |
+------------+
2 rows in set (0.01 sec)
hexdump -C test01.ibd
001c7ff0 61 61 61 61 61 61 61 61 25 c5 15 d3 20 85 98 62 |aaaaaaaa%... ..b|
001c8000 eb 64 b6 0b 00 00 00 72 00 00 00 00 ff ff ff ff |.d.....r........|
001c8010 00 00 00 09 20 85 98 62 00 17 00 00 00 00 00 00 |.... ..b........|
001c8020 00 00 00 00 02 30 00 00 00 3f c7 00 00 00 01 0b |.....0...?......|
001cbff0 61 61 61 61 61 61 61 61 eb 64 b6 0b 20 85 98 62 |aaaaaaaa.d.. ..b|
001cc010 00 00 00 09 20 85 98 62 00 17 00 00 00 00 00 00 |.... ..b........|
001cc020 00 00 00 00 02 30 00 00 00 3f c7 00 00 00 01 0b |.....0...?......|
mysql> select json_storage_size(a), json_storage_free(a) from test01;
+----------------------+----------------------+
| json_storage_size(a) | json_storage_free(a) |
+----------------------+----------------------+
| 1048600 | 1048577 |
| 1048600 | 1048577 |
+----------------------+----------------------+
2 rows in set (0.00 sec)
3. Now, we will try to reclaim the space by running the optimize command. Ideally, after releasing the space, it would be in a few KBs, but we can still see it shows ~5 MB from ~10 MB.
ls -lh test01.ibd
-rw-r----- 1 aniljoshi admin 10M Sep 13 16:58 test01.ibd
mysql> optimize table test01;
+-------------+----------+----------+-------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+----------+----------+-------------------------------------------------------------------+
| test.test01 | optimize | note | Table does not support optimize, doing recreate + analyze instead |
| test.test01 | optimize | status | OK |
+-------------+----------+----------+-------------------------------------------------------------------+
2 rows in set (0.08 sec)
test % ls -lh test01.ibd
-rw-r----- 1 aniljoshi admin 5.0M Sep 13 17:01 test01.ibd
4. Now, after we run a direct update query which introduces new field, the space is released fully. It showing ~112 KB, which should be the correct behaviour.
mysql> update test01 set a = '{"a": "b", "c":"d"}';
Query OK, 2 rows affected (0.03 sec)
Rows matched: 2 Changed: 2 Warnings: 0
mysql> select * from test01
-> ;
+----------------------+
| a |
+----------------------+
| {"a": "b", "c": "d"} |
| {"a": "b", "c": "d"} |
+----------------------+
2 rows in set (0.00 sec)
ls -lh test01.ibd
-rw-r----- 1 aniljoshi admin 112K Sep 13 17:12 test01.ibd
mysql> select json_storage_size(a), json_storage_free(a) from test01;
+----------------------+----------------------+
| json_storage_size(a) | json_storage_free(a) |
+----------------------+----------------------+
| 25 | 0 |
| 25 | 0 |
+----------------------+----------------------+
2 rows in set (0.00 sec)
Suggested fix:
The point is that it should be better to have some linkage in the optimise table or similar commands to release the space without any such dual efforts, which also seems a bit impractical and unsafe.