Bug #96466 .ibd file grows indefinitely for tables with blob columns
Submitted: 8 Aug 2019 10:30 Modified: 31 Jan 2020 12:51
Reporter: Deepak Dhage Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:8.0.17 OS:CentOS
Assigned to: CPU Architecture:Any
Tags: ibd, memory leak, Tablespace

[8 Aug 2019 10:30] Deepak Dhage
Description:
If a table has a column with blob type (blob, mediumblob, longblob), and we keep updating a row with different values (having different size) for blob column, the memory allocated for old data is never released. With each update, size of ibd file keeps growing.

How to repeat:
Settings - 
innodb_file_per_table=ON

Setup - 

CREATE DATABASE sample;

USE sample;

CREATE TABLE `my_sample_table` (`id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(128) NOT NULL, `some_data` mediumblob, `update_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`id`), UNIQUE KEY `name` (`name`)) ENGINE=InnoDB DEFAULT CHARSET=utf8

INSERT INTO my_sample_table (id, name, some_data) VALUES (1,'data1', load_file('/var/lib/mysql-files/file2.dat'));

-- Run below queries from shell in for loop (for say, 100 times or more to see memory growing)
-- for starts (i = 1 to 100)
UPDATE my_sample_table SET some_data = load_file('/var/lib/mysql-files/file1.dat') WHERE id = 1;
UPDATE my_sample_table SET some_data = load_file('/var/lib/mysql-files/file2.dat') WHERE id = 1;
-- for ends

File1.dat  and file2.dat should be of considerably different sizes (say, for example, file1 is of size 1MB and file2, 10MB).

Keep a check on this file for size - /var/lib/mysql/sample/my_sample_table.ibd

Size of this ibd file keeps growing on MySQL 8.x version as we keep running update queries.

The MySQL versions used are - 8.0.13 and 8.0.17.
Behaviour on old version - On MySQL 5.6.44, the size of ibd file remains constant after running above queries 3 to 4 times. No matter how many times we run queries again, the size of ibd file never grows beyond this size.

Suggested fix:
OPTIMIZE TABLE query does release unused space but we can't keep doing this on production. There is no fix identified so far.
[8 Aug 2019 12:34] MySQL Verification Team
Hi Mr. Dhage,

Thank you for your bug report.

Your report is quite unclear whether you report too much memory being used or too much disk space being used.  I will assume the latter.

What you are experiencing is exactly how InnoDB storage engine is designed. In choosing between the speed and disk space usage minimisation, the creators opted for the speed. Growing of the .ibd files , or any other InnoDB tablespaces is explained in our Reference Manual, chapter on the InnoDB storage engine.

That is one of the reason why OPTIMIZE TABLE has been added on .... Whether you want to use it or not, it is your choice.

Not a bug.
[8 Aug 2019 12:42] MySQL Verification Team
related fixes in 8.0.16 where the purge was buggy.
 
Bug 28510599 - THE MYSQL.IBD FILE SIZE IS CONSTANTLY GROWING.
Bug 29649191 - 8.0 : SINGLE INNODB TABLESPACE GROWS CONTINUOUSLY WITH ONLY TWO ROWS IN TABLE
[8 Aug 2019 12:58] MySQL Verification Team
Hi,

Actually, I would like to test your report.

Can you provide me with those two files. You can use "Files" tab.

There were two internal bugs with similar reports that were fixed in 8.0.16.
[8 Aug 2019 15:07] Deepak Dhage
Thanks Sinisa for your prompt response and my sincere apologies, if I was not clear enough while reporting the bug.
To answer your first question, its disk space which is of concern here. The size of ibd file (/var/lib/mysql/sample/my_sample_table.ibd) is eating up too much space. 
We are in process of upgrading our database to version 8 from old version 5.6. So our master node will run on MySQL 5.6 and slave will run on 8 for some time. We will gradually upgrade all the nodes.
We are getting disk space full alert every month on slave running on MySQL 8 whereas we don't get this on it master node which runs MySQL 5.6. The ibd file is growing a lot for tables having columns with type mediumblob or mediumtext.

I do agree that OPTIMIZE TABLE should be run to defrag the storage and claim the unused disk space but running this on production every month for selected tables is a real concern. Moreover this did not happen on master node running MySQL 5.6 where this table exists for more than 2 years now.

Regarding files I have used in my analysis, those were just any random files I have picked from my test server. You can pick any files from your machine with text data so that you can select and see if those are updated in DB properly. With file of 1 MB and 4 MB, ibd file grows till 12 to 13 MB and stops growing further if I keep updating the blob column with same files again (tried 5000 iterations) whereas on MySQL 8 node for same test files size grown to 21 GB for 5000 iterations. This doesn't look like intentionally designed for optimising for speed over disk space.

I hope, this will help you investigate this further. Let me know, if you need any additional information, I would be happy to provide.
[8 Aug 2019 15:13] Deepak Dhage
Here are the details of ibd files after running above test -

MySQL 5.6 - 
bash-4.2# ls -lrt /var/lib/mysql/sample/my_sample_table.ibd
-rw-rw---- 1 mysql mysql 12582912 Aug  6 22:26 /var/lib/mysql/sample/my_sample_table.ibd

MySQL 8.0.17
[root@host1234 deepak.dhage]# ls -lrt /var/lib/mysql/sample/my_sample_table.ibd
-rw-r----- 1 mysql mysql 21361590272 Aug  6 22:55 /var/lib/mysql/sample/my_sample_table.ibd
[9 Aug 2019 13:33] MySQL Verification Team
Hi,

I have not used load_file(), but instead repeat('a',1000000) and repeat('b',4000000) ........

After 200 UPDATE's the size was 14680064 bytes.

After 400 UPDATE's the size was 14680064 bytes.

I then used load_file() and it was the same.

I used mysql-8.0.17.

Can't repeat ......
[11 Aug 2019 16:58] Deepak Dhage
Uploading the shell script I am using to test.

Attachment: mysql.sh (text/x-sh), 476 bytes.

[11 Aug 2019 17:02] Deepak Dhage
Thanks for looking into it, Sinisa!
I tried with repeat() function as well and I could see the same behaviour which you mentioned, but when I tried with load_file() function, I could again see ibd size growing continuously. I have uploaded the shell script which I am using to test. Can you give it a try? I will also upload the files which I am uploading into blob column.
[11 Aug 2019 17:04] Deepak Dhage
Attaching the files I am uploading to blob column using load_file()

Attachment: LoadTheseFiles.zip (application/zip, text), 32.90 KiB.

[11 Aug 2019 17:10] Deepak Dhage
Just to give you idea around data we are storing in blob column in production where we see this behaviour, it's a bloom filter in binary format and the query we are running is INSERT with ON DUPLICATE UPDATE. Every time bloom filter data is updated, size of ibd file grows.
[11 Aug 2019 17:26] MySQL Verification Team
didn't bother trying to get load_file working.  but if there is any long-running repeatable read transaction holding up the purge thread, ibd file will continue to increase
[12 Aug 2019 12:31] MySQL Verification Team
Hi,

No need to upload anything.

I have tried load_file() function and I do not get file size increase.

This can be caused by several factors, none of which is a bug. First of all, you might have upgraded to 8.0 without full dump / restore. Or you could have done that, but failed to change the row_format of InnoDB.

Another possibility is a known phenomena, that  if there is any long-running repeatable read transaction holding up the purge thread, ibd file will continue to increase.

This is all expected behaviour.

Hence, no need to upload anything.
[12 Aug 2019 14:35] Deepak Dhage
Hi Sinisa,

I appreciate you spending time in investigating this. You are one step close to find the issue. Kindly investigate further or escalate it.

The server I am using to test is a stand alone 8.0.17 instance. There is no production traffic, no other reads or writes which can interfere with transactions I am making.

After your last comment yesterday, I tried with repeat function, as well as creating random string of 1MB and 4MB every time uploading the data and I could see the results similar to your observation. IBD file doesn't grow indefinitely. But very strangely I could see it happen with files I have used. So I thought, it could be related to data I am uploading and hence I have sent you the files to test. 
For the random string test, file grows indefinitely on 8.0.13 but doesn't grow in similar way on 8.0.17. So as you said the issue is fixed in 8.0.17 but its still reproducible for a particular data set. I would say the issue still persist or its partially fixed.

If you don't want to use the files I have uploaded, you can try with one file with random string (use the one you have created earlier) and one blank file and try with load_file() you will see the ibd file growing. 

Hoping to hear from you soon.
[12 Aug 2019 14:38] Deepak Dhage
Just to clarify on your comment related to upgrade, Its a fresh 8.0.17 DB setup. I am not uploading or sourcing any dump. I am just creating a sample database and a new table with blob column. That's it. This is my test setup.
[12 Aug 2019 14:57] MySQL Verification Team
this seemed to repeat it for me:

drop table if exists t;
create table `t` (`id` int not null auto_increment, `b` longblob, primary key(`id`)) engine=innodb;
set @b:=repeat('b',4*1024*1024);
insert into t(id,b) values(1,@b);
drop procedure if exists p;
delimiter $
create procedure p(p_num int)
begin
  declare v_i int default 0;
  repeat
    update t set b:=repeat('a',floor(1024*1024*4));
    if v_i % 2 = 0 then update t set b:=''; end if;
    show table status like 't';
    select length(b),v_i from t;
    set v_i:=v_i+1;
  until v_i>p_num end repeat;
end $
delimiter ;
call p(5000);
show table status like 't' \G

# ibd gets big.
[13 Aug 2019 13:25] MySQL Verification Team
Hello Mr. Dhage,

Thank you for a fantastic test case .....

Wow ........ my disk almost overflowed, while show table status showed practically the same size ...

I am increasing the severity for this bug ......

Verified as reported.
[31 Jan 2020 12:51] Erlend Dahl
Fixed in the 8.0.19 release, and here's the changelog entry:
 
Update operations that moved externally stored LOB data to inline storage
failed to mark the old LOB data as purgeable.
[31 Jan 2020 13:12] MySQL Verification Team
Thank you, Erlend.
[27 Apr 2021 7:58] Yoseph Phillips
Please back port this to 5.7, it is still an issue on 5.7.34.
[27 Apr 2021 12:35] MySQL Verification Team
Hi,

We could not repeat this on the latest 5.7.
[28 Apr 2021 2:42] Yoseph Phillips
Please let us know what size you see for the ibd file, before and after the update when doing:

DROP DATABASE IF EXISTS defect;

CREATE DATABASE defect;

USE defect;

CREATE TABLE test_table (
  a INT NOT NULL AUTO_INCREMENT,
  b LONGBLOB NOT NULL,
  PRIMARY KEY (a)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPACT;

INSERT INTO test_table (b) VALUES (repeat(md5(1), 100000));
INSERT INTO test_table (b) VALUES (repeat(md5(2), 100000));
INSERT INTO test_table (b) VALUES (repeat(md5(3), 100000));
INSERT INTO test_table (b) VALUES (repeat(md5(4), 100000));

INSERT INTO test_table (b) SELECT t2.b FROM test_table t1, test_table t2, test_table t3, test_table t4;

-- test_table.ibd is now 823,296 KB

UPDATE test_table SET b = repeat(md5(1), 99999);

-- test_table.ibd is now 1,658,880 KB

In recent versions of 5.7 we are seeing that even though we made the data smaller with an UPDATE statement, the ibd file more than doubled in size.

Please back port this fix to 5.7.
[14 Jun 2022 6:33] huahua xu
I do not think it is an issue.

due to the atomicity of transactions and mvcc mechanism, the innodb is unable to immediately free the space for storing old values and has to expand the space to store new values.

If updating one row per transaction, the size of ibd file will not grow.
[14 Jun 2022 12:16] MySQL Verification Team
That is approximately correct. The file would still grow, but at much smaller rate .....