Bug #96466 .ibd file grows indefinitely for tables with blob columns
Submitted: 8 Aug 2019 10:30 Modified: 31 Jan 12:51
Reporter: Deepak Dhage Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:8.0.17 OS:CentOS
Assigned to: CPU Architecture:Any
Tags: ibd, memory leak, Tablespace

[8 Aug 2019 10:30] Deepak Dhage
Description:
If a table has a column with blob type (blob, mediumblob, longblob), and we keep updating a row with different values (having different size) for blob column, the memory allocated for old data is never released. With each update, size of ibd file keeps growing.

How to repeat:
Settings - 
innodb_file_per_table=ON

Setup - 

CREATE DATABASE sample;

USE sample;

CREATE TABLE `my_sample_table` (`id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(128) NOT NULL, `some_data` mediumblob, `update_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`id`), UNIQUE KEY `name` (`name`)) ENGINE=InnoDB DEFAULT CHARSET=utf8

INSERT INTO my_sample_table (id, name, some_data) VALUES (1,'data1', load_file('/var/lib/mysql-files/file2.dat'));

-- Run below queries from shell in for loop (for say, 100 times or more to see memory growing)
-- for starts (i = 1 to 100)
UPDATE my_sample_table SET some_data = load_file('/var/lib/mysql-files/file1.dat') WHERE id = 1;
UPDATE my_sample_table SET some_data = load_file('/var/lib/mysql-files/file2.dat') WHERE id = 1;
-- for ends

File1.dat  and file2.dat should be of considerably different sizes (say, for example, file1 is of size 1MB and file2, 10MB).

Keep a check on this file for size - /var/lib/mysql/sample/my_sample_table.ibd

Size of this ibd file keeps growing on MySQL 8.x version as we keep running update queries.

The MySQL versions used are - 8.0.13 and 8.0.17.
Behaviour on old version - On MySQL 5.6.44, the size of ibd file remains constant after running above queries 3 to 4 times. No matter how many times we run queries again, the size of ibd file never grows beyond this size.

Suggested fix:
OPTIMIZE TABLE query does release unused space but we can't keep doing this on production. There is no fix identified so far.
[8 Aug 2019 12:34] Sinisa Milivojevic
Hi Mr. Dhage,

Thank you for your bug report.

Your report is quite unclear whether you report too much memory being used or too much disk space being used.  I will assume the latter.

What you are experiencing is exactly how InnoDB storage engine is designed. In choosing between the speed and disk space usage minimisation, the creators opted for the speed. Growing of the .ibd files , or any other InnoDB tablespaces is explained in our Reference Manual, chapter on the InnoDB storage engine.

That is one of the reason why OPTIMIZE TABLE has been added on .... Whether you want to use it or not, it is your choice.

Not a bug.
[8 Aug 2019 12:58] Sinisa Milivojevic
Hi,

Actually, I would like to test your report.

Can you provide me with those two files. You can use "Files" tab.

There were two internal bugs with similar reports that were fixed in 8.0.16.
[8 Aug 2019 15:07] Deepak Dhage
Thanks Sinisa for your prompt response and my sincere apologies, if I was not clear enough while reporting the bug.
To answer your first question, its disk space which is of concern here. The size of ibd file (/var/lib/mysql/sample/my_sample_table.ibd) is eating up too much space. 
We are in process of upgrading our database to version 8 from old version 5.6. So our master node will run on MySQL 5.6 and slave will run on 8 for some time. We will gradually upgrade all the nodes.
We are getting disk space full alert every month on slave running on MySQL 8 whereas we don't get this on it master node which runs MySQL 5.6. The ibd file is growing a lot for tables having columns with type mediumblob or mediumtext.

I do agree that OPTIMIZE TABLE should be run to defrag the storage and claim the unused disk space but running this on production every month for selected tables is a real concern. Moreover this did not happen on master node running MySQL 5.6 where this table exists for more than 2 years now.

Regarding files I have used in my analysis, those were just any random files I have picked from my test server. You can pick any files from your machine with text data so that you can select and see if those are updated in DB properly. With file of 1 MB and 4 MB, ibd file grows till 12 to 13 MB and stops growing further if I keep updating the blob column with same files again (tried 5000 iterations) whereas on MySQL 8 node for same test files size grown to 21 GB for 5000 iterations. This doesn't look like intentionally designed for optimising for speed over disk space.

I hope, this will help you investigate this further. Let me know, if you need any additional information, I would be happy to provide.
[8 Aug 2019 15:13] Deepak Dhage
Here are the details of ibd files after running above test -

MySQL 5.6 - 
bash-4.2# ls -lrt /var/lib/mysql/sample/my_sample_table.ibd
-rw-rw---- 1 mysql mysql 12582912 Aug  6 22:26 /var/lib/mysql/sample/my_sample_table.ibd

MySQL 8.0.17
[root@host1234 deepak.dhage]# ls -lrt /var/lib/mysql/sample/my_sample_table.ibd
-rw-r----- 1 mysql mysql 21361590272 Aug  6 22:55 /var/lib/mysql/sample/my_sample_table.ibd
[9 Aug 2019 13:33] Sinisa Milivojevic
Hi,

I have not used load_file(), but instead repeat('a',1000000) and repeat('b',4000000) ........

After 200 UPDATE's the size was 14680064 bytes.

After 400 UPDATE's the size was 14680064 bytes.

I then used load_file() and it was the same.

I used mysql-8.0.17.

Can't repeat ......
[11 Aug 2019 16:58] Deepak Dhage
Uploading the shell script I am using to test.

Attachment: mysql.sh (text/x-sh), 476 bytes.

[11 Aug 2019 17:02] Deepak Dhage
Thanks for looking into it, Sinisa!
I tried with repeat() function as well and I could see the same behaviour which you mentioned, but when I tried with load_file() function, I could again see ibd size growing continuously. I have uploaded the shell script which I am using to test. Can you give it a try? I will also upload the files which I am uploading into blob column.
[11 Aug 2019 17:04] Deepak Dhage
Attaching the files I am uploading to blob column using load_file()

Attachment: LoadTheseFiles.zip (application/zip, text), 32.90 KiB.

[11 Aug 2019 17:10] Deepak Dhage
Just to give you idea around data we are storing in blob column in production where we see this behaviour, it's a bloom filter in binary format and the query we are running is INSERT with ON DUPLICATE UPDATE. Every time bloom filter data is updated, size of ibd file grows.
[12 Aug 2019 12:31] Sinisa Milivojevic
Hi,

No need to upload anything.

I have tried load_file() function and I do not get file size increase.

This can be caused by several factors, none of which is a bug. First of all, you might have upgraded to 8.0 without full dump / restore. Or you could have done that, but failed to change the row_format of InnoDB.

Another possibility is a known phenomena, that  if there is any long-running repeatable read transaction holding up the purge thread, ibd file will continue to increase.

This is all expected behaviour.

Hence, no need to upload anything.
[12 Aug 2019 14:35] Deepak Dhage
Hi Sinisa,

I appreciate you spending time in investigating this. You are one step close to find the issue. Kindly investigate further or escalate it.

The server I am using to test is a stand alone 8.0.17 instance. There is no production traffic, no other reads or writes which can interfere with transactions I am making.

After your last comment yesterday, I tried with repeat function, as well as creating random string of 1MB and 4MB every time uploading the data and I could see the results similar to your observation. IBD file doesn't grow indefinitely. But very strangely I could see it happen with files I have used. So I thought, it could be related to data I am uploading and hence I have sent you the files to test. 
For the random string test, file grows indefinitely on 8.0.13 but doesn't grow in similar way on 8.0.17. So as you said the issue is fixed in 8.0.17 but its still reproducible for a particular data set. I would say the issue still persist or its partially fixed.

If you don't want to use the files I have uploaded, you can try with one file with random string (use the one you have created earlier) and one blank file and try with load_file() you will see the ibd file growing. 

Hoping to hear from you soon.
[12 Aug 2019 14:38] Deepak Dhage
Just to clarify on your comment related to upgrade, Its a fresh 8.0.17 DB setup. I am not uploading or sourcing any dump. I am just creating a sample database and a new table with blob column. That's it. This is my test setup.
[13 Aug 2019 13:25] Sinisa Milivojevic
Hello Mr. Dhage,

Thank you for a fantastic test case .....

Wow ........ my disk almost overflowed, while show table status showed practically the same size ...

I am increasing the severity for this bug ......

Verified as reported.
[31 Jan 12:51] Erlend Dahl
Fixed in the 8.0.19 release, and here's the changelog entry:
 
Update operations that moved externally stored LOB data to inline storage
failed to mark the old LOB data as purgeable.
[31 Jan 13:12] Sinisa Milivojevic
Thank you, Erlend.