Bug #47623 Suspected memory/temporary file leak
Submitted: 24 Sep 2009 16:20 Modified: 21 Apr 2010 6:45
Reporter: Anna Wiejak Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:5.0.44, 5.0.85 OS:Linux (Centos)
Assigned to: CPU Architecture:Any
Tags: Binlog_cache_disk_use, binlog_cache_size, blobs, Created_tmp_files, innodb, memory leak

[24 Sep 2009 16:20] Anna Wiejak
Description:
A problem description:

* a memory leak in mysql database server (heavily loaded, innodb engine, lots of concurrent transactions using blobs of variable length)

Details:

* memory usage on a dedicated mysql server gradually increases. Virtual memory size of mysqld process is growing above expected values, gigabytes more then calculated from mysqld memory settings.
* number of files opened by mysqld increases with time, the same as disk usage reported by df command. This is cleared by a mysqld restart.
* lots of files kept opened by mysqld are temporary files:

[CUT]
mysqld  30467 mysql  178u   REG                8,1    3445673    443307 /tmp/MLseJHou (deleted)
mysqld  30467 mysql  179u   REG                8,1    3447149    443303 /tmp/MLjeDdlv (deleted)
[CUT]

* the size of biggest temporary files is very simmilar

* the sum of the file sizes calculated from an lsof output is roughly the size of missing memory.

* from initial investigation it is suspected that big blob transactions are the culprit. The binlog_cache_size is left default (32KB), so it is assumed any transaction bigger then that would end up in a temporary file in tmpdir. 

* the Created_tmp_files increases with time, it seems in line with the number of deleted /tmp/* files reported by lsof (the value is _not_ the same). The Binlog_cache_disk_use increases with time as well.

How to repeat:

We do not have neither a way of reproducing the problem yet nor a workaround.

Suggested fix:
The problem is still being investigated.
[24 Sep 2009 16:33] Valeriy Kravchuk
Thank you for the problem report. Please, send two sets of results of the following commands:

show global status;
show innodb status\G

with enough time between sets to have a notable memory leak. Send your my.cnf file content also and the results of:

free

Linux command.
[28 Sep 2009 17:32] Anna Wiejak
Hi

I managed to isolate the leak in a test database.
The leak is associated with transactions running big blobs updates and it confirms my earlier observations related to "hidden" files created in the process.

I have already attached the requested details. Would you need anything else?
 
I will still work on this problem, but there are some time constraints and I am not able to sacrifice 100% of my time. Also I thought that what can take me many hours to find, would probably be much easier for people who understand the code (and mysql memory management issues) already. 

Thanks!

Anna
[28 Sep 2009 20:09] Anna Wiejak
I think the answer to this "memory leak" is here in the documentation:

"Within an uncommitted transaction, all updates (UPDATE, DELETE, or INSERT) that change transactional tables such as BDB or InnoDB  tables are cached until a COMMIT  statement is received by the server. At that point, mysqld writes the entire transaction to the binary log before the COMMIT is executed. When the thread that handles the transaction starts, it allocates a buffer of binlog_cache_size to buffer statements. If a statement is bigger than this, the thread opens a temporary file to store the transaction. The temporary file is deleted when the thread ends."

The problem is, that when persistent connections to the database are used, the threads and temporary files never get deleted. This can easily lead to a difficult to troubleshoot problem - especially when the blob sizes vary and the 
maximum connection pool is high.

Regards,

Anna
[29 Sep 2009 12:55] Anna Wiejak
Hi

I can workaround a problem now, but I still claim it's a bug. Can someone please look into it and confirm or tell me how wrong I am?

My test case updating a single blob (3MB) once per each thread (2000 threads), shows that even when a thread is taken down, the file is released (as in the documentation, quoted above), but memory is not freed.

What I don't understand is why mysql seems to cache whole temporary file in memory forever. Is this expected?? 

Regards,

Anna
[1 Oct 2009 16:14] Valeriy Kravchuk
Let me make one blind guess... I see you have table_cache=4200. Can you, please, try to make it smaller, say, 200, and check if you'll see the same memory leak with 2000 threads?

I feel this is a duplicate of bug #38002.
[2 Oct 2009 12:44] Anna Wiejak
It's not the same bug I am afraid.

1) Lowering the value of table_cache even to 10 ;) doesn't help 
2) I've tried to flush tables even before raising a bug here - this doesn't help to free memory as well
[16 Oct 2009 10:39] Andrii Nikitin
Anna,

What happens if you repeat the same test multiple times? This may be a leak if memory is continuously growing after each execution of the same test.

If memory grew up after first test and is reused in future steps (i.e. isn't growing much anymore)-> this is not a leak, just probably "too high memory usage".

If you experience "too high memory usage" with multiple open tables and blobs fields in 5.0 this is definitely bug #38002 .

Also FLUSH TABLES; command should release memory, but please note that you may not be shown lower memory usage by process immediately after memory is freed.
[20 Nov 2009 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[20 Nov 2009 0:49] MySQL Verification Team
Still needs feedback.
[21 Dec 2009 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[21 Apr 2010 6:45] Sveta Smirnova
Could repeat it finally.

But according to test results this looks like duplicate of bug #15089. So closing it as such.