Bug #56358 | My cluster used so many swap even if they have enough memory. | ||
---|---|---|---|
Submitted: | 30 Aug 2010 9:40 | Modified: | 2 May 2012 11:53 |
Reporter: | Sean Lee | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S5 (Performance) |
Version: | 5.1.44-ndb-7.1.4b-cluster-gpl | OS: | Linux (ubuntu 8.04 64bit 2.6.24-24-server edition) |
Assigned to: | CPU Architecture: | Any | |
Tags: | Memory, swap, SwapCached |
[30 Aug 2010 9:40]
Sean Lee
[30 Aug 2010 10:12]
Gustaf Thorslund
Sean,

> 8171 root 20 0 6953m 6.7g=6256 S 2 85.9 279:09.76 ndbmtd

So your ndbmtd is using almost 7G. Your config.ini would be needed to explain why that's the case. If you have other applications running on the same host, those could have caused the swapping. From the line above, it doesn't appear to be ndbmtd that got swapped out (and that's good).

/Gustaf
[30 Aug 2010 10:40]
Sean Lee
Attached the ndb_error_reporter data file bug-data-56358.tar.gz
[30 Aug 2010 18:07]
Daniel Smythe
Looking through the configuration of this cluster, it appears that ndbmtd is using an appropriate amount of memory.

Using the memory calculation found here: http://forums.mysql.com/read.php?25,382163,382218#msg-382218

And the defaults for missing values here: http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-params-ndbd.html

I come up with the following:

    DataMemory = 4800 M
    IndexMemory = 600 M
    BackupDataBufferSize = 16 M
    BackupLogBufferSize = 4 M
    DiskPageBufferMemory = 384 M
    SharedGlobalMemory = 384 M
    (MaxNoOfConcurrentIndexOperations + MaxNoOfConcurrentOperations + MaxNoOfConcurrentTransactions + MaxNoOfOrderedIndexes + MaxNoOfTables + MaxNoOfUniqueHashIndexes) * 1k
    ( 8000 + 100000 + 4096 + 2048 + 4096 + 512 ) * 1k = 118 M
    RedoBuffer = 48 M
    TotalSendBufferMemory = 256 K
    UndoDataBuffer = 16 M
    UndoIndexBuffer = 2 M

    Total = 6.372 G

This appears to be very close to the current memory usage of ndbmtd. I would recommend monitoring the memory usage to be sure it's not growing, but so far this does not appear to be a bug.

Also, you may want to look into LockPagesInMainMemory: http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-ndbd-definition.html#ndbparam-ndbd-lo... But it appears you are already using it.
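As a rough cross-check of the arithmetic above, here is a minimal Python sketch that sums the listed values. The dictionary and function names are hypothetical, the figures are copied from the comment, and the sketch follows the comment's own rounding (1k taken as 1000 bytes and the record total truncated to 118 M). It is not an official NDB memory formula.

```python
# Fixed-size buffers from the comment above, in MB.
FIXED_MB = {
    "DataMemory": 4800,
    "IndexMemory": 600,
    "BackupDataBufferSize": 16,
    "BackupLogBufferSize": 4,
    "DiskPageBufferMemory": 384,
    "SharedGlobalMemory": 384,
    "RedoBuffer": 48,
    "TotalSendBufferMemory": 0.25,  # 256 K
    "UndoDataBuffer": 16,
    "UndoIndexBuffer": 2,
}

# Record counts that the comment multiplies by roughly 1k each.
RECORD_COUNTS = {
    "MaxNoOfConcurrentIndexOperations": 8000,
    "MaxNoOfConcurrentOperations": 100000,
    "MaxNoOfConcurrentTransactions": 4096,
    "MaxNoOfOrderedIndexes": 2048,
    "MaxNoOfTables": 4096,
    "MaxNoOfUniqueHashIndexes": 512,
}


def estimate_total_mb():
    """Return the estimated ndbmtd memory footprint in MB."""
    # Assumption: 1k = 1000 bytes, truncated to whole MB (118752 -> 118),
    # which matches the "118 M" figure in the comment.
    records_mb = sum(RECORD_COUNTS.values()) // 1000
    return sum(FIXED_MB.values()) + records_mb


if __name__ == "__main__":
    total = estimate_total_mb()
    print(f"Estimated total: {total} MB ({total / 1000:.3f} G)")
```

The result can be compared with the RSS that top or ps reports for ndbmtd; a resident size that keeps growing well past this estimate would be the kind of evidence of a leak the comment asks for.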
[31 Aug 2010 8:34]
Alex ldp
We have the same problem, and it has not been resolved so far. The key point is that the total memory used by all applications has not reached the size of physical memory, so why are these pages swapped out? They are kept both in physical memory and in the swap file.
[31 Aug 2010 8:51]
Sean Lee
Hi,

Thanks for your reply. The memory usage and LockPagesInMainMemory=1 sound right for this environment. That's OK.

But my point is: why is so much swap space used? As you know, the "SwapCached" value in /proc/meminfo means:

> Memory that once was swapped out, is swapped back in but still also is in the swapfile (if memory is needed it doesn't need to be swapped out AGAIN because it is already in the swapfile. This saves I/O)

The archive is here: http://lwn.net/Articles/28345/

So "SwapCached: 7421452 kB" means that almost 7G of memory pages are in both memory and swap space. And ndbmtd is the only process on this server that uses that much memory. The "ps aux | sort -k6,6nr | head -n 5" output is:

    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    root 28847 3.6 85.9 7055164 7039940 ? SLl Aug24 366:40 /usr/local/mysql/bin//ndbmtd --ndb-nodeid=2 --ndb-connectstring=192.168.12.110:1186
    mysql 4877 2.6 4.8 643056 398692 ? SLl Aug06 955:41 /usr/local//mysql/bin//mysqld --defaults-file=/usr/local/mysql/etc/my.cnf
    root 5003 0.0 0.1 66812 13628 ? SLs Aug06 11:36 heartbeat: master control process
    nobody 5053 0.0 0.0 60208 7024 ? SL Aug06 1:16 heartbeat: write: bcast eth1
    nobody 5054 0.0 0.0 60208 7024 ? SL Aug06 1:16 heartbeat: read: bcast eth1

Only the ndbmtd process uses a large amount of memory; the second-largest process uses only approx 390M. So can we conclude that the ndbmtd process alone caused the swapping? Also, when I shut down an NDB node in the cluster, the host's used swap drops to almost 0; after I start the node again, it slowly increases. Is that proof that ndbmtd causes this?

So what confuses me is: why is there swapping at all, and why is approx 7G used? That's abnormal even with LockPagesInMainMemory set to 1, isn't it?
[31 Aug 2010 19:56]
Daniel Smythe
Even though the host may be swapping, LockPagesInMainMemory is doing as it should and locking ndbmtd in memory. Having a large amount of memory locked will cause the OS to have to potentially swap other things that wouldn't normally have been swapped. I'd recommend looking into OS level options for swap/memory management and tuning, or how the OS manages its memory. Marking this as not a bug.
[1 Sep 2010 7:50]
Sean Lee
Hi Daniel,

Thanks for your explanation, but it confused me more. You said:

> LockPagesInMainMemory is doing as it should and locking ndbmtd in memory. Having a large amount of memory locked will cause the OS to have to potentially swap other things that wouldn't normally have been swapped.

As you know, Linux cannot tell us which process is using swap or how much it is using; the nswap and cnswap fields in /proc/pid/stat are not maintained and are always zero. So I can only deduce it from other information, such as the SwapCached value I mentioned, which is very large and approximately equal to the ndbmtd memory usage.

I cannot accept the guess that other processes cause the swapping, because the other processes do not have enough virtual memory address space to hold that many pages in both memory and swap space.

I have studied some OS-level options for swap/memory management and tuning, and how the OS manages its memory, but I can't find any clues that explain the issue.

So I maintain that this may be a bug in MySQL Cluster, unless you can explain it clearly to me.

Thanks again for your patience.
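A note for readers on newer kernels: Linux 2.6.34 and later report a per-process VmSwap line in /proc/&lt;pid&gt;/status, which the 2.6.24 kernel in this report does not have. A minimal Python sketch, assuming such a kernel; the function names are hypothetical:

```python
from pathlib import Path
from typing import Optional


def vmswap_kb(status_text: str) -> Optional[int]:
    """Extract the VmSwap value (in kB) from the text of /proc/<pid>/status.

    Returns None when the kernel does not report VmSwap, as is the case
    on kernels older than 2.6.34.
    """
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            # The line looks like "VmSwap:      128 kB".
            return int(line.split()[1])
    return None


def process_swap_kb(pid: int) -> Optional[int]:
    """Read /proc/<pid>/status and return the swapped-out size in kB."""
    return vmswap_kb(Path(f"/proc/{pid}/status").read_text())
```

Calling `process_swap_kb` with the PID of ndbmtd would show directly whether that process has pages swapped out, rather than inferring it from SwapCached.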
[13 Sep 2010 19:37]
MySQL Verification Team
This is still "not a bug". Unless you are able to show that VSZ for the ndb(mt)d process is growing over time, I don't see any evidence that there is a memory leak involved.

The SwapCached value shows the amount of memory that has been moved out of physical memory and back in, but whose pages Linux leaves cached in swap. This space is left in swap so that, in the event that those pages need to be swapped back out to disk, Linux can simply free the physical memory pages without having to perform the costly disk IO of writing them back to swap. Having a very large SwapCached value means that most of swap is in fact now available, but one or more processes had been swapped out at some point in the past. Your /proc/meminfo indicates only about 28M of "active" swap usage.

There are two possible explanations for your previous swap usage:

It is possible that some other process's memory got swapped out to disk, either during ndbd startup or at some later point, and ndbd never touched swap.

-or-

Since you are using the option LockPagesInMainMemory = 1, ndbd will allocate all the memory it is going to use up front, before it is locked to physical memory. It is possible that some ndbd pages got swapped to disk during startup and then moved back to memory when the pages were locked. These could remain in SwapCached but are untouched after startup completes. Using LockPagesInMainMemory = 2 would require ndbd to lock itself to physical memory before performing any allocations, and would prevent it from having any swap usage at any point in the process lifecycle.
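The distinction drawn above between swap that is in use and swap that only backs pages already read back into memory can be checked from /proc/meminfo. The following Python sketch is one plausible way to do so; the function names are hypothetical, and treating "SwapTotal - SwapFree - SwapCached" as the "active" figure is an assumption about how that number was derived, not something the comment states.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo text into {field name: numeric value}.

    Most fields are reported in kB; the unit suffix is ignored.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def swap_summary(meminfo: Dict[str, int]) -> Dict[str, int]:
    """Split swap usage into swap-cached and not-cached parts (kB)."""
    used = meminfo["SwapTotal"] - meminfo["SwapFree"]
    cached = meminfo["SwapCached"]
    return {
        "used_kb": used,
        "swap_cached_kb": cached,
        # Assumption: swap slots not covered by the swap cache hold
        # pages that currently live only on disk.
        "used_not_cached_kb": used - cached,
    }


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(swap_summary(parse_meminfo(f.read())))
```

If `used_not_cached_kb` stays small while `swap_cached_kb` is large, that is consistent with the explanation that pages were swapped out once in the past and have since been read back in.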
[16 Sep 2010 9:22]
Sean Lee
Hi Matthew,

Thank you very much for your reply. But I still think this is a "performance" bug, sorry. If I could prove there is a memory leak in ndb(mt)d, that would be a critical bug, don't you think?

Thanks for explaining the "SwapCached" term. I realize you would be right if many processes each swapped out a few pages and then exited; that could also produce a very large SwapCached value, because the OS reclaims swap lazily.

But I checked your guesses in my environment with some tests. First I tried LockPagesInMainMemory = 2. Swap usage was zero at the start, then slowly increased to approx 5G (not 7G) and stopped growing. Assuming that some other process was causing the swapping, I stopped mysqld on one of the two hosts and ran swapoff and then swapon so that swap usage was zero again. The swap usage then grew again. Next I decreased DataMemory from 4800M to 4000M to free more memory; swap usage still increased, to approx 2G, and then stopped growing.

The host is a dedicated server for MySQL Cluster only. Apart from mysqld and ndbmtd, the other processes are trivial daemons such as webmin, heartbeat and sshd, which are common and necessary on a Linux server. After I stopped mysqld, ndb(mt)d was the only non-trivial process, so I don't think the other processes caused such large swap usage, even with the additional 800M of memory freed for the test.

Can anybody explain this weird issue?
[27 Sep 2010 9:00]
Gustaf Thorslund
Hi Sean,

If you think this is a bug, could you please provide a test case showing how to reproduce it and what impact it has on performance?

From what I can tell so far, this looks like a bug in your way of looking at SwapCached. Matthew has already made a serious attempt at working around this little bug. If you have further concerns about this, I would suggest you seek advice in forums, mailing lists, IRC, or open a support issue.

/Gustaf
[29 Sep 2010 14:20]
Sean Lee
Hi Gustaf,

Thanks for your reply. Maybe I misunderstood what SwapCached means. OK, that's all right.

As you know, I can't reproduce the issue on my test servers. The differences between the test and production servers are RAID, CPU and OS version: the production servers have a 4-core CPU, RAID 1+0 and Ubuntu 8.04, and the test servers have a 2-core CPU, no RAID and Ubuntu 10.04. I still have no idea what makes the server swap so much.

As you asked, I tested the impact of swap on performance. I ran sysbench like this:

    sysbench --mysql-user=root --test=oltp --mysql-host=host --mysql-password=pass --oltp-test-mode=complex --mysql-table-engine=ndbcluster --oltp-table-size=20000000 --mysql-db=ndb --num-threads=100 --max-requests=0 --max-time=60 --mysql-create-options="TABLESPACE ts_1 STORAGE DISK DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci" run

First I ran it 10 times on the normal production servers with swappiness = 60 (default). During these tests swap showed approx 5.0G used, even though I had already run "swapoff -a ; swapon -a". Then I set swappiness = 0, ran swapoff and swapon again, and ran sysbench 10 times again. During these tests only 152K of swap was used. Finally, I reduced DataMemory and IndexMemory to approx half their current values and ran the benchmark 10 more times. During these tests no swap was used.

I took the read/write requests per sec value as the sample and averaged it over the 10 runs. The results are:

    swappiness 60 (default) with the current size: 706.794 read/write requests per sec.
    swappiness 0 with the current size: 793.553 read/write requests per sec.
    swappiness 60 (default) with the half size: 788.341 read/write requests per sec.

Does this show that swap may reduce performance by approx 10-15%?

Gustaf, thank you and Matthew for your work on this little bug. I don't think I am a biased man. It's just that the issue confuses me so much; maybe somebody can give me an explanation or some help. Of course, you have the right to decide whether or not the issue is a bug. I am just doing what I can to seek help and provide information to make MySQL better.

One more question: if setting swappiness = 0 reduces the swap used (from 5G to 152k), does that mean ndb(mt)d tries to swap that aggressively? Or does LockPagesInMainMemory=2 not work for me with the default swappiness (60)?

Thanks again for your patience.
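The averages reported above can be turned into relative differences with a short Python sketch. The dictionary keys and the function name are hypothetical; the throughput figures are copied from the comment.

```python
# Average read/write requests per second, copied from the comment above.
RESULTS = {
    "swappiness=60, current size": 706.794,
    "swappiness=0, current size": 793.553,
    "swappiness=60, half size": 788.341,
}


def slowdown_percent(baseline: float, measured: float) -> float:
    """Percentage by which `measured` throughput falls below `baseline`."""
    return (baseline - measured) / baseline * 100.0


if __name__ == "__main__":
    # The run that showed approx 5.0G of swap in use.
    swapping = RESULTS["swappiness=60, current size"]
    for label in ("swappiness=0, current size", "swappiness=60, half size"):
        pct = slowdown_percent(RESULTS[label], swapping)
        print(f"vs {label}: {pct:.2f}% slower with swap in use")
```

With these figures, both comparisons come out at roughly 10 to 11 percent, which is at the lower end of the 10-15% estimate in the comment.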
[29 Sep 2010 14:21]
Sean Lee
sysbench output on swappiness = 0
Attachment: no_swap.out (application/octet-stream, text), 15.38 KiB.
[29 Sep 2010 14:23]
Sean Lee
sysbench output on swappiness is default 60
Attachment: has_swap.out (application/octet-stream, text), 15.38 KiB.