Bug #13578 | The longer mysqld runs, the slower everything is | ||
---|---|---|---|
Submitted: | 28 Sep 2005 17:10 | Modified: | 4 Sep 2006 7:08 |
Reporter: | Shawn Hogan | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Server | Severity: | S2 (Serious) |
Version: | 5.0.21, 4.1.18 | OS: | MacOS (Mac OS X Server 10.4.4) |
Assigned to: | CPU Architecture: | Any |
[28 Sep 2005 17:10]
Shawn Hogan
[28 Sep 2005 17:11]
Shawn Hogan
4 days of CPU usage (arrows show where mysqld process was restarted)
Attachment: mysql_cpu_usage.gif (image/gif, text), 72.62 KiB.
[28 Sep 2005 17:12]
Shawn Hogan
Output of SHOW STATUS and SHOW VARIABLES immediately before a needed restart
Attachment: mysql_probs.txt (text/plain), 29.08 KiB.
[28 Sep 2005 17:14]
Shawn Hogan
Also, more than happy to give any MySQL developer a shell account to the server for anything.
[6 Oct 2005 14:25]
Valeriy Kravchuk
Thank you for a problem report. I've read the forum threads you mentioned. It was interesting. Did you noted that at the second one a real problem (cartesian join) was pointed out? So, please, give us more information on how to repeat this situation. The results of SHOW PROCESSLIST commands (may be, several of them) just when you have this hang and just after the restart may help, for the beginning. You may ask developers to login remotely and try to solve you problem, but this is an option of our Gold support, as far as I remember.
[6 Oct 2005 15:16]
Shawn Hogan
It's not so much a hang of the copying to a tmp table (although that happens too). Rather every query and every query type get slower as time goes on (for example, normally it takes 0.01 seconds to truncate a HEAP table. If the server runs for 24 hours, it can take 10 seconds, and just gets slower the longer it runs). For example, I can take the mysql server off the network (no mysql clients making any queries) after it's been running for awhile and every type of query is slow with that query being the only query going on. The only way to solve it is to restart the mysqld process (not the server). But I will attach a couple odd looking processlists I've randomly run across trying solve this issue.
[6 Oct 2005 15:18]
Shawn Hogan
Shows almost every mysqld thread hung on opening/closing files
Attachment: processlist1.txt (text/plain), 154.10 KiB.
[6 Oct 2005 15:19]
Shawn Hogan
All query threads with a NULL state
Attachment: processlist2.txt (text/plain), 148.22 KiB.
[6 Oct 2005 15:21]
Shawn Hogan
No state on any query thread except for one that's doing an INSERT into a tiny HEAP table
Attachment: processlist3.txt (text/plain), 23.46 KiB.
[11 Oct 2005 5:06]
Shawn Hogan
I was able to pinpoint the problem (at least partially). It's the query_cache. My query cache was set to 256MB, and I noticed that all databases would hang whenever the query cache needed to be purged of a certain table's data (on an update of a table). With 256MB allocated for the query cache, sometimes the garbage collection of the query cache would be as "fast" as 10 seconds (see proccesslist3.txt) and other times it would take more than 60 seconds (see processlist1.txt). The smaller I set the query_cache_size, the better the overall performance (currently I have it running at 2MB). Performance still isn't where it is when mysqld is first started, but at least performance is good enough that it doesn't HAVE to be restarted. I'm thinking of dropping the key_buffer_size from 256MB to 2MB also and see if I get the rest of the missing performance. Either way, at the very least it definitely seems like a problem with the query_cache (when queries are flushed out) on Mac OS X Server.
[12 Oct 2005 13:20]
Valeriy Kravchuk
Thank you for additional information. Large query_cache will not help you in case of really defferent queries or queries with some functions like I saw in the processlists you posted. See http://dev.mysql.com/doc/mysql/en/query-cache-how.html for details. So, you may even get better performance switching it off at all. (It takes time to manage it and to look it through, just to not found anything useful. Alhtough, your query chache related status variables does not looked to bad for me...) Please, try and inform about the results. key_buffersize is a different story, so I do not think it should be decreased. I changed severity, because your server can really work without restarts now.
[13 Nov 2005 0:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[13 Dec 2005 18:31]
Shawn Hogan
I ended up setting the server up to text message my cell phone whenever mysqld starts spinning out of control (which these days is about 36 hours between mysqld restarts). After watching it closely for the last month, I think I've finally figure out what the cause is. As time goes on, mysqld's memory get so segmented that queries take longer and longer as it has to start accessing thousands of different memory segments for each query. This is what top is showing currently for the process (restarted <5 hours ago): PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE 25879 mysqld 16.5% 80:45.94 >> 329 1097 202M+ 2.33M 195M+ 376M+ So there are more than 100 threads (not sure how many), 329 Mach ports and 1,097 different memory segments. When I had to restart mysqld early this morning, there were more than 100 threads (not sure how to see how many when it goes over the allowed columns), more than 1,000 Mach ports and 9,600 memory segments (this was after about 20 hours of running). The machine is a fully loaded top of the line (as of this writing) Xserve with 5GB of RAM and no other processes running (beyond normal system processes). It's not virtual memory/disk thrashing or anything (mysqld is using ~200MB) and the machine is showing more than 4GB of inactive physical memory. If this is something that can't be fixed (because it's a core operating system issue), it would be good to know, that way we can stop using Mac OS X as a MySQL server (it's terribly impractical to have to restart your db server daemon every 24-48 hours.
[17 Dec 2005 14:22]
Valeriy Kravchuk
As it may be an OS issue, can you, please, try to upgrade your server to Tiger (Mac OS X 10.4.3)? Also, please, try 4.1.16 available for some time already.
[11 Jan 2006 19:07]
Shawn Hogan
I've upgraded it to OS X Server 10.4.4 and MySQL 4.1.16, and I'm still seeing the same problem. For the time being I've written a script that checks the slow queries every 15 minutes and restarts the mysqld process once it's more than 500 slow queries (with slow query time of 5 seconds). I never get a single slow query for the first 3-4 hours of mysqld running, then it starts snowballing and gets slower and slower (the auto-restart usually happens around 24 hours of uptime).
[15 Feb 2006 10:55]
Valeriy Kravchuk
Please, try to use newer version, 4.1.18 (4.1.16 had some problem with UPDATE and DELETE optimization that can lead to wastage of resources) and inform about the results. Your my.cnf may be useful also - we can try to allocate more memory to MySQL and check for any difference. Do you have any queries with large [NOT] IN conditions (hundrends or thousands of values to check against)?
[17 Feb 2006 19:28]
Shawn Hogan
4.1.18 seems to be no different... still requiring that the mysqld process be shutdown about every 24 hours as it's performance degrades due to more and more unique memory segments it's using. For the most part, the queries running on the DB server are very simple (1 record returned, no joins, hitting index properly [no table scans], etc.) As far as memory allocation goes in my.cnf, the more memory I take AWAY from things, the better it runs (I guess it doesn't need to allocate and deallocate as much memory so it doesn't get fragmented as quickly).
[19 Feb 2006 9:48]
Valeriy Kravchuk
Anyway, may I take a look at your my.cnf? Do you use query cache?
[20 Feb 2006 1:37]
Shawn Hogan
When I first realized it's a memory fragmentation issue, I ended up lowering the query cache down to 4MB (from 256MB). The more memory allocated for things in the my.cnf file, the quicker the memory gets fragmented to the point mysqld is unusable. Also, 99.5% of the queries have no gain from the query cache, so those ones all use the SQL_NO_CACHE directive in the SELECT statements. [mysqld] skip-external-locking log-slow-queries long_query_time=5 max_heap_table_size=16M ft_min_word_len=3 max_connections = 1000 key_buffer_size = 256M myisam_sort_buffer_size = 128M join_buffer_size = 262144 read_buffer_size = 262144 sort_buffer_size = 2M table_cache = 1800 thread_cache_size = 512 connect_timeout = 10 max_tmp_tables = 64 tmp_table_size = 32M max_allowed_packet = 4M read_rnd_buffer_size = 524288 max_connect_errors = 10 thread_concurrency = 4 query_cache_limit = 2M query_cache_size = 4M query_cache_type = 1 query_prealloc_size = 16384 query_alloc_block_size = 16384
[25 Mar 2006 23:19]
Shawn Hogan
Here's something I ran across randomly, and sounds like maybe the same problem? http://www.linuxjournal.com/article/7451 <i>Of note, Memcached uses a slab allocator for memory allocation. Early versions of Memcached used the malloc from glibc and ended up falling on their faces after about a week, eating up a lot of CPU space due to address space fragmentation. A slab allocator allocates only large chunks of memory, slicing them up into little chunks for particular classes of items, then maintaining freelists for each class whenever an object is freed.</i>
[12 May 2006 6:11]
Valeriy Kravchuk
Yes, it can be the same problem, but as glibc is used by large MySQL instances on other platfroms (Linux) and we had not identified exactly same problem there (yet), I am not sure. As new version, 4.1.19, released, I suggest you to try it (64-bit version!, http://dev.mysql.com/get/Downloads/MySQL-4.1/mysql-max-4.1.19-apple-darwin8.2.1-powerpc-64...). Please, inform about differences (if any).
[12 May 2006 6:17]
Shawn Hogan
I've actually upgraded to 5.0.21 and have been experiencing the same problems (the mysqld process needs to be restarted every 4 or 5 hours). I'm currently thinking about putting together a workaround that basically runs 2 copies of mysqld and alternating restarting each process every hour or so (that way one process is always available, although on a different port). Not a "good" solution, but at this point I have no choice but to restart the mysqld process multiple times per day (via a monitoring script).
[12 May 2006 6:39]
Valeriy Kravchuk
Have you tried with 64-bit version of 5.0.21?
[12 May 2006 6:48]
Shawn Hogan
Yep, that's what I'm using.
[4 Aug 2006 7:08]
Valeriy Kravchuk
Please, try to use a newer version, 5.0.24, just released, and inform about the results.
[22 Aug 2006 21:55]
Phil Tobias
We are experiencing the same symptom on Solaris 10 OS running 4-AMD 64bit processors with 10GB real memory. MySQL version is 5.0.18 64 bit. The server is dedicated to MySQL and performs data warehouse/data mart queries. Performance is excellent immediately after a reboot, but deteriorates rapidly. "Flush tables" command restores performance to almost normal, but within 1-2 hours it is crawling again. Queries that run in 10 seconds after reboot run for 10-15 minutes when performance degrades. The system has plenty of RAM and is not swapping. Iostat shows some activity, but generally the disks are not very busy. All tables are MyISAM, query_cache_type=2 (Demand) to bypass query cache unless requested because queries are 99+% unique. Comment added to this bug report because of note on May 12th relative to memory allocation and lack of evidence of this problem on other platforms.
[4 Sep 2006 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".