Bug #914 | MyISAM INSERT blocking SELECT | ||
---|---|---|---|
Submitted: | 24 Jul 2003 17:32 | Modified: | 23 Jan 2004 19:35 |
Reporter: | Paul Coldrey | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: MyISAM storage engine | Severity: | S2 (Serious) |
Version: | 4.0.14 | OS: | Linux (RedHat 9.0) |
Assigned to: | Alexander Keremidarski | CPU Architecture: | Any |
[24 Jul 2003 17:32]
Paul Coldrey
[24 Jul 2003 20:08]
Byron Kidd
I've been following the thread on bug #712 closely as I am also almost ready to take the plunge and upgrade from 3.23 to reap the rewards of improved full text searching and replication. Investigation of bug #712 (and now #914) has stalled my enthusiasm. Using the test case provided by Paul Coldrey on 4.0.14 under Windows XP Professional (SP1) I execute the SELECT (mysqlbug3.sql) while the INSERT (mysqlbug2.sql) is running at which point the SELECT blocks and no results are returned. Once the INSERT finally does complete the SELECT returns its results in an instant. Unlike the results reported in the post above I never actually see the SELECT status as 'locked' in the process table. Infact after numerous attempts I haven't actually seen the SELECT in the process table at all!! The only process information I receive is : +-----+------+----------------+-------+---------+------+--------------+--------- -------------------------------------------------------------------------------- -------------+ | Id | User | Host | db | Command | Time | State | Info | +-----+------+----------------+-------+---------+------+--------------+--------- -------------------------------------------------------------------------------- -------------+ | 52 | ODBC | localhost:1571 | NULL | Query | 0 | NULL | show pro cesslist | | 124 | ODBC | localhost:1699 | bugme | Query | 17 | Sending data | INSERT I NTO MenuPermission(SequenceId, MenuId, PeopleId) SELECT 1, Menu.Id, GroupMemberLookup.Perso | +-----+------+----------------+-------+---------+------+--------------+--------- -------------------------------------------------------------------------------- -------------+ Although I have yet to test my own applications under 4.0.*, I have been able to reproduce the bug given the test procedure provided.
[25 Jul 2003 3:12]
Tobias Lind
Be aware that it's normal behavior for INSERTS to block SELECTS with MyISAM-tables. Please run OPTIMIZE and do not delete any rows before running a test-case to show this bug. If SELECTS are still blocked, then we have an issue here, and it's serious. I think many users have optimized their application to benefit from non-blocking INSERTS in a table without holes. In my application I always mark rows in a huge table instead of deleting them right away. Later (about once a week) I run a script that removes all marked rows at once and runs OPTIMIZE... /Tobias Lind
[25 Jul 2003 5:46]
Sergei Golubchik
I am not sure I understand "How to repeat" section. There're three files in the archive:mysqlbug1.sql mysqlbug2.sql mysqlbug3.sql. Am I supposed to load data with mysql < mysqlbug1.sql and then do insert with mysql < mysqlbug2.sql and while insert is working try to select with mysql < mysqlbug3.sql ? If yes, then I wasn't able to repeat the bug - mysqlbug3 was happily executed concurrently with mysqlbug2.
[25 Jul 2003 7:17]
Alexander Keremidarski
Paul, First of all thank you for your efforts and especially for good test case. However it was not good enoug to expose problem. I had to modify it before I was able to repeat "Locked" state. With your test saw locked SELECT, but it happens once in a while, not everytime. Interestingly your test case can be used for demonstration how Query cache works. Your SELECT query is perfect candindate for caching. First thing is to make sure QCache is off otherwise your SELECT will be retrieved from Qcache at about 99.99% of time. This by the way can explain Byron Kidd's observation. Cached queries return result immediately so yiou can;'t see them in SHOW PROCESSLIST output. I ran your INSERT in a loop and at another terminal I ran SELECT in a loop while true; do mysql BugMe < mysqlbug2.sql; done while true; do time mysql BugMe < mysqlbug3.sql; done All times were within 0.02 - 0.5 seconds range and no Locked in SHOW PROCESSLIST Then I tried to delete most of rows in MenuPermission without stoping above loops and I found that *sometimes* there are peaks in select time up to 3 seconds. Once (just once) I saw select running for 2.5 minutes! and yes at this moment in SHOW PROCESLIST I was able to see select is Locked. Once! All of the above is with Query Cache turned Off and single-session Select. With Query Cacne on select times are within 0.01 - 0.04 seconds range. At next step I modified your test case in order to satsify: "Once the new permissions have been fully inserted the previous permissions (with a lower SequenceId) are deleted." As result SELECT query is Locked at the moment of DELETE which is normal. For subsequent INSERTs SELECT works as before. However repeating above procedure (INSERT then DELETE) leads to strange situation when it looks like there is Stalled lock hich remains after DELETE unbtil subsequent INSERT runs. I have 2 SELECTing loops at thsi point and as per SHOW PROCESSLIST I can see states like "Locked", "System lock" and "Waiting for table". SELECT time now varies betwen 0.024 sec and 6min 51.219sec ! So this is something to be investigated further. Just for the record I tested with debug mysqld built from bk tree (post-4.0.14) I'm running RedHat 9 too.
[4 Aug 2003 6:10]
Paul Coldrey
As a quick update for anyone who may be working on this.... it appears the problem may be related to RedHat 9.0. Firstly I moved all the tables that might suffer from concurrency issues to InnoDB and this fixed some problems but the speed was still dismal. I then installed an old machine with RedHat 7.3 and did some tests with both 4.0.14 and 3.23.57. Performance is similar for both versions. Despite the machine being low-spec the performance was significantly better than my high-end server running 4.0.14 on RedHat 9 (40-50 seconds compared with 7-8 minutes!). Also of interest is that the speed is still poor even when only one process is accessing the database. I hope to tie down a neat test case in the next couple of days.
[4 Aug 2003 23:17]
Lenz Grimmer
Thanks a lot for your investigation! Yes, we too noticed degraded performance when using the Red Hat 9 kernel - it seems like its swapping behaviour is suboptimal. Make sure to use "vmstat" to monitor the memory situation. Do you run the latest Red Hat errata kernel? They seem to be working on this (notice the bug IDs below): # rpm -q --changelog -p kernel-2.4.20-19.9.src.rpm | head warning: kernel-2.4.20-19.9.src.rpm: V3 DSA signature: NOKEY, key ID db42a60e * Wed Jul 16 2003 Arjan van de Ven <arjanv@redhat.com> - add proc and execve fixes - update ide drivers to fix critical bugs * Sat Jul 12 2003 Rik van Riel <riel@redhat.com> - upgrade to latest -rmap to fix #89226, #90668, etc.
[7 Aug 2003 20:34]
Paul Coldrey
Just tried it with kernel 2.4.20-19.9 (I think that's the latest and greatest) and there was no change - still seriously slow. I don't want to be a pain but is there a plan on how to fix this - do you know if it's a RedHat problem or whether it's common to all 2.4.20 kernels? Either way I'd guess that it effects a fair portion of your users. It's now nearly 7 weeks since I reported the bug (and about 9 weeks since I first stumbled across it) and I'm starting to get a lot of grief from people higher up the chain..... I need to tell the something!
[7 Aug 2003 21:27]
Paul Coldrey
A little more... I decided to take as many variables out the equation as possible so that I could generate a simple test case. Hence, I had PHP dump all the SQL commands it was running, to a file, so that I could run the scripts from a command line and show some combinations of MySQL/RedHat taking 30 seconds and others taking 5+ minutes. Sadly, (at least in the interests of getting this fixed!) from the command-line I couldn't replicate the problem! Hence, it's only slow when run from PHP 4.3.2 on RedHat 9 with any MySQL 4.0 or 4.1..... BUT!!!!!! If I compile PHP 4.3.2 with MySQL 4.0.14 on RedHat 7.3 it runs a treat. Hence, perhaps the finger should be pointed at a specific version of GLIBC????
[8 Aug 2003 1:07]
Lenz Grimmer
Hi Paul, thanks for providing us more details. Sorry, but I don't understand "it runs a treat" - does this mean it works as expected, or is it still slow? Can you try to catpture the output of "vmstat 2", while the slowness occurs? It would be interesting to see, if the kernel is busy shuffling memory around. What about the output of "top" - do you see that mysqld is hogging the CPU, when it takes so long to return the query results? The glibc C library on Red Hat 9 _might_ be a factor here. Do you use our statically linked Linux binary, or did you roll your own? I could not find this info in this thread. If it's a dynamically linked binary, does using "export LD_ASSUME_KERNEL=2.2.5" before starting the applications (mysql/PHP) make a difference? This variable forces the application to use the "old" LinuxThreads implementation instead of NPTL from glibc 2.3.2.
[10 Aug 2003 22:53]
Paul Coldrey
Hi Lenz, "Works like a treat" is Australian for "worked correctly" - hence I meant that the slowness was not seen in this case... unfortuantely we speak with so many colloquialisms that I don't always succeed in editing them all out :-) I am running the standard RPM versions of 4.0.14. I have compiled PHP using both the "in-built" MySQL support and using the installed libraries (--with-mysql and --with-mysql=/usr) and both perform the same way. I have included at the bottom of this post the output from vmstat... it seems pretty normal - there are a couple of bursts of swapping but they only last a couple of seconds. The output from top also seemed OK.. the general stats looked like: Size: 28824 RSS 14M SHARE 2123 STAT D CPU 2.5% Mem 1.4% CPU 1 At the point marked in the vmstat output this changed to numbers more like: Size 29000 RSS 16M Share 2288 CPU 30% Mem 1.6% And CPU stayed between 15% and 35% for the rest of the execution. Some other testing I tried: 1. As suggested, I set "export LD_ASSUME_KERNEL=2.2.5" before starting MySQL... still slow. When I tried to restart Apache/PHP with this set it wouldn't start and I got "(38) Function not implemented: Couldn't create accept lock" in the Apache error log. 2. I upgraded my 7.3 test machine to 9.0, ran the test suite and surprisingly enough it ran at full speed! It appears to be running the same kernel version and the same glibc version (2.4.20-8 and 2.3.2). The server that is causing troubles was using an SMP kernel so I changed this to single processor and the problem still persisted. I then uploaded the full database to the machine that had been running 7.3 (including all the tables not required by the code that runs slowly) and the execution time went up to 4 minutes. 3. On a whim, I ran the test on a Windows XP machine and I gave up waiting for it to finish at 10 minutes. The CPU usage hovered between 2% and 8% and there was 200Mb of free physical memory and yet the disk thrashed. After 10 minutes I stopped the browser page loading and the browser closed normally but the disk continued to thrash for at least 5 minutes and the mysqld-nt process kept using arounf 3% CPU. I opened up the my.ini file in the c:\WINDOWS directory to see if there were any rediculous settings which might make it overly disk intensive and found a whole section that I was meant to uncomment if using InnoDB (which I am). I also found a note at the top that said to make the setting effective I had to copy it to c:\my.cnf, so I'm guessing that it was running with the default configuration. This isn't the intended behaviour running InnoDB with default settings,... right?? Hence the summary seems to be: * InnoDB and MyISAM servers both have degraded performance * Our problem doesn't appear to be due to OS virtual memory issues (since there is almost no paging) * database size appears to have an effect even when the tables in the query do no change size. * This may be OS dependant but since it occurs on RedHat 9.0 and Windows XP, the OS probably isn't at the crux of the problem. * On two RedHat 9.0 machine my low spec machine out-performs my high spec machine which may suggest a problem with software-RAID code (or something similar)..... but this doesn't explain the XP performance * Perhaps the myriad of seemingly contradictary evidence suggests that there is more that one bug? I've now spend a few full days trying to help track down this bug and I cannot afford to spend any more... I quipped about moving to PostgreSQL earlier in this whole saga but now that is being examined seriously as a contingency plan. Honestly, I don't want to do this, but the pressure is mounting.... can do you guys give me any idea when you planning to look at this problem. I, personally, would like to wait until you have fixed the bug that you have found and see if it solves my problems, however, if this is not going to happen in the next fortnight then I may be forced to follow one of our contingency plans. procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 0 0 85852 18136 189516 721220 0 1 8 9 67 33 2 1 8 0 0 0 85852 18136 189516 721220 0 0 0 0 121 21 0 0 100 0 1 1 85852 17736 189680 721340 38 0 98 428 329 776 2 1 97 0 1 1 85852 17268 189976 721440 40 0 90 792 502 1429 2 2 96 0 1 1 85852 16864 190300 721452 32 0 38 888 579 1104 2 4 94 0 1 1 85852 16500 190596 721460 30 0 34 826 512 1168 1 3 96 0 1 1 85852 16088 190976 721476 8 0 16 792 493 1046 2 1 97 0 1 1 85852 16476 191284 721660 86 0 90 1030 564 1245 1 3 96 1 1 1 85852 16116 191680 721624 0 0 6 824 498 1193 1 2 97 0 1 1 85852 15792 192044 721580 2 0 6 820 503 1081 1 1 98 0 1 1 85852 15488 192340 721592 2 0 8 898 540 1133 2 2 96 0 1 1 85852 15400 192592 721600 0 0 4 782 485 1000 1 1 97 0 1 0 85852 14856 192984 721652 0 0 6 980 543 1099 1 3 96 0 1 1 85852 15632 193484 721660 0 0 4 840 510 1126 1 2 97 0 1 1 85852 15220 193888 721668 0 0 4 818 504 1201 1 2 97 0 1 1 85852 14740 194356 721680 0 0 6 886 529 1268 1 2 97 0 1 1 85852 14332 194752 721692 0 0 6 780 485 1006 1 3 95 0 1 1 85852 15100 195172 721740 16 0 20 1014 531 1107 2 1 97 0 1 1 85852 14608 195648 721752 0 0 6 860 520 1235 0 1 99 0 1 1 85852 14244 196004 721760 0 0 4 812 499 1168 2 2 96 0 1 1 85852 14700 196268 721772 2 4 8 836 506 1370 1 1 98 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 1 1 1 85852 14428 196532 721780 0 0 4 822 510 1235 1 2 97 0 1 1 85852 14132 196780 721824 0 0 6 984 518 1387 1 1 97 0 1 1 85852 13904 196988 721844 0 4 4 816 498 1159 1 2 97 0 1 1 85852 14752 197180 721944 0 0 4 828 506 1099 1 2 97 0 1 1 85852 14236 197384 721960 0 0 6 844 521 1329 2 1 97 0 1 1 85852 13872 197580 722040 0 4 4 814 498 1221 1 1 97 0 2 1 85852 13904 197828 722104 0 4 32 878 507 1188 3 3 94 0 1 2 85852 13352 198104 722136 0 4 6 998 528 1240 1 2 97 2 0 0 85852 13204 198312 722160 0 4 4 778 487 1102 2 3 95 0 1 1 85852 13076 198460 722180 0 4 4 834 519 1096 1 2 97 0 1 1 85852 12888 198548 722280 0 4 6 832 506 1165 1 1 97 0 1 1 85852 12780 198728 722288 0 0 4 846 518 1111 1 2 97 0 1 1 85852 12640 198832 722332 16 0 22 1062 529 1167 2 2 96 0 1 1 85852 12484 198964 722352 0 4 6 808 506 1022 1 1 97 0 1 1 85852 12344 199144 722344 0 4 4 888 534 1152 1 1 97 0 1 2 85852 12348 199144 722340 0 0 6 840 521 1223 2 1 98 0 1 1 85852 12320 199172 722336 0 0 4 806 493 1162 1 1 97 0 1 1 85852 12220 199332 722268 0 0 6 1060 552 1251 2 1 97 1 0 1 85852 12156 199556 722296 0 4 4 848 512 1143 1 1 98 0 1 1 85852 11900 199752 722368 0 4 6 866 521 1129 2 2 96 0 1 1 85852 12060 199964 722376 0 0 4 814 505 1056 2 10 88 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 1 1 85852 11848 200176 722388 0 0 6 828 508 1467 2 2 96 0 1 1 85852 11552 200364 722500 0 4 4 1014 555 1143 1 2 96 0 1 2 85852 11816 200584 722520 0 0 6 864 523 1087 1 2 97 0 1 0 85852 11660 200740 722520 0 0 4 752 501 941 0 2 98 0 1 1 85852 11532 200872 722508 0 0 4 822 508 1023 1 1 98 0 1 1 85852 11532 200872 722508 0 0 6 798 500 1374 2 2 97 0 1 1 85852 11532 200872 722392 0 0 4 1024 514 1211 0 2 98 0 1 1 85852 11460 200980 722388 0 0 6 832 512 1216 1 1 98 0 1 1 85852 11456 201064 722320 0 0 4 808 484 1164 2 2 96 0 1 1 85852 11288 201220 722328 0 0 4 886 536 1231 1 1 97 0 1 1 85852 11480 201296 722336 0 0 6 820 514 1083 1 1 97 1 1 1 85852 11432 201344 722180 24 0 28 1162 555 1212 1 2 97 0 1 1 85852 11308 201464 722192 0 0 6 780 486 1022 1 3 96 0 1 1 85852 11192 201572 722200 0 0 4 884 534 1157 2 2 96 0 1 1 85852 11180 201572 722212 0 0 6 850 527 1123 0 2 98 0 1 1 85852 11152 201592 722220 0 0 4 818 501 1023 2 1 97 0 1 1 85852 11132 201592 722260 0 0 4 1018 538 1150 2 2 96 0 1 1 85852 11120 201592 722268 2 0 6 816 504 1028 1 1 98 0 1 2 85852 11040 201664 722280 0 0 6 822 509 1066 1 2 97 0 1 1 85852 11032 201664 722288 0 0 4 822 504 1261 0 1 99 0 1 1 85852 11232 201748 722284 0 0 6 832 501 1061 1 2 97 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 1 1 85852 11144 201836 722280 0 0 6 994 517 1085 1 1 98 0 1 1 85852 11144 201836 722272 0 0 4 896 535 1220 0 2 98 0 1 1 85852 11144 201836 722272 0 0 6 828 513 1154 1 1 97 0 1 1 85852 11144 201836 722268 0 0 4 826 506 1107 3 3 94 1 1 2 85852 11144 201836 722268 0 0 6 862 526 1101 2 2 96 0 1 1 85852 11116 201836 722108 0 0 4 1084 516 1099 1 2 97 0 1 1 85852 11116 201836 722116 0 0 4 834 526 1047 0 1 99 0 1 1 85852 11116 201836 722128 0 0 6 868 528 1105 1 1 99 0 1 1 85852 11116 201836 722140 0 0 6 778 492 1045 1 2 98 0 1 1 85852 11116 201836 722152 0 0 6 868 526 1133 2 2 96 0 1 1 85852 11084 201868 722196 0 0 6 1038 527 1131 1 1 98 1 0 1 85852 11084 201868 722204 0 0 4 774 483 1077 1 1 98 0 1 0 85852 11084 201868 722216 0 0 6 810 500 1052 1 6 92 0 1 1 85852 11084 201868 722224 0 0 4 776 482 1016 2 2 96 0 1 1 85852 11084 201868 722232 0 0 4 824 498 1017 2 1 97 0 1 1 85852 11084 201868 722284 0 0 26 924 494 1026 1 2 96 0 1 1 85852 11080 201868 722296 0 0 6 846 521 1135 2 4 94 0 1 2 85852 11068 201876 722304 0 0 4 806 492 1052 2 1 97 0 1 1 85852 11064 201876 722316 0 0 6 888 540 1184 2 2 96 0 1 1 85852 11064 201876 722324 0 0 4 848 516 1294 2 1 97 0 1 2 85852 11040 201876 722400 0 0 6 976 516 1275 2 2 96 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 1 1 1 85852 11036 201876 722408 0 0 4 878 532 1120 1 1 98 0 1 1 85852 11020 201876 722416 0 0 4 794 486 1044 2 3 95 0 1 1 85852 11008 201876 722428 0 0 6 816 498 1194 1 1 97 0 1 1 85852 10968 201876 722468 0 0 20 852 522 1101 2 1 97 0 2 1 85852 10952 201876 722480 0 0 6 964 511 1453 2 1 97 0 1 1 85852 10792 201876 722584 0 0 4 914 504 1366 1 2 96 0 1 1 85852 10744 201924 722592 0 0 4 836 514 1062 0 1 98 2 0 0 85852 10736 201932 722600 0 0 4 814 500 1065 2 4 94 0 1 1 85852 10728 201932 722612 0 0 6 862 530 1106 0 1 98 0 1 1 85852 10724 201932 722624 0 0 6 810 504 1087 1 2 96 0 1 1 85852 10716 201932 722632 0 0 4 986 526 1050 1 2 97 0 1 1 85852 10716 201932 722640 0 0 4 836 514 1059 1 1 98 0 1 1 85852 10712 201932 722652 0 0 6 830 519 1071 1 1 97 0 1 1 85852 10912 201932 722656 0 0 4 820 512 1082 1 1 97 0 1 1 85852 10912 201932 722652 0 0 4 804 495 1077 1 2 97 0 1 1 85852 10872 201932 722440 0 0 6 1064 512 1308 1 2 97 0 1 1 85852 10872 201932 722448 0 0 4 842 522 1078 1 3 96 0 1 1 85852 10732 202072 722460 0 0 6 842 510 1270 1 2 97 0 1 1 85852 10708 202096 722468 0 0 4 890 534 1183 2 1 97 0 1 1 85852 10172 202096 722496 0 0 22 822 509 1075 2 1 97 0 1 2 85852 10172 202096 722496 0 0 4 1064 538 1107 1 3 96 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 1 1 1 85852 10144 202124 722496 0 0 4 854 522 1557 1 2 97 0 1 1 85852 10048 202228 722492 0 0 6 850 516 1239 1 3 96 0 1 1 85852 9996 202280 722492 0 0 4 852 519 1177 1 0 98 0 1 1 85852 10156 202324 722488 0 0 4 822 498 1120 1 2 97 0 1 1 85852 9988 202328 722512 0 0 8 980 527 1106 1 1 97 0 1 1 85852 9988 202328 722520 0 0 4 884 531 1157 1 2 97 1 0 1 85852 9988 202328 722532 0 0 6 816 511 1074 1 2 97 0 1 1 85852 9988 202328 722540 0 0 4 834 509 1098 1 1 98 0 1 1 85852 9988 202328 722552 0 0 6 890 546 1234 2 2 96 0 1 1 85852 9988 202328 722592 0 0 4 962 499 1180 1 3 96 1 1 1 85852 9988 202328 722600 0 0 4 878 532 1113 1 2 96 0 1 1 85852 9988 202328 722612 0 0 6 816 503 1144 1 2 97 0 1 1 85852 9984 202328 722624 0 0 6 850 519 1190 1 1 98 0 1 1 85852 9984 202328 722632 0 0 4 904 544 1179 2 2 96 1 0 1 85852 9984 202328 722748 0 0 4 980 506 1113 0 1 98 0 1 0 85852 9984 202328 722744 0 0 6 814 501 1078 1 1 97 0 1 1 85852 9984 202328 722744 0 0 4 872 534 1279 1 2 97 **************************************************** -> Jump in mysqld stats in 'top' occured here **************************************************** 1 1 1 85852 9756 202328 722820 88 0 134 776 477 1208 6 2 93 2 0 0 85852 9008 202328 722664 26 0 168 590 173 190 45 1 54 2 0 0 85852 9220 202328 720904 156 0 520 90 201 403 65 1 33 1 0 0 85852 9776 202300 720140 562 4 758 4050 573 1466 48 5 47 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 1 1 1 85852 9576 202308 720332 0 0 484 3472 346 989 36 4 60 1 0 0 85852 9780 202312 720444 4 0 278 4280 501 1069 29 6 66 0 1 4 85852 10488 202320 719956 0 0 84 7228 669 1484 26 7 66 1 0 0 85852 9996 202320 720340 32 0 76 4136 429 847 31 6 63 1 0 0 85852 9484 202320 720660 136 0 162 4600 541 1484 29 6 65 1 0 0 85852 9480 202320 720664 0 0 78 6250 595 1276 30 4 66 0 1 1 85852 10136 202320 722036 0 0 142 8380 717 1504 22 7 71 0 1 1 85852 9832 202320 722520 0 0 80 7046 684 936 22 7 71 0 2 1 85852 9660 202320 722624 0 0 20 6838 913 2413 13 6 80 0 1 2 85852 9808 202320 722740 0 0 92 9664 1014 2703 13 10 76 0 1 2 85852 9856 202320 722800 0 0 12 9644 954 2706 13 8 79 1 0 0 85852 9820 202320 722800 0 0 16 7648 967 2723 12 6 82 0 2 1 85852 9776 202320 722848 0 0 0 10556 1133 2943 9 9 83 0 1 1 85852 9696 202320 722928 0 0 12 10482 1115 3088 9 11 80 1 0 0 85852 9664 202320 723172 0 0 168 8902 1058 2998 9 9 82 0 1 1 85852 9608 202320 723228 0 0 26 9546 1035 2913 12 10 78 0 1 2 85852 9648 202320 723088 0 0 0 8008 977 1660 15 6 79 0 1 1 85852 9768 202320 722888 0 0 24 9012 869 1537 12 14 74 0 1 1 85852 9768 202320 723008 0 0 4 10470 1105 2496 9 11 80 0 1 1 85852 9660 202320 723144 0 0 12 8968 1017 2628 10 9 81 0 1 2 85852 9628 202320 723144 0 0 0 10312 1107 2530 9 10 81 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 1 2 85852 9608 202320 723184 0 0 28 8882 1022 2196 12 8 80 2 0 0 85852 9868 202320 723008 0 0 0 9120 989 2659 12 7 81 1 0 0 85852 9748 202320 723264 0 0 64 8546 900 1735 11 8 82 0 1 2 85852 9720 202340 723156 0 0 8 10342 961 2100 11 11 79 1 0 1 85852 9648 202340 723248 0 0 24 10216 1072 2297 8 10 83 1 1 2 85852 9644 202340 723328 0 0 8 9360 1091 2398 9 9 83 0 1 1 85852 9644 202340 723320 0 0 0 10364 1082 1785 8 13 79 0 1 1 85852 9676 202340 723208 0 0 4 10348 1074 2722 9 8 83 0 1 1 85852 9612 202340 723504 0 0 44 8740 981 2061 12 9 79 0 1 1 85852 9736 202340 723288 0 0 44 9688 970 2613 11 10 79 0 2 1 85852 9728 202340 723320 0 0 0 9546 1049 3063 9 9 82 0 1 3 85852 10012 202340 723156 0 0 20 10420 969 2422 10 10 79 0 1 1 85852 9880 202340 723328 0 0 38 8844 960 2457 8 9 83 0 1 1 85852 9744 202340 723464 0 0 12 8902 1121 3192 8 10 81 1 0 0 85852 9696 202340 723508 0 0 20 8580 1000 2818 13 9 78 0 1 2 85852 9596 202340 723328 0 0 50 10276 1020 2377 15 9 76 0 1 2 85852 9668 202340 723036 0 0 0 8934 1034 2306 10 7 83 0 1 1 85852 9660 202340 723060 0 0 12 8904 1038 2614 8 9 83 1 0 1 85852 9668 202340 723060 0 0 0 10376 1181 3440 9 10 81 1 0 0 85852 9664 202340 723108 0 0 0 9796 1009 2672 8 11 81 1 1 1 85852 9756 202328 722820 88 0 134 776 477 1208 6 2 93 0 1 1 85852 9664 202340 723124 0 0 0 10418 1169 3808 9 9 82 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 1 1 85852 9588 202340 723364 0 0 10 8022 945 1934 16 8 76 1 0 0 85852 9588 202340 723276 0 0 8 8948 923 1834 13 9 78 0 1 1 85852 9600 202340 723076 0 0 32 10178 904 2108 12 12 76 0 1 1 85852 9684 202340 723068 0 0 8 8818 1092 2136 9 8 83 0 1 1 85852 9700 202340 723132 0 0 12 10384 1111 3423 10 10 80 0 1 1 85852 9592 202340 723132 0 0 8 8196 946 2404 12 9 79 0 1 2 85852 9596 202340 722900 0 0 32 9444 1030 2739 9 8 83 0 1 2 85852 9608 202340 722724 0 0 0 10518 1079 2320 10 11 79 1 0 0 85852 9608 202340 722820 0 0 0 8798 1123 2883 9 8 84 0 1 1 85852 9608 202340 722884 0 0 8 11196 1106 3253 10 16 74 1 0 0 85852 9608 202340 722980 0 0 8 9590 1072 2282 8 14 78 0 1 1 85852 9608 202340 723068 0 0 4 10448 1144 2012 9 9 81 0 1 2 85852 9604 202340 723124 0 0 12 10410 1050 2593 8 11 82 0 1 1 85852 9828 202340 722984 0 0 0 9458 1056 2845 9 9 81 1 0 1 85852 9808 202340 723064 0 0 0 8282 1017 2730 11 8 80 0 1 1 85852 9660 202340 723048 0 0 8 8184 844 2003 18 8 74 2 0 0 85852 9592 202352 723004 238 0 514 2824 522 1158 36 8 55 2 1 2 84540 9988 202360 722428 52 0 270 3180 619 1843 50 9 42 0 1 2 84540 10080 202376 721740 0 0 332 5510 690 1619 28 7 65 1 0 0 84540 10080 202376 721740 0 0 0 3566 541 1367 29 4 67 0 1 1 84540 10096 202376 721724 0 0 0 8724 987 3108 12 9 79 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 1 1 84540 10096 202376 721724 0 0 0 7820 960 2736 12 7 81 1 0 0 84540 10100 202376 721724 0 0 0 7758 975 2757 13 8 79 0 1 2 84540 10100 202376 721724 0 0 0 8556 851 2026 17 8 74 0 1 1 84540 10100 202376 721724 0 0 0 5450 765 1950 20 4 76 0 1 1 84540 10112 202376 721724 0 0 0 6922 870 2438 19 7 75 1 0 0 84540 10112 202376 721724 0 0 0 5230 701 2023 24 5 71 0 1 1 84540 10116 202376 721724 0 0 0 5294 686 1933 25 5 70 0 1 2 84540 10116 202376 721724 0 0 0 5048 572 1734 29 6 65 0 1 1 84540 10116 202376 721724 0 0 0 4674 617 1548 27 4 69 1 0 0 84540 10120 202376 721724 0 0 0 4420 677 1818 25 6 69 1 0 0 84540 10120 202376 721724 0 0 0 3656 500 1309 31 4 65 1 0 0 84540 10124 202376 721724 0 0 0 3748 473 1246 32 4 64 1 0 0 84540 10124 202376 721724 0 0 0 1970 323 286 41 1 58
[11 Aug 2003 3:52]
Alexander Keremidarski
Reading carefully your comments conclusion is that you found that MySQL 4.0.14 works slower on RedHat 9 than on RedHat 7.3 and also 4.0.14 is slower on Windows XP (compared to what?) Is that true?
[12 Aug 2003 13:39]
Sergei Golubchik
salle until you finish these investigations let this bug be assigned to you, ok ?
[27 Aug 2003 7:53]
Michael Widenius
I don't have yet a solution to this problem, but some ideas that may help us find the problem. First some background: - If your application worked good in 3.23 it should work at least as good in the same environment in 4.0. - The way MyISAM does locking has not changed between 3.23 and 4.0. MyISAM still uses table locking and SELECT and INSERT can't be run at the same time (except in the case where you insert into a table without deleted rows). Several SELECTS can however be run at the same time. This is described in detail in http://www.mysql.com/doc/en/Table_locking.html - You can influence how MyISAM does locking by starting mysqld with: --low-priority-updates (in which case MySQL will prefer selects before inserts) or by using INSERT LOW_PRIORITY ... You can also schedule more SELECTS between inserts by setting the variable max_write_lock_count to a low value (for example 10). - If you need to be able to read and write at the same time, then Innodb tables should be the way to go. If you set up buffers for InnoDB you should get very good speed in most case! (If you don't set up buffers InnoDB tables will work, but very slow). - MyISAM doesn't work as good on windows as on Linux because Windows doesn't cache the disk as good as Linux (which is needed because MyISAM depends on the system to cache rows (not index). This could explain the test results you got on windows. Note that InnoDB doesn't have this problem on Windows and is know to deliver very good performance on this platform (if buffers are set up correctly). - The first insert in a MyISAM table is much faster than future inserts because when you do an insert into an empty tables we create the index in a separate batch. You can get future index faster by increasing the key_buffer_size, myisam_sort_buffer_size and bulk_insert_buffer_size. If you don't specify key_buffer_size at all, then your inserts will be significantly slower as your tables grow. - MySQL 4.0 has more caches and much more parameters to tune than 3.23. (The most important ones are record_rnd_buffer_size, bulk_insert_buffer_size and query_cache_size). If we could get your my.cnf and status variables we could give you tips how to get more speed for your queries. (Note that the problem could be that you have set up a high value for some cache in 3.23 but not for a new cache in 4.0 that does the same job) - If you are using a binary downloaded from www.mysql.com it should not be a glibc bug as our binaries are staticly compiled and thus doesn't use the glibc on RedHat 9. - According to vmstat there was a lot of swap activity when you said that mysqld started to go slow. Usually this indicates some very strange query or that another (maybe sleeping) program started. Here is what you can try next: - Can you post the output from 'mysqladmin var ext' when the slowdown happens? - What options do use to start mysqld ? Please post your my.cnf files here. You can also get this by doing: my_print_defaults mysqld server mysqld_safe safe_mysqld - When you tried with innodb, what settings for the innodb buffers did you use. The default settings are very low and not suitable for heavy load with large tables. Check settings in my-large.cnf for an idea how to set them. You should also try to set innodb_flush_log_at_trx_commit to 0 for even higher speed. - Did you try to run MySQL 3.23 on the new redhat 9.0 machine ? Does this work ok ? - You said that the time went up to 4 minutes. Can you please describe in more detail what you meant ? Can you make a test for this ? - Can you create a better test script that also gets the old 7.3 machine to fail (now when it's upgraded to 9.0) - Try a kernel from Suse 8.2; They have compiled their kernels with totally different options than Redhat. If this works, then we know a lot more. - I talked with the Redhat people on Linuxworld and they told me that their default kernels are optimized for desktop usage. If you want to have a (pre-compiled ?) kernel that is suitable for database usage you MUST buy RedHat advanced server. They didn't tell me exactly what was changed except that at least the memory system was significantly different in advanced server. I wonder if they also changed how the file system was cached (which could explain a lot of the things you are seeing) Could you try if upgrading to RedHat advanced server would help ? - Have you tested converting your tables to InnoDB (after setting up the innodb buffers to match your system). If yes, what changed and what performance numbers did you get ? The main reason for trying innodb is that innodb has it's own disk cache system and is not dependent on the system to cache data. If you are mainly using innodb tables you should set up the innodb buffers to use 80 % of the free memory you have on your machine or enough memory to cache a big part of your tables. - Try moving swap and /tmp to another hardisk. - Check that you have the same dma settings on both machines on your hard disks. (you can do this with hdparm) - Check that you don't have any program that uses a lot of disk (like updatedb) when the slowdown happens! Some ideas of where the problem could be: - Bug in kernel. - Kernel optimized for 'workstation' which is not suitable for live data processing with uses lots of memory. - Some hardware problem on your 'faster' machine. - Some cache option in MySQL 4.0 that is not optimal - Some bug in MySQL 4.0 (but not that likely as MySQL 4.0 works ok your other machine) - Other program accessing disk or taking up resources. We are doing our best to try to repeat this problem. To do this we would however need a way to reproduce this. I am now going to look at the problem Alexander Keremidarski reported in the bug report and try to reproduce this. I will also try today to run your tests script on my Redhat 9.0 machine to check things out, but I am worrying that I may not be able to reproduce this without having access to the startup options you used for MySQL. Another option would be if we could get access to your 'new system' with some scripts that we could use to reproduce this. Other information: According to what I have been informed after writing the above, there is a follow up to this bug in #1027. There has also been some discussion on the bugs@lists.mysql.com email list about this. - The config file post in #1027 doesn't have any buffers setup and explains why MySQL doesn't work on full speed. (It doesn't however explain why the 9.0 machine was much slower) - For not it looks like a bug in Redhat 9.0. We are of course still interested to fix this (any bugs that causes MySQL to performance bad is interesting to us). We would however need a good test case to find out what could be the problem. One thing we could do is to put RedHat 9 on one on our benchmark machine and run our benchmarks on it to see if we can find any notable difference between this and Suse 8.2. I will instruct our benchmark team to do this ASAP. It would be very interesting to know if RedHat Advanced server also have the same problem. Regards, Monty
[28 Aug 2003 16:53]
Paul Coldrey
Hi Monty, Firstly, thanks for all your comments :-) From my point of view this bug could be moved into "cannot repeat" status. However, Alex may have a different opinion?? I do not feel that my test case properly illustrates the issue I am experiencing (sorry). My problems may be related to the issues that Alex saw, but I am not convinced this is the case (and, in fact, I strongly suspect it is not the case). Here are some related comments from an email that I sent directly to Alex when we were discussing the fate of bug 1027... > The innodb_flush_log_at_trx_commit does have an effect on our systems, as > you suggested. However, the server that all this started on, still seems to > have inexplicably bad performance regardless of this setting. Perhaps this > is tied to the issues you have found 914 in some way. > > Also, as you noted, some of the information I provided on 914 is probably > spurious because there were so many effects I was struggling with, that it > was hard to get a clear picture. Some spots where I was confused are: > > 1. I did not realise MyISAM tables blocked SELECTs when INSERTs occur - > previously they had always occured so quickly that I had not been effected > by this behaviour. Hence I reported this as a bug when it was actually just > an artefact of the really poor performance I was experiencing > > 2. To try and overcome the blocking issue I moved the tables that require > bulk inserts in InnoDB. I did not set the innodb_flush_log_at_trx_commit > flag and so I noticed hugely disparate performance between my old and new > server - some of which was clearly due to this flag. > > 3. The test on my Windows XP machine runs correctly now so I think that was > a random Microsoft'ism that occured at the end of a long day :-) Notwithstanding all above notes, there is still definitely a problem with my machine. I have a routine I can run that performs REALLY poorly,.. the problem lies in isolating the key feature of the algorithm which is running slowly, in isolation all the constituent parts seem to run OK. Initially, I did not realise how subtle my problem was as it is quite easy to repeat in the environment in which I found it. Recently, I have been trying to code around the problem and I believe I may have isolated a repeatable test case which shows some odd behaviour (related to my initial problems). However, I believe it is unrelated to the issues uncovered by Alex in this report and so is not relevant to this bug. I have been considering changing the server over to FreeBSD and if I follow this path then I will let you guys know whether my problems still occur. In light of your comments on RedHat, it seems this may be a wise step on any account. Cheers, Paul
[28 Aug 2003 17:07]
Paul Coldrey
my.cnf file: [mysqld] ft_min_word_len=2
[18 Dec 2003 21:53]
MySQL Verification Team
Are you tested the latest 4.0.X release (now we are with 4.0.17) after your experience with 4.0.14 ? or some relevant news about hardware/OS experiences?
[21 Dec 2003 14:36]
Paul Coldrey
I remain unable to make a repeatable test case. I found two machines on which I could repeat the problem (both running RedHat 9 with software RAID). I have found other machines (running a miscellany of operating systems - including one with RH9 and software RAID) on which the performance was notably better. As noted in one of the reports it appears as if "high-power" server machines run slower than "low-power" desktops (and I feel this looks more like a RedHat issue than a MySQL one). I know this is not much help to tracking down the problem! In the end, I moved to InnoDB tables and changed the whole algorithm to get around the problem.... I have not run into any other speed issues. I would not object if this bug were closed. Thanks for all your help and comments.
[23 Jan 2004 19:35]
MySQL Verification Team
According your last comment that wasn't possible to have a test case and won't possible because you moved to InnoDB. Thank you.