| Bug #37424 | Considerable performance degradation in Falcon from 6.0.4 to 6.0.5 | ||
|---|---|---|---|
| Submitted: | 16 Jun 2008 8:20 | Modified: | 26 May 2010 17:50 |
| Reporter: | Philip Stoev | Email Updates: | |
| Status: | Unsupported | Impact on me: | |
| Category: | MySQL Server: Falcon storage engine | Severity: | S2 (Serious) |
| Version: | 6.0.5 | OS: | Any |
| Assigned to: | Kelly Long | CPU Architecture: | Any |
| Tags: | F_PERFORMANCE | ||
[16 Jun 2008 12:26]
Philip Stoev
720-second sysbench test with 10 users, falcon_lock_wait_timeout=0:
build-team 6.0.4 debug: read/write requests: 3570408 (4958.49 per sec.)
build-team 6.0.4 nondebug: read/write requests: 3540522 (4917.05 per sec.)
build-team 6.0.5 debug: read/write requests: 688736 (956.21 per sec.)
build-team 6.0.5 nondebug: read/write requests: 714368 (991.55 per sec.)
self-compiled 6.0.5 debug: read/write requests: 690032 (957.96 per sec.)
self-compiled 6.0.5 nondebug: read/write requests: 681630 (946.31 per sec.)
self-compiled 6.0-falcon nondebug: read/write requests: 713676 (990.77 per sec.)
The -O3 flag appears to kick in properly for all nondebug builds.
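One way to sanity-check that -O3 is really in effect is to inspect the flags recorded by the build. This is a minimal sketch assuming an autotools build tree (the exact build setup used for these binaries is not stated in the report):
# Show the compiler flags recorded for this tree; a nondebug build is
# expected to carry -O3 and no debug-only options.
grep -m1 'CXXFLAGS' Makefile
grep -n -- '-O3' config.log | head -5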
[19 Jun 2008 7:49]
Mats Kindahl
From looking at the description of the bug, there are a few things that
I find strange (and that hint it is not a compiler problem).
- The two functions SyncObject::lock() and ::unlock() have risen
significantly. Now, I don't know what the sync object does, but I
assume that it is some form of mutex. The fact that they have risen
hints at some kind of congestion on the locks, which can be caused by
a number of factors, including at least:
- Wrong phase of the moon. Just bad luck with this run, is it
repeatable? Given enough time, even unlikely events happen.
- Competing for memory or disk. Now, it is hard to tell since I don't
know if the figures are real time or CPU time. It would be nice to
have the real time, but also the user time and system time, since
that helps identify whether this is an OS issue (e.g., thrashing on
disk) or something else (see the diagnostic sketch at the end of this comment).
- Bad use of locks that prevents the system from leaving the
pathological state once it's there.
- The ::addRef() and ::lock() and ::unlock() have risen together. Since I
cannot envision that addRef() does anything complicated that could
cause an increase in CPU usage, I draw the conclusion that the
figures are based on real time, not CPU time.
This hints at some form of memory (or disk) convoying/caravaning where
the same memory is being accessed by several threads.
The relatively small increase suggests that it is memory that is the
accessed resource, not disk.
- The fact that some record manipulation functions have entered the arena
hints at memory (or disk) convoying/caravaning as well.
- This is not related to the problem at hand, but what on earth are the
Bitmap manipulation functions doing in the list? They should be
inlined because they are usually small and executed often. Are the
definitions available in the header files so that the compiler has a
chance to inline them? (The nm check in the sketch at the end of this
comment is one way to verify.)
- Also note that the index scanning function has appeared in the list.
If it is a compiler problem, you are likely to see a repeatable
degradation. Compilers are pretty predictable in this aspect: they will
follow the same rules when generating the code, so there is no
randomness involved. Considering that you have supplied -O3, I would
rule out code generation issues as the cause of a degradation of this magnitude.
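Two quick diagnostics that would answer the timing and inlining questions above. This is a minimal sketch assuming a Linux host with GNU time and binutils; the mysqld path and sysbench arguments are illustrative (the full command lines appear later in this report):
# Capture wall-clock, user, and system time for one sysbench run, to separate
# CPU-bound lock contention from time spent blocked in the OS (e.g., on disk).
/usr/bin/time -v ./sysbench --test=oltp --mysql-port=9306 --mysql-user=root \
  --mysql-table-engine=Falcon --max-requests=0 --max-time=720 --num-threads=10 run
# List the Bitmap methods that exist as out-of-line symbols in the binary;
# if the hot accessors show up here (and in oprofile), the compiler was not
# able to inline them, e.g. because their definitions are not in a header.
nm -C --defined-only ./sql/mysqld | grep 'Bitmap::'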
[1 Jul 2008 19:16]
Kelly Long
---- summary -----
I did some comparisons on this to find the changeset that caused the slowdown. Vlad will look at this next. The changeset causing this slowdown:
revno: 2597.22.1
revision-id: sp1r-vvaintroub@wva.-20080328064007-14330
parent: sp1r-hakank/hakan@lu0011.(none)-20080322192653-14335
committer: vvaintroub@wva.
timestamp: Fri 2008-03-28 07:40:07 +0100
message:
WL#3763: Supernodes
- If insert is done in decreasing order of values, no supernodes were
created. Now fixed - if index node is inserted at the start
of the page, next node can become supernode, if the gap between the
start of the page and first supernode is too large.
- Allow supernode on duplicate keys.
- Do not always delete supernode if previous node is deleted, but check
whether saving it makes sense.
---- detail -----
Below you'll find a list of revision ids, with a sysbench result on the line below each revision I actually tested (all runs gave results similar to the one I chose to list). There is also a degradation issue as the test continues; you can see this in the places where I ran the test 3 times without restarting the database. The results below a "no supernodes" comment are from a slightly modified code base in which I turned off supernodes (always return false in the checkAddSupernode() routine). A sketch of how one such per-revision run can be scripted appears after the list.
cd mysql-6.0
bzr log --show-ids | egrep 'revision-id:|tags:' > /tmp/bzr-6.0-logs-show-ids.2
revision-id: sp1r-cbell/Chuck@mysql_cab_desk.-20080411171935-14416
# transactions: 31029 (43.08 per sec.)
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080410190122-10487
--- cut ---
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080330122626-54363
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080330095349-54376
# transactions: 30751 (42.69 per sec.)
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080330094706-53538
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080330090700-53665
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080330095259-57282
revision-id: sp1r-cpowers@xeno.mysql.com-20080330072429-57288
revision-id: sp1r-cpowers@xeno.mysql.com-20080329182218-57524
revision-id: sp1r-cpowers@xeno.mysql.com-20080328232654-57519
revision-id: sp1r-cpowers@xeno.mysql.com-20080328224644-57519
revision-id: sp1r-cpowers@xeno.mysql.com-20080328224436-22681
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080328201039-56444
# transactions: 96533 (134.06 per sec.)
# transactions: 61482 (85.39 per sec.)
# transactions: 45112 (62.63 per sec.)
###### no supernodes
# transactions: 128358 (178.27 per sec.)
# transactions: 128162 (178.00 per sec.)
# transactions: 129308 (179.59 per sec.)
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080328183802-56447
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080328182226-61590
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080328175251-56251
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080328174917-56243
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080328171351-61576
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080328164232-03126
revision-id: sp1r-klewis@klewis-mysql.-20080328153527-21606
# transactions: 78300 (108.74 per sec.)
# transactions: 41569 (57.73 per sec.)
# transactions: 35202 (48.88 per sec.)
###### no supernodes 20080328153527
# transactions: 128828 (178.92 per sec.)
# transactions: 128891 (179.01 per sec.)
# transactions: 129209 (179.44 per sec.)
revision-id: sp1r-vvaintroub@wva.-20080328065431-25981
revision-id: sp1r-vvaintroub@wva.-20080328064007-14330
revision-id: sp1r-klewis@klewis-mysql.-20080328042148-22090
# transactions: 142006 (197.21 per sec.)
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080327203449-26939
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080327114539-60369
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080326193327-64165
--- cut ---
revision-id: sp1r-hakank/hakan@lu0011.(none)-20080313141037-15204
revision-id: sp1r-cbell/Chuck@mysql_cab_desk.-20080407153214-28468
# transactions: 152292 (211.50 per sec.)
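For reference, a rough sketch of how one such per-revision run might be scripted. It assumes a bzr working tree of mysql-6.0, one of the stock BUILD/ scripts, and an already prepared sysbench oltp table; the revision ids shown are taken from the list above, but the build script and the shutdown step are assumptions, not the exact procedure used here:
for rev in sp1r-klewis@klewis-mysql.-20080328042148-22090 \
           sp1r-vvaintroub@wva.-20080328064007-14330; do
  # Put the working tree at the candidate revision and rebuild.
  bzr revert -r revid:$rev
  BUILD/compile-pentium-max
  # Start the server on port 9306 and run the timed sysbench test.
  (cd mysql-test && ./mysql-test-run.pl --start-and-exit \
     --mysqld=--falcon_lock_wait_timeout=0)
  ./sysbench --test=oltp --mysql-port=9306 --mysql-user=root --mysql-db=test \
    --mysql-table-engine=Falcon --max-requests=0 --max-time=600 \
    --num-threads=10 run | grep 'transactions:'
  # Stop the server before moving on to the next revision.
  mysqladmin --protocol=tcp --port=9306 --user=root shutdown
done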
[21 Jul 2008 15:12]
Philip Stoev
Here are some more results with a freshly cloned mysql-6.0-falcon:
no supernodes:
transactions: 18326 (61.04 per sec.)
deadlocks: 7969 (26.54 per sec.)
read/write requests: 475194 (1582.86 per sec.)
other operations: 44621 (148.63 per sec.)
with supernodes:
transactions: 18259 (60.81 per sec.)
deadlocks: 8123 (27.05 per sec.)
read/write requests: 476286 (1586.17 per sec.)
other operations: 44641 (148.67 per sec.)
Notice there is no difference in the results and that they are somewhere between the "good" and the "bad" values reported initially in this report. All tests were done on the same machine.
[21 Jul 2008 17:07]
Kevin Lewis
Philip, please try --falcon_use_supernodes=0 instead of --falcon_use_supernodes=off.
[21 Jul 2008 17:27]
Philip Stoev
Result with --falcon_use_supernodes=0 --mysqld=--skip-falcon-use-supernodes:
transactions: 18390 (61.26 per sec.)
deadlocks: 7997 (26.64 per sec.)
read/write requests: 476912 (1588.58 per sec.)
other operations: 44777 (149.15 per sec.)
[21 Jul 2008 17:34]
Philip Stoev
Same results were observed when running with just --falcon_use_supernodes=0
[21 Jul 2008 17:38]
Kevin Lewis
Kelly, Please take another look.
[21 Jul 2008 17:57]
Philip Stoev
A 720-second test with --falcon_use_supernodes=0 (previous tests were 600 seconds):
transactions: 28323 (39.32 per sec.)
deadlocks: 12303 (17.08 per sec.)
read/write requests: 734253 (1019.45 per sec.)
other operations: 68949 (95.73 per sec.)
[14 Nov 2008 11:19]
Philip Stoev
Sysbench "classic" 0.4.8 was used for this bug, as downloadable from http://sourceforge.net/projects/sysbench , test name OLTP.
[14 Nov 2008 12:18]
Philip Stoev
Command lines:
./mysql-test-run.pl \
  --start-and-exit \
  --mysqld=--log-output=file \
  --mysqld=--falcon_lock_wait_timeout=0
./sysbench \
  --test=oltp \
  --mysql-host=localhost \
  --mysql-port=9306 \
  --mysql-user=root \
  --mysql-db=test \
  --mysql-table-engine=Falcon \
  --mysql-engine-trx=yes \
  --oltp-table-size=100000 \
  prepare
./sysbench \
  --test=oltp \
  --mysql-host=localhost \
  --mysql-port=9306 \
  --mysql-user=root \
  --mysql-db=test \
  --mysql-table-engine=Falcon \
  --mysql-engine-trx=yes \
  --max-requests=0 \
  --max-time=720 \
  --num-threads=10 \
  run

Description:
When running sysbench on 6.0.5, the numbers are considerably smaller than they are for 6.0.4. This is on a 2-core machine, debug binary, falcon_lock_wait_timeout=0, 720-second test with 10 threads:
6.0.4:
transactions: 67852 (94.23 per sec.)
deadlocks: 32844 (45.61 per sec.)
read/write requests: 1815620 (2521.53 per sec.)
6.0.5:
transactions: 28845 (40.05 per sec.)
deadlocks: 13293 (18.46 per sec.)
read/write requests: 760609 (1055.98 per sec.)
InnoDB does not exhibit this regression. Both debug and non-debug binaries were tested on both 2-core and 8-core machines. Both 10 and 100 threads were tested, and both 10-minute and 3-hour tests were performed with various table sizes.
Oprofile shows the following as the most used functions.
For 6.0.4:
samples  %        image name         symbol name
509411   41.4194  no-vmlinux         (no symbols)
46598    3.7888   mysqld             Record::addRef()
36387    2.9586   mysqld             my_hash_sort_simple
30671    2.4938   mysqld             SyncObject::lock(Sync*, LockType, int)
19435    1.5802   mysqld             SyncObject::unlock(Sync*, LockType)
16090    1.3083   mysqld             Table::fetch(int)
16077    1.3072   libc-2.8.so        malloc
15162    1.2328   mysqld             _db_enter_
14907    1.2121   mysqld             IndexPage::findNodeInLeaf(IndexKey*, IndexKey*)
14682    1.1938   libpthread-2.8.so  pthread_getspecific
14212    1.1556   mysqld             Bitmap::nextSet(int)
14200    1.1546   mysqld             code_state
For 6.0.5:
samples  %        image name         symbol name
338705   26.1383  no-vmlinux         (no symbols)
127634   9.8497   mysqld             Record::addRef()
113885   8.7887   mysqld             SyncObject::lock(Sync*, LockType, int)
84493    6.5204   mysqld             SyncObject::unlock(Sync*, LockType)
51897    4.0050   mysqld             Table::fetch(int)
40054    3.0910   mysqld             RecordVersion::fetchVersion(Transaction*)
38903    3.0022   mysqld             Bitmap::nextSet(int)
30008    2.3158   mysqld             IndexRootPage::scanIndex(Dbb*, int, int, IndexKey*, IndexKey*, int, unsigned int, Bitmap*)
28264    2.1812   mysqld             Record::release()
26865    2.0732   mysqld             Bitmap::set(int)
26419    2.0388   mysqld             Record::poke()
25459    1.9647   mysqld             Sync::lock(LockType)
22700    1.7518   mysqld             StorageDatabase::nextIndexed(StorageTable*, void*, int, bool)
14938    1.1528   mysqld             my_hash_sort_simple
14269    1.1012   libpthread-2.8.so  pthread_mutex_lock
14268    1.1011   mysqld             RecordLeaf::fetch(int)
11725    0.9048   [vdso] (tgid:26275 range:0x110000-0x111000)  (no symbols)
11605    0.8956   libpthread-2.8.so  __pthread_mutex_unlock_usercnt
11138    0.8595   mysqld             Sync::Sync(SynchronizationObject*, char const*)
10459    0.8071   mysqld             Bitmap::decompose(int, unsigned int*)
8485     0.6548   mysqld             Sync::~Sync()
8352     0.6445   libc-2.8.so        malloc
8267     0.6380   mysqld             code_state
Please note how many Falcon functions have risen to the top of the report at the expense of non-Falcon functions such as my_hash_sort_simple, malloc, and code_state.
How to repeat:
Run sysbench against 6.0.5.
Suggested fix:
It seems to me there is a compilation issue.
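For reference, a minimal sketch of how a per-symbol profile like the two above can be collected with classic oprofile; the mysqld path is a placeholder and the session handling (system-wide profiling without kernel symbols) is an assumption, not the exact procedure used here:
opcontrol --init
opcontrol --no-vmlinux     # matches the "no-vmlinux" rows above
opcontrol --start
# ... run the 720-second sysbench test against the server here ...
opcontrol --stop
opreport --symbols /path/to/mysqld
opcontrol --shutdown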