Bug #38740 DBT3 Q13 query slow down so much in a concurrency environment when using Falcon
Submitted: 12 Aug 2008 8:16 Modified: 26 May 2010 17:50
Reporter: Xuekun Hu Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S5 (Performance)
Version:6.0-falcon, rev. 2757 OS:Linux (SLES10SP1 (2.6.16.46-0.12-smp))
Assigned to: Kelly Long CPU Architecture:Any
Tags: DBT3, F_PERFORMANCE

[12 Aug 2008 8:16] Xuekun Hu
Description:
I used DBT3 workload to test MySQL/Falcon performance on DP Quad-core platform. With 50 threads running Q13 simultaneously, the average execution time for Q13 is 360 seconds, compared with 6 seconds for only one thread running. Q13 will slow down 60x! 

User time: 28.76%; while Kernel time: 71.05%.

Oprofile data shows do_futex is No.1 in kernel. 

Cycles%	Module	Function
40.21	vmlinux-2.6.16.46-0.12-smp	do_futex
9.52	vmlinux-2.6.16.46-0.12-smp	__up_read
8.50	vmlinux-2.6.16.46-0.12-smp	__down_read
2.29	vmlinux-2.6.16.46-0.12-smp	schedule
2.13	libc-2.4.so	memset
2.11	vmlinux-2.6.16.46-0.12-smp	drop_key_refs
1.95	libpthread-2.4.so	__pthread_mutex_unlock_usercnt
1.72	vmlinux-2.6.16.46-0.12-smp	try_to_wake_up
1.68	libpthread-2.4.so	pthread_mutex_lock
1.46	mysqld	SyncObject::lock(Sync*, LockType, int)
1.40	mysqld	MemMgr::alloc(int)
1.26	libpthread-2.4.so	__lll_mutex_lock_wait
1.23	mysqld	Table::fetch(int)
1.13	mysqld	MemMgr::releaseBlock(MemBlock*)
1.06	mysqld	Bitmap::nextSet(int)
0.80	mysqld	Sync::lock(LockType)
0.77	mysqld	Bitmap::set(int)
…	…	…

GDB data shows mostly threads are waiting at: 

Thread 39 (Thread 1170053440 (LWP 5889)):	
#0  0x00002b847b7b7548 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0	
#1  0x00002b847b7ba290 in __gxx_personality_v0 () from /lib64/libpthread.so.0	
#2  0x00002aaac3f07bf8 in ?? ()	
#3  0x00002b847b7b389b in __gxx_personality_v0 () from /lib64/libpthread.so.0	
#4  0x0000000000aa19d0 in vtable for Value ()	
#5  0x00002aac00000000 in ?? ()	
#6  0x00002aac045fdd90 in ?? ()	
#7  0x00002b847b7ba290 in __gxx_personality_v0 () from /lib64/libpthread.so.0	
#8  0x0000000000aa19d0 in vtable for Value ()	
#9  0x00002b8400000000 in ?? ()	
#10 0x00002b847b7ba588 in __PRETTY_FUNCTION__.6075 () from /lib64/libpthread.so.0	
#11 0x0000000000000001 in ?? ()	
#12 0x00002b847b7b4789 in __gxx_personality_v0 () from /lib64/libpthread.so.0	
#13 0x00002aac00000000 in ?? ()	
#14 0x00000000011fe530 in ?? ()	
#15 0x00002b847beaaa90 in memmove () from /lib64/libc.so.6	
#16 0x0000000000dedc68 in memoryManager ()	
#17 0x0000000045bd65c0 in ?? ()	
#18 0x0000000000789af5 in Mutex::lock ()	
#19 0x0000000000797341 in Sync::lock ()	
#20 0x000000000078910d in MemMgr::releaseBlock ()	
#21 0x00000000007b3d8a in Bitmap::deleteVector ()	
#22 0x00000000007b3d25 in Bitmap::deleteVector ()	
#23 0x00000000007b3dd5 in Bitmap::clear ()	
#24 0x00000000007e032f in Index::scanIndex ()	
#25 0x0000000000792a02 in StorageTable::indexScan ()	
#26 0x0000000000782d32 in StorageInterface::index_read ()	
#27 0x0000000000678305 in join_read_always_key ()	
#28 0x000000000066a497 in sub_select ()	
#29 0x000000000066a2ac in evaluate_join_record ()	
#30 0x000000000066a453 in sub_select ()	
#31 0x000000000066adfd in do_select ()	
#32 0x0000000000681e06 in JOIN::exec ()	
#33 0x0000000000683dc0 in mysql_select ()	
#34 0x000000000073b2a7 in mysql_derived_filling ()	
#35 0x000000000073b128 in mysql_handle_derived ()	
#36 0x0000000000650e75 in open_and_lock_tables_derived ()	
#37 0x000000000060a0f5 in execute_sqlcom_select ()	
#38 0x00000000006107ca in mysql_execute_command ()	
#39 0x0000000000614e84 in mysql_parse ()	
#40 0x000000000061577d in dispatch_command ()	
#41 0x0000000000616247 in do_command ()	
#42 0x0000000000607744 in handle_one_connection ()	
#43 0x00002b847b7b1143 in start_thread () from /lib64/libpthread.so.0	
#44 0x00002b847bef674d in clone () from /lib64/libc.so.6	
#45 0x0000000000000000 in ?? ()	

and 

Thread 38 (Thread 1170319680 (LWP 5890)):
#0  0x00002b847b7b7548 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
#1  0x00002b847b7ba290 in __gxx_personality_v0 () from /lib64/libpthread.so.0
#2  0x0000000000bc614e in __gxx_personality_v0 ()
#3  0x00002b847b7b389b in __gxx_personality_v0 () from /lib64/libpthread.so.0
#4  0x00002aac0462c700 in ?? ()
#5  0x00002aac04557200 in ?? ()
#6  0x0000000100000000 in ?? ()
#7  0x00002aaac3e86780 in ?? ()
#8  0x00000000ffffffff in ?? ()
#9  0x00000000ffffffff in ?? ()
#10 0x00002b847b7ba588 in __PRETTY_FUNCTION__.6075 () from /lib64/libpthread.so.0
#11 0x0000000000000001 in ?? ()
#12 0x00002b847b7b4789 in __gxx_personality_v0 () from /lib64/libpthread.so.0
#13 0x00002aabd34f303c in ?? ()
#14 0x0000000000000005 in ?? ()
#15 0x00002aabdded2b30 in ?? ()
#16 0x00002aac0462c6e0 in ?? ()
#17 0x0000000000dedbc8 in memoryManager ()
#18 0x00000000ffffffff in ?? ()
#19 0x0000000045c14a90 in ?? ()
#20 0x00002aaac3ebc840 in ?? ()
#21 0x0000000000dedc68 in memoryManager ()
#22 0x0000000045c14a90 in ?? ()
#23 0x0000000000789af5 in Mutex::lock ()
#24 0x0000000000797341 in Sync::lock ()
#25 0x0000000000788596 in MemMgr::alloc ()
#26 0x00000000007893cb in MemMgrAllocate ()
#27 0x00000000007b46a9 in Bitmap::set ()
#28 0x00000000007e605d in IndexRootPage::scanIndex ()
#29 0x00000000007e051b in Index::scanIndex ()
#30 0x0000000000792a02 in StorageTable::indexScan ()
#31 0x0000000000782d32 in StorageInterface::index_read ()
#32 0x0000000000678305 in join_read_always_key ()
#33 0x000000000066a497 in sub_select ()
#34 0x000000000066a2ac in evaluate_join_record ()
#35 0x000000000066a453 in sub_select ()
#36 0x000000000066adfd in do_select ()
#37 0x0000000000681e06 in JOIN::exec ()
#38 0x0000000000683dc0 in mysql_select ()
#39 0x000000000073b2a7 in mysql_derived_filling ()
#40 0x000000000073b128 in mysql_handle_derived ()
#41 0x0000000000650e75 in open_and_lock_tables_derived ()
#42 0x000000000060a0f5 in execute_sqlcom_select ()
#43 0x00000000006107ca in mysql_execute_command ()
#44 0x0000000000614e84 in mysql_parse ()
#45 0x000000000061577d in dispatch_command ()
#46 0x0000000000616247 in do_command ()
#47 0x0000000000607744 in handle_one_connection ()
#48 0x00002b847b7b1143 in start_thread () from /lib64/libpthread.so.0
#49 0x00002b847bef674d in clone () from /lib64/libc.so.6
#50 0x0000000000000000 in ?? ()

How to repeat:
Running DBT3 Q13 query with 50 threads running. :-) 

Suggested fix:
I analyzed that it is due to heavy contention in MemMgr manager buffer, coming from the allocation and release from BitMap.cpp when handling the inner loops of joins.
[20 Aug 2008 23:02] Kevin Lewis
This performance enhancement will be assigned to Kelly Long to determine ways to tweak the Falcon memory manager into allowing better concurrency.  Doint thread local memory allocations and dividing the memory pools into more than just the current 'small' and 'large' allocations are possibilities.

If this problem is mostly limited to Bitmaps, then a local solution can be made to the Bitmap class.  It could allocate a larger piece of memory and subdivide it, using portions as needed.  Each Bitmap object can keep a short list of free sparse map blocks this way.  Other classes already do this.  See DeferredIndex::alloc() as an example.
[29 Aug 2008 2:35] Kevin Lewis
Setting to verified due the extensive and well documented testing by the China Intel team.