Bug #56585 Slowdown of readonly sysbench benchmarks (e.g point_select) on Windows 5.5
Submitted: 6 Sep 2010 9:38
Modified: 1 Dec 2010 17:32
Reporter: Vladislav Vaintroub
Status: Closed
Category: MySQL Server: Windows
Severity: S5 (Performance)
Version: 5.5.5
OS: Microsoft Windows
Assigned to: Vladislav Vaintroub
Triage: Triaged: D2 (Serious)

[6 Sep 2010 9:38] Vladislav Vaintroub
Description:
Performance of some read-only queries (notably point_select) went down in 5.5. See the attached PDF for the graphs. Profiling shows that the #1 reason for context switches is the newly introduced rwlock in MDL.

How to repeat:
Use the sysbench OLTP read-only point_select test to measure 5.5 against 5.1.

Suggested fix:
Improve the performance of the rwlock (the one currently used is a portable implementation that is inefficient on Windows).
[6 Sep 2010 11:07] Vladislav Vaintroub
benchmark results

Attachment: 5.1.51.vs.5.5.5.vs.5.5.6.Windows-1.pdf (application/pdf, text), 463.45 KiB.

[6 Sep 2010 12:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/117600

3191 Vladislav Vaintroub	2010-09-06
      Bug#56585: Slowdown of readonly sysbench benchmarks (e.g point_select) 
      on Windows  5.5
      
      Problem:
      The reason for the bug, as identified using the xperf profiler, is heavy
      context switching in the reader-writer lock implementation. The (portable
      but slow) implementation of rwlock makes use of 2 condition variables
      (non-native, implemented with events and a critical section) and a mysql
      "mutex" (a critical section on Windows). This amounts to a heavy-weight object
      consisting of 6 Windows events and 3 critical sections. Profiling shows that
      the #1 source of context switching when running sysbench readonly tests is
      WaitForSingleObject/SetEvent on events inside the rwlock inside MDL.
      
      The solution is to use native reader/writer locks (aka slim rwlocks) on Windows,
      which are available since Vista and are very fast. A set of new rwlock-related
      functions (my_win_rwlock_xxx) is implemented in this patch; they
      will use slim reader-writer locks when available (Vista+) and fall back to
      the old implementation if not. These functions will be used for the MySQL
      reader-writer lock and also for the special pr ("prefer reader") rwlock that
      is used in MDL.
      
      How "prefer reader" functionality is implemented:
      We track the count of pending reader requests. The write lock is taken as usual,
      but if the writer finds that there are pending readers, it gives up the lock, yields
      and retries.
      
      This fulfills the MDL requirement for "prefer reader": a write lock must not
      be granted if there are pending readers.
      
      A benchmark on an 8-core machine (sysbench, point_select, 256 users) shows
      an improvement of ~30% with this patch (18000 TPS with the patch vs 14000 TPS
      without).
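The yield-and-retry scheme described in this commit message can be sketched roughly as follows. This is only an illustration in Python, not the code from the patch: `SimpleRWLock` is a hypothetical stand-in for the native slim rwlock, `PreferReaderLock` and its method names are invented for the example, and the pending-reader counter is guarded by a small mutex here where a real implementation would more likely use an atomic counter.

```python
import threading
import time


class SimpleRWLock:
    """Hypothetical stand-in for a native rwlock (no reader preference)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


class PreferReaderLock:
    """Writers back off whenever reader requests are pending."""

    def __init__(self):
        self._rw = SimpleRWLock()
        self._pending_lock = threading.Lock()  # assumption: stands in for an atomic
        self._pending_readers = 0

    def rdlock(self):
        with self._pending_lock:
            self._pending_readers += 1      # announce the request first
        self._rw.acquire_shared()
        with self._pending_lock:
            self._pending_readers -= 1      # request has been satisfied

    def rdunlock(self):
        self._rw.release_shared()

    def wrlock(self):
        while True:
            self._rw.acquire_exclusive()
            with self._pending_lock:
                pending = self._pending_readers
            if pending == 0:
                return                      # no waiting readers: keep the lock
            self._rw.release_exclusive()    # give the lock up for the readers
            time.sleep(0)                   # yield, then retry

    def wrunlock(self):
        self._rw.release_exclusive()
```

The point of the design is that the writer never holds the lock while a reader is known to be waiting; the price is that a steady stream of readers can starve writers, which is what "prefer reader" asks for.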
[6 Sep 2010 21:10] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/117647

3193 Vladislav Vaintroub	2010-09-06
      Bug#56585: Slowdown of readonly sysbench benchmarks (e.g
      point_select) 
      
      This is a test patch, based on discussions with Dmitry: implement the
      condition variable on top of native Vista condition variables, on Vista and later.
      
      This way, the existing code for "reader preferred" locks can be reused,
      accounting for all prefer-reader specifics (recursive locks etc.),
      while the overhead of using OS events is reduced.
      
      However, judging by benchmarks that compare the native rwlock
      to the one implemented on top of native conditions and to the one
      implemented on top of events, the gain from using native conditions
      is minimal, while the gain from using native rwlocks would be big.
      
      The following table shows the benchmark results
      (sysbench, 2 readonly tests: simple ranges and point_select,
      256 users, 8-core machine):
      
      benchmark     | no-fix | native rwlock | native condition
      --------------+--------+---------------+-----------------
      simple_ranges |  14051 |         17996 |            13906
      point_select  |  20576 |         29120 |            22961
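To make the comparison in the table above easier to read, the relative change against the "no-fix" column can be computed directly. The snippet below is only a convenience sketch; the `gain_percent` helper and the dictionary layout are made up for the example, and the figures are copied from the table.

```python
# Benchmark figures copied from the table in this commit message.
results = {
    "simple_ranges": {"no-fix": 14051, "native rwlock": 17996, "native condition": 13906},
    "point_select": {"no-fix": 20576, "native rwlock": 29120, "native condition": 22961},
}


def gain_percent(baseline, value):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


for name, row in results.items():
    base = row["no-fix"]
    for variant in ("native rwlock", "native condition"):
        print(f"{name:14s} {variant:17s} {gain_percent(base, row[variant]):+6.1f}%")
```

With these numbers the native rwlock comes out at roughly +28% (simple_ranges) and +42% (point_select), while the native condition variant is about -1% and +12% respectively.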
[7 Sep 2010 12:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/117703

3194 Vladislav Vaintroub	2010-09-07
      Bug#56585 : Slowdown on Windows in readonly  benchmark.
      
      Yet another take on this problem:
      implement the rwlock using 2 critical sections, an atomic variable and an event.
      This implementation is cheaper than the general one based on condition
      variables because it has less overhead.
      
      For writer-mostly scenarios, the lock is equivalent to a critical section,
      with just Enter/LeaveCriticalSection per rwlock/rwunlock pair.
      
      For concurrent reader scenarios, it is again Enter/LeaveCriticalSection
      plus an atomic increment and decrement of the reader count, except for the first lock
      and the last unlock (the first lock blocks writers via the event and the last unlock
      unblocks them, so there is a bit more overhead in those two cases).
      
      Compared to the implementation on other platforms:
      1) Reader and writer critical sections are split (so there is less
         contention).
      2) Fewer critical section lock/unlock operations in general.
      
      Performance-wise this lock seems to bring a lot, outperforming
      even native reader-writer locks, as shown in the table below
      ("no fix" is 5.5, "srwlock" is the native Vista RW lock, "thislock"
      is the implementation in this patch).
      
      benchmark     | no fix | srwlock | thislock
      --------------+--------+---------+---------
      simple_ranges |  14051 |   17996 |    18459
      point_select  |  20576 |   29120 |    29790
      
      
      Besides, this lock correctly implements "prefer readers" (in Dmitry's sense
      of it), i.e. a writer does not enter or block newly incoming
      readers, and waits until the count of all readers (pending and active) goes
      down to 0.
      
      Also, this lock is recursive, for both readers and writers.
[7 Sep 2010 17:15] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/117720

3194 Vladislav Vaintroub	2010-09-07
      Bug#56585 : Slowdown on Windows in readonly  benchmark.
      (Added a fix for a race condition while setting/resetting the event
      by the last exiting and first entering readers, spotted by Dmitry.)
      
      Yet another take on this problem:
      implement the rwlock using 2 critical sections, an atomic variable and an event.
      This implementation is cheaper than the general one based on condition
      variables because it has less overhead.
      
      For writer-mostly scenarios, the lock is equivalent to a critical section,
      with just Enter/LeaveCriticalSection per rwlock/rwunlock pair.
      
      For concurrent reader scenarios, it is again Enter/LeaveCriticalSection
      plus an atomic increment and decrement of the reader count, except for the first lock
      and the last unlock (the first lock blocks writers via the event and the last unlock
      unblocks them, so there is a bit more overhead in those two cases).
      
      Compared to the implementation on other platforms:
      1) Reader and writer critical sections are split (so there is less
         contention).
      2) Fewer critical section lock/unlock operations in general.
      
      Performance-wise this lock seems to bring a lot, outperforming
      even native reader-writer locks, as shown in the table below
      ("no fix" is 5.5, "srwlock" is the native Vista RW lock, "thislock"
      is the implementation in this patch).
      
      benchmark     | no fix | srwlock | thislock
      --------------+--------+---------+---------
      simple_ranges |  14051 |   17996 |    18459
      point_select  |  20576 |   29120 |    29309
      
      
      Besides, this lock correctly implements "prefer readers" (in Dmitry's sense
      of it), i.e. a writer does not enter or block newly incoming
      readers, and waits until the count of all readers (pending and active) goes
      down to 0.
      
      Also, this lock is recursive, for both readers and writers.
[11 Sep 2010 19:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118037

3207 Vladislav Vaintroub	2010-09-11
      Bug#56585 : Slowdown on Windows in readonly  benchmark.
      (Added a fix for a race condition while setting/resetting the event
      by the last exiting and first entering readers, spotted by Dmitry.)
      
      Yet another take on this problem:
      implement the rwlock using 2 critical sections, an atomic variable and an event.
      This implementation is cheaper than the general one based on condition
      variables because it has less overhead.
      
      For writer-mostly scenarios, the lock is equivalent to a critical section,
      with just Enter/LeaveCriticalSection per rwlock/rwunlock pair.
      
      For concurrent reader scenarios, it is again Enter/LeaveCriticalSection
      plus an atomic increment and decrement of the reader count, except for the first lock
      and the last unlock (the first lock blocks writers via the event and the last unlock
      unblocks them, so there is a bit more overhead in those two cases).
      
      Compared to the implementation on other platforms:
      1) Reader and writer critical sections are split (so there is less
         contention).
      2) Fewer critical section lock/unlock operations in general.
      
      Performance-wise this lock seems to bring a lot, outperforming
      even native reader-writer locks, as shown in the table below
      ("no fix" is 5.5, "srwlock" is the native Vista RW lock, "thislock"
      is the implementation in this patch).
      
      benchmark     | no fix | srwlock | thislock
      --------------+--------+---------+---------
      simple_ranges |  14051 |   17996 |    18459
      point_select  |  20576 |   29120 |    29309
      
      
      Besides, this lock correctly implements "prefer readers" (in Dmitry's sense
      of it), i.e. a writer does not enter or block newly incoming
      readers, and waits until the count of all readers (pending and active) goes
      down to 0.
      
      Also, this lock is recursive, for both readers and writers.
[11 Sep 2010 20:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118038

3207 Vladislav Vaintroub	2010-09-11
      Bug#56585 : Slowdown on Windows in readonly  benchmark.
      
            
      Yet another take on this problem:
      implement the rwlock using 2 critical sections and an event.
      This implementation is cheaper than the general one based on condition
      variables because it has less overhead.
      
      For writer-mostly scenarios, the lock is equivalent to a critical section,
      with just Enter/LeaveCriticalSection per rwlock/rwunlock pair.
      
      For concurrent reader scenarios, the overhead is the same as in the general
      implementation (a critical section enter/leave pair for both lock and unlock),
      plus there is extra overhead to wait for or release the writer for the first entering
      and last exiting reader.
      
      Compared to the implementation on other platforms:
      1) Reader and writer critical sections are split (so there is less
         contention).
      2) Fewer critical section lock/unlock operations for the writer.
      
      
      Performance-wise this lock seems to bring a lot, outperforming
      even native reader-writer locks, as shown in the table below
      ("no fix" is 5.5, "srwlock" is the native Vista RW lock, "thislock"
      is the implementation in this patch).
      
      benchmark     | no fix | srwlock | thislock
      --------------+--------+---------+---------
      simple_ranges |  14051 |   17996 |    18459
      point_select  |  20576 |   29120 |    29309
      
      
      Besides, this lock correctly implements "prefer readers" (in Dmitry's sense
      of it), i.e. a writer does not enter or block newly incoming
      readers, and waits until the count of all readers (pending and active) goes
      down to 0.
      
      Also, this lock is recursive, for both readers and writers.
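The general shape of a reader-preferring lock built from two mutual-exclusion objects, where only the first entering and the last exiting reader touch the writer side, can be sketched as below. This is the classic first-reader/last-reader scheme written in Python, not the code from the patch: the class and method names are invented, the sketch is not recursive, and it relies on `threading.Lock` being releasable from a thread other than the one that acquired it. A Windows critical section must be left by the thread that entered it, which is presumably why the patch pairs the critical sections with an event.

```python
import threading


class TwoLockRWLock:
    """Reader-preferring rwlock sketch: one lock for readers, one gate for writers."""

    def __init__(self):
        self._reader_cs = threading.Lock()    # protects reader_count
        self._writer_gate = threading.Lock()  # held by a writer or by the reader group
        self.reader_count = 0

    def rdlock(self):
        with self._reader_cs:
            self.reader_count += 1
            if self.reader_count == 1:
                # First entering reader shuts writers out (waits for an active writer).
                self._writer_gate.acquire()

    def rdunlock(self):
        with self._reader_cs:
            self.reader_count -= 1
            if self.reader_count == 0:
                # Last exiting reader lets writers in again.
                self._writer_gate.release()

    def wrlock(self):
        # Writer path is a single acquire, as in the writer-mostly case above.
        self._writer_gate.acquire()

    def try_wrlock(self):
        return self._writer_gate.acquire(blocking=False)

    def wrunlock(self):
        self._writer_gate.release()
```

As long as at least one reader is active, newly arriving readers only pay for the reader lock and never wait for a writer, while a writer has to wait until the reader count drops to 0.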
[13 Sep 2010 8:00] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118054

3208 Vladislav Vaintroub	2010-09-13
      Bug#56585 : use "int" instead of "volatile LONG" for reader_count.
      
      volatile LONG is the type used by InterlockedXXX operations and 
      those were removed in the latest patch. 
      
      Access to reader_count always happens inside critical sections, i.e.
      with memory barriers. Reordering by the compiler is not possible either,
      so volatile is superfluous and is now removed. LONG is replaced
      by int for symmetry and aesthetic reasons.
[21 Sep 2010 18:51] Vladislav Vaintroub
different rwlock implementations

Attachment: windows_rwlock_implementations.pdf (application/pdf, text), 250.83 KiB.

[21 Sep 2010 19:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118738

3206 Vladislav Vaintroub	2010-09-21
      Bug #56585 Slowdown of readonly sysbench benchmarks (e.g point_select)
       on Windows in 5.5
       
      This is a port of Vance Morrison's lock from "Low-Lock Techniques in
      action: Implementing a Reader-Writer lock", originally written in C# and
      available at http://blogs.msdn.com/b/vancem/archive/2006/03/28/563180.aspx
      
      This implementation features its own fast spinlock to protect the structure,
      and lazy event initialization.
      
      Changes to Morrison's implementation:
      1) "Prefer readers" mode (starving writers in the presence of readers).
      2) Writers are protected by a critical section, which helps *a lot* to reduce
      kernel-mode CPU consumption under high concurrency. This is one of the cases
      where adding additional synchronization helps performance.
      
      The likely reason for the high CPU consumption is a lock convoy: kernel objects
      like events try to be fair (FIFO), which confuses the dispatcher when
      locks are highly contended and held for a short time.
      
      Critical sections avoid lock convoys, and in our degenerate RWlock case,
      with high writer contention for very short code paths, they do not add
      much overhead to the lightweight implementation.
      
      The prior patch http://lists.mysql.com/commits/118038 was exclusively
      focused on improving writers in the absence of any readers. This one is more
      balanced and gives an improvement in different scenarios, also involving readers,
      while maintaining almost the same speed for the writers-only case.
      
      This makes the given implementation more suitable as a general-purpose rwlock;
      the prior one only fixed a very special case (prlock).
      
      A performance comparison of this implementation with the prior patch can be found in
      http://bugs.mysql.com/file.php?id=15788 (this patch is called "morrison",
      and the prior one is called "2critsec").
[22 Sep 2010 13:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118829

3206 Vladislav Vaintroub	2010-09-22
      Bug #56585 Slowdown of readonly sysbench benchmarks (e.g point_select)
       on Windows in 5.5
       
      This is a port of Vance Morrison's lock from "Low-Lock Techniques in
      action: Implementing a Reader-Writer lock", originally written in C# and
      available at http://blogs.msdn.com/b/vancem/archive/2006/03/28/563180.aspx
      
      This implementation features its own fast spinlock to protect the structure,
      and lazy event initialization.
      
      Changes to Morrison's implementation:
      1) "Prefer readers" mode (starving writers in the presence of readers).
      2) Writers are protected by a critical section, which helps *a lot* to reduce
      kernel-mode CPU consumption under high concurrency. This is one of the cases
      where adding additional synchronization helps performance.
      
      The likely reason for the high CPU consumption is a lock convoy: kernel objects
      like events try to be fair (FIFO), which confuses the dispatcher when
      locks are highly contended and held for a short time.
      
      Critical sections avoid lock convoys, and in our unusual RWlock case,
      with high writer contention for very short code paths, they do not add
      much overhead to the lightweight implementation.
      
      The prior patch http://lists.mysql.com/commits/118038 was exclusively
      focused on improving writers in the absence of any readers. This one is more
      balanced and gives an improvement in different scenarios, also involving readers,
      while maintaining almost the same speed for the writers-only case.
      
      This makes the given implementation more suitable as a general-purpose rwlock;
      the prior one only fixed a very special case (prlock).
      
      A performance comparison of this implementation with the prior patch can be found in
      http://bugs.mysql.com/file.php?id=15788 (this patch is called "morrison",
      and the prior one is called "2critsec").
[29 Sep 2010 12:20] Dmitry Lenev
Hello!

The fix for bug #56715 "Concurrent transactions + FLUSH result in sporadical unwarranted deadlock errors", which was queued into the mysql-5.5-runtime tree, solves the slowdown in the POINT_SELECT test on Windows as well. Unfortunately, it introduces a regression in the OLTP_RO/MYISAM test on this platform, so a separate follow-up patch tackling the latter issue should be developed.
[29 Sep 2010 12:28] Dmitry Lenev
For example, such a follow-up could look like:

http://lists.mysql.com/commits/119204
[3 Oct 2010 18:07] Vladislav Vaintroub
effect of native rwlock/conditions patch

Attachment: native_rwlock_condition_patch.pdf (application/pdf, text), 94.89 KiB.

[3 Oct 2010 18:13] Vladislav Vaintroub
The attachment above contains performance numbers for this patch
(OLTP_RO/MyISAM only, measured on a 4-core Win2008R2 machine).
[3 Oct 2010 18:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119786

3220 Vladislav Vaintroub	2010-10-03
      A follow-up to the patch for bug #56405 "Deadlock in the MDL deadlock
      detector". This patch addresses performance regression in OLTP_RO/MyISAM
      test on Windows introduced by the fix for bug #56405. Thus it makes
      original patch acceptable as a solution for bug #56585 "Slowdown of
      readonly sysbench benchmarks (e.g point_select) on Windows 5.5".
      
      With this patch, MySQL will use native Windows condition variables and 
      reader-writer locks  if  they are supported by the OS.
      
      This speeds up MyISAM and the effect comes mostly from using native
      rwlocks. Native conditions improve scalability with higher number of 
      concurrent users in other situations, e.g for prlocks.
      
      Benchmark numbers for this patch as measured on Win2008R2 quad
      core machine are attached to the bug report.
      ( direct link http://bugs.mysql.com/file.php?id=15883 )
      
      Note that currently we require at least Windows7/WS2008R2 for
      reader-writer locks, even though native rwlock is also available on Vista.
      Reason is that "trylock" APIs are missing on Vista, and trylock is used in
      the server (in a single place in query cache).
      
      While this patch could have been written differently, to enable the native
      rwlock optimization also on Vista/WS2008 (e.g. using native locks everywhere
      but the portable implementation in query cache), this would come at the
      expense of code clarity, as it would introduce a new "tryable" rwlock
      type to handle the Vista case.
      
      Another way to improve performance for the special case 
      (OLTP_RO/MYISAM/Vista) would be to eliminate "trylock" usage from server,
       but this is outside of the scope here.
[4 Oct 2010 0:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119787

3220 Vladislav Vaintroub	2010-10-04
      A follow-up to the patch for bug #56405 "Deadlock in the MDL deadlock
      detector". This patch addresses performance regression in OLTP_RO/MyISAM
      test on Windows introduced by the fix for bug #56405. Thus it makes
      original patch acceptable as a solution for bug #56585 "Slowdown of
      readonly sysbench benchmarks (e.g point_select) on Windows 5.5".
      
      With this patch, MySQL will use native Windows condition variables and 
      reader-writer locks  if  they are supported by the OS.
      
      This speeds up MyISAM and the effect comes mostly from using native
      rwlocks. Native conditions improve scalability with higher number of 
      concurrent users in other situations, e.g for prlocks.
      
      Benchmark numbers for this patch as measured on Win2008R2 quad
      core machine are attached to the bug report.
      ( direct link http://bugs.mysql.com/file.php?id=15883 )
      
      Note that currently we require at least Windows7/WS2008R2 for 
      reader-writer locks, even though native rwlock is available also on Vista.
      Reason is that "trylock" APIs are missing on Vista, and trylock is used in
      the server (in a single place in query cache).
      
      While this patch could have been written differently, to enable the native
      rwlock optimization also on Vista/WS2008 (e.g using native locks everywhere
      but portable implementation in query cache), this would come at the 
      expense of the code clarity, as it would introduce a new  "try-able" rwlock
      type, to handle Vista case.
      
      Another way to improve performance for the special case 
      (OLTP_RO/MYISAM/Vista) would be to eliminate "trylock" usage from server,
       but this is outside of the scope here.
      
      
      Native condition variables are used beginning with Vista, though the effect
      of using condition variables alone is not measurable in this benchmark.
      But when used together with native rwlocks on Win7, native conditions improve
      performance in high-concurrency OLTP_RO/MyISAM (128 and more sysbench
      users).
[4 Oct 2010 11:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119817

3220 Vladislav Vaintroub	2010-10-04
      A follow-up to the patch for bug #56405 "Deadlock in the MDL deadlock
      detector". This patch addresses performance regression in OLTP_RO/MyISAM
      test on Windows introduced by the fix for bug #56405. Thus it makes
      original patch acceptable as a solution for bug #56585 "Slowdown of
      readonly sysbench benchmarks (e.g point_select) on Windows 5.5".
      
      With this patch, MySQL will use native Windows condition variables and 
      reader-writer locks  if  they are supported by the OS.
      
      This speeds up MyISAM and the effect comes mostly from using native
      rwlocks. Native conditions improve scalability with higher number of 
      concurrent users in other situations, e.g for prlocks.
      
      Benchmark numbers for this patch as measured on Win2008R2 quad
      core machine are attached to the bug report.
      ( direct link http://bugs.mysql.com/file.php?id=15883 )
      
      Note that currently we require at least Windows7/WS2008R2 for 
      reader-writer locks, even though native rwlock is available also on Vista.
      Reason is that "trylock" APIs are missing on Vista, and trylock is used in
      the server (in a single place in query cache).
      
      While this patch could have been written differently, to enable the native
      rwlock optimization also on Vista/WS2008 (e.g using native locks everywhere
      but portable implementation in query cache), this would come at the 
      expense of the code clarity, as it would introduce a new  "try-able" rwlock
      type, to handle Vista case.
      
      Another way to improve performance for the special case 
      (OLTP_RO/MYISAM/Vista) would be to eliminate "trylock" usage from server,
       but this is outside of the scope here.
      
      
      Native condition variables are used beginning with Vista, though the effect
      of using condition variables alone is not measurable in this benchmark.
      But when used together with native rwlocks on Win7, native conditions improve
      performance in high-concurrency OLTP_RO/MyISAM (128 and more sysbench
      users).
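The selection rule described in this commit message (native condition variables from Vista on, native rwlocks only from Windows 7/WS2008R2 on because of the missing "trylock" APIs) can be summarized as a small decision function. This is an illustrative sketch only: the `choose_primitives` name and the idea of keying on the OS version number are assumptions made for the example, and the actual patch may detect the features in another way, for instance by probing for the API entry points at startup.

```python
def choose_primitives(major, minor):
    """Pick lock implementations from a Windows version number.

    Version numbers: XP is 5.1, Vista/WS2008 is 6.0, Windows 7/WS2008R2 is 6.1.
    """
    version = (major, minor)
    # Native condition variables exist from Vista on.
    conditions = "native" if version >= (6, 0) else "portable"
    # Native rwlocks also exist on Vista, but the try-lock calls only
    # arrive with Windows 7, and the server needs trylock in one place.
    rwlocks = "native" if version >= (6, 1) else "portable"
    return {"conditions": conditions, "rwlocks": rwlocks}


for name, ver in (("XP", (5, 1)), ("Vista", (6, 0)), ("Windows 7", (6, 1))):
    print(name, choose_primitives(*ver))
```

Under these assumptions Vista ends up with native conditions on top of the portable rwlock, which matches the note above that the effect of condition variables alone was not measurable in this benchmark.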
[4 Oct 2010 11:52] Vladislav Vaintroub
queued to 5.5-bugteam, trunk-merge
[9 Nov 2010 19:43] Bugs System
Pushed into mysql-5.5 5.5.7-rc (revid:sunanda.menon@sun.com-20101109182959-otkxq8vo2dcd13la) (version source revid:sunanda.menon@sun.com-20101109182959-otkxq8vo2dcd13la) (merge vers: 5.5.7-rc) (pib:21)
[13 Nov 2010 16:04] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113152450-2zzcm50e7i4j35v7) (merge vers: 5.6.1-m4) (pib:21)
[13 Nov 2010 16:29] Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (pib:21)
[1 Dec 2010 17:32] Tony Bedford
An entry has been added to the 5.5.7 changelog:

        Performance for certain read-only queries, in particular
        point_select, had deteriorated compared to
        previous versions.
[25 Dec 2010 13:41] James Day
Most of the discussion here seems to be for MyISAM. Does this affect other storage engines, particularly InnoDB?
[25 Dec 2010 14:43] Dmitry Lenev
Hello James!

Yes, InnoDB was affected. Actually, this problem was originally reported for sysbench POINT_SELECT test for InnoDB engine (and to be perfectly honest I am not sure that it was repeatable for MyISAM).

Problem for POINT_SELECT/InnoDB was solved by fix for bug #56405.

But unfortunately this fix introduced small performance regression in OLTP_RO/MyISAM test. Vlad's follow-up patch has fixed this regression.

Please note that both the original problem with POINT_SELECT/InnoDB and the second problem with OLTP_RO/MyISAM were repeatable only on Windows.

Hope this makes things a bit more clear!
[25 Dec 2010 15:17] James Day
Thanks Dmitri!