Bug #39253 Large query cache still freezes server after fix for bug #21074
Submitted: 4 Sep 2008 20:47 Modified: 21 Nov 12:27
Reporter: Sveta Smirnova
Status: Patch queued
Category:Server: Query Cache Severity:S2 (Serious)
Version:5.1 bzr OS:Any
Assigned to: Kristofer Pettersson Target Version:5.1+
Triage: Triaged: D2 (Serious) / R2 (Low) / E2 (Low)

[4 Sep 2008 20:47] Sveta Smirnova
Description:
Large query cache still freezes server after bug #21074 has been fixed if one issues
UPDATE query.

How to repeat:
Set query_cache_size to 2G, then use test case from bug #21074, but replace insert query
with update:

$cat diff.txt 
--- qc01_orig.c 2008-09-04 20:43:31.000000000 +0200
+++ update.c    2008-08-30 16:24:09.000000000 +0200
@@ -22,7 +22,7 @@
 char username[]="root";
 char password[]="";
 char database[]="test";
-int port=3306;
+int port=33051;
 pthread_t pthreads[NUMTHREADS];
 pthread_t inserter;
 
@@ -82,7 +82,7 @@
 
        c=shortquery;
        c+=sprintf(c,"set global query_cache_size=%d",QUERYCACHESIZE);
-       db_query(dbc,shortquery,1);
+       //db_query(dbc,shortquery,1);
 
        c=shortquery;
        c+=sprintf(c,"%s","set global query_cache_type=1");
@@ -113,7 +113,7 @@
                 db_query(dbc,shortquery,1);
                 
                 c=shortquery;
-                c+=sprintf(c,"create table `qc01`.`q%04d`(`a` mediumint unsigned not
null primary key)engine=myisam",i);
+                c+=sprintf(c,"create table `qc01`.`q%04d`(`a` mediumint unsigned not
null primary key, b int default 0)engine=myisam",i);
                 db_query(dbc,shortquery,1);
         }
         
@@ -298,7 +298,8 @@
                 for(i=0;i<NUMTABLES/(100/PERCENTTABLESMODIFY);i++)
                 {
                         c=shortquery;
-                        c+=sprintf(c,"insert ignore into `qc01`.`q%04d`(`a`) values
(%lu)",(int)(lrand48()%NUMTABLES),(lrand48() % (ROWSPERTABLE*2)));
+                        //c+=sprintf(c,"insert ignore into `qc01`.`q%04d`(`a`) values
(%lu)",(int)(lrand48()%NUMTABLES),(lrand48() % (ROWSPERTABLE*2)));
+                       c+=sprintf(c, "update `qc01`.`q%04d` set b=%lu where
a=%lu",(int)(lrand48()%NUMTABLES),(lrand48() % (ROWSPERTABLE*2)),(lrand48() %
(ROWSPERTABLE*2)));
                         db_query(dbc,shortquery,1);
                }
 
@@ -393,3 +394,4 @@
        } while (0==moreresult);
        return 1;
 }
+

Then run tests.

I had 2 delays during running single test: 12 and 90 sec

Suggested fix:
Don't hang
[4 Sep 2008 20:47] Sveta Smirnova
Link to the original test: http://bugs.mysql.com/file.php?id=7219 Shane posted it
privately for some reason.
[23 Apr 18:48] Konstantin Osipov
Kristofer, can we discuss on IRC what's causing this? Assigning to you so that you just
see this message.
[28 Apr 12:24] Kristofer Pettersson
Progress notes
I've ran the test a few times. The last time I also created an artificial wait of 6.5s on
each UPDATE induced query cache invalidation. The resulting graph can be viewed in an
attachment. I don't see any evidence of a systematic error like the one in 21074, ie
there is no hang among the select threads. The SELECT intensity drops but that is
probably CPU-bound on my computer rather than a result of any threads being stuck. I was
expecting to see a 6+s window in the select count graph but there isn't any indication of
that.
[28 Apr 12:24] Kristofer Pettersson
Queries in qcache

Attachment: queries_in_qcache.png (image/png, text), 14.04 KiB.

[28 Apr 12:25] Kristofer Pettersson
SELECT intensity

Attachment: SELECT-intensity-qcache.png (image/png, text), 27.31 KiB.

[28 Apr 16:39] Kristofer Pettersson
Simple dtracing produces an average 0.02s execution time on mysql_parse with only a very
few 0.7s and 1.3s worst case scenarios (less than 10 in 240M).

#!/usr/sbin/dtrace -qws

pid$target::*mysql_parse*:entry
{
  self->ts= timestamp;
}

pid$target::*mysql_parse*:return
/self->ts != 0/
{
  printf("%u\n", timestamp - self->ts);
  self->t = 0;
}
[29 May 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[29 May 7:10] Sveta Smirnova
Kristofer,

which feedback do you need?
[30 Jun 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[3 Jul 2:13] James Day
Kristofer, please say why you have changed this bug from verified to can't repeat. Do you
need Sveta to help you find a test case that demonstrates the problem?
[6 Jul 14:46] Kristofer Pettersson
Yes please, submit a test case I can use.
[7 Jul 8:05] Sveta Smirnova
tests used

Attachment: sources.tar (application/x-tar, text), 70.00 KiB.

[7 Jul 8:20] Sveta Smirnova
Tests used uploaded.

Measured blocking time INSERT and UPDATE. Results will be uploaded. See second column
"selects since last printout" Sometimes it decreases to 0:

1992.389771,0,      401017,  ...
1995.390991,0,      519168,  ...
[7 Jul 8:51] Sveta Smirnova
test result with 50mb query cache

Attachment: out_witqc2g.tar.gz (application/x-gzip, text), 161.76 KiB.

[7 Jul 8:52] Sveta Smirnova
Sorry, 500 MB query cache
[7 Jul 9:47] Kristofer Pettersson
Locked select threads

Attachment: bild.png (image/png, text), 21.04 KiB.

[7 Jul 9:49] Kristofer Pettersson
Non-linear invalidation graph

Attachment: bild(2).png (image/png, text), 13.28 KiB.

[7 Jul 10:26] Kristofer Pettersson
Query cache free memory graph with spikes

Attachment: bild(3).png (image/png, text), 15.39 KiB.

[7 Jul 10:48] Kristofer Pettersson
select intesity in corralation to free cache memory

Attachment: bild(5).png (image/png, text), 30.64 KiB.

[14 Jul 18:26] Sveta Smirnova
results for 1G qc with 5.1-bugteam

Attachment: 20090714_log_1GB.txt (text/plain), 288.21 KiB.

[14 Jul 18:26] Sveta Smirnova
results for 500MB qc with 5.1-bugteam

Attachment: 20090714_log_500MB.txt (text/plain), 287.99 KiB.

[14 Jul 18:28] Sveta Smirnova
Just tested with 5.1-bugteam tree. Works much better, although has such periods in case of
update:

2797.455322,15728,    ...
2800.457275,0,          ...
2803.459473,14884,     ...
[15 Jul 9:39] Sveta Smirnova
chart with inserts and qc=500 MB

Attachment: inserts_500MB.png (image/png, text), 14.25 KiB.

[15 Jul 9:40] Sveta Smirnova
chart with updates and qc=500 MB

Attachment: updates_500MB.png (image/png, text), 14.74 KiB.

[15 Jul 9:41] Sveta Smirnova
chart with inserts and qc=1 GB

Attachment: inserts_1GB.png (image/png, text), 14.22 KiB.

[15 Jul 9:42] Sveta Smirnova
chart with updates and qc=1 GB

Attachment: updates_1GB.png (image/png, text), 14.25 KiB.

[15 Jul 9:46] Sveta Smirnova
Charts are at:

500MB updates:
http://spreadsheets.google.com/pub?key=tfVT1Ym3da9JHiYDFogMWIQ&oid=1&output=image

500MB inserts:
http://spreadsheets.google.com/pub?key=tcSrFLCG-XKY6kC0Ym2FQ_A&oid=1&output=image

1GB updates:
http://spreadsheets.google.com/pub?key=ti5iYuXGDzpI29eXPQ_4NMQ&oid=1&output=image

1GB inserts:
http://spreadsheets.google.com/pub?key=tAXboRCwk5urcZRYXyUyXaQ&oid=1&output=image
[15 Jul 9:55] Sveta Smirnova
chart with updates and qc=500 MB - better format

Attachment: updates_500MB.png (image/png, text), 19.98 KiB.

[15 Jul 9:56] Sveta Smirnova
chart with updates and qc=1 GB - better format

Attachment: updates_1GB.png (image/png, text), 19.72 KiB.

[15 Jul 9:59] Sveta Smirnova
Problem is some period with 0 selects during 3 seconds.
[15 Jul 9:59] Sveta Smirnova
chart with inserts and qc=500 MB - better format

Attachment: inserts_500MB.png (image/png, text), 20.32 KiB.

[15 Jul 10:00] Sveta Smirnova
chart with inserts and qc=1 GB - better format

Attachment: inserts_1GB.png (image/png, text), 19.65 KiB.

[29 Jul 23:46] Sveta Smirnova
opcontrol output for 500MB query cache

Attachment: 20090729_opgrof_500MB.out.txt (text/plain), 153.42 KiB.

[29 Jul 23:46] Sveta Smirnova
opcontrol output for 1G query cache (+500MB I think)

Attachment: 20090729_opgrof_1G.out.txt (text/plain), 157.95 KiB.

[29 Jul 23:59] Sveta Smirnova
To get oprofile output tested mysql-5.1-bugteam compiled with debug=full. Results are
worse than before for mysql-5.1-bugteam. New results will be uploaded shortly.
[30 Jul 0:00] Sveta Smirnova
results

Attachment: 20090729.results.tar.gz (application/x-gzip, text), 175.80 KiB.

[30 Jul 0:01] Sveta Smirnova
chart with inserts and qc=1 GB

Attachment: 20090729_1G.inserts.png (image/png, text), 19.75 KiB.

[30 Jul 0:02] Sveta Smirnova
chart with updates and qc=1 GB

Attachment: 20090729_1G.updates.png (image/png, text), 20.49 KiB.

[30 Jul 0:02] Sveta Smirnova
chart with inserts and qc=500 MB

Attachment: 20090729_500MB.inserts.png (image/png, text), 20.01 KiB.

[30 Jul 0:03] Sveta Smirnova
chart with updates and qc=500 MB

Attachment: 20090729_500MB.updates.png (image/png, text), 20.06 KiB.

[14 Aug 17:26] Kristofer Pettersson
Progress report.
The invalidation performance is two fold:
1. The invalidation execution path seems slower than it needs to be.
2. As the QC invalidate a subset of queries, all other threads need to wait. This means
that any performance issue in (1) will be multiplied by the number of waiting threads.
The wait is a regression compared to the patch in 21074 as a result of fixing bug43758
which enabled us to refactor the lock API to use a consistent approach to locking the QC.

To find hotspots in the invalidation path I've used dtrace sampling on a test case
invalidating 1 GB of data from a single table. From this I see that my_hash_sort_bin()
and my_hash_delete() are potential bottle necks.

my_hash_sort_bin() is used when calculating the hash for a binary string and the function
used comes from the strings/ directory.

A close inspection on my_hash_delete() using google perf tools doesn't give away too many
clues though (see attachment).

It also seems likely that the linear search performed within each memory bin can be
inefficient with certain load patterns. I wasn't able to show this with my test case
though.

Unfortunately I turns out that gprof doesn't work in macosx and for some reason and Shark
refuses to sample a running mysqld process. I will have to switch to ubuntu to continue
with some alternative profiling.
[14 Aug 17:28] Kristofer Pettersson
google-perftool-on-hash-test

Attachment: my_hash_delete-profile-release-build.pdf (application/pdf, text), 22.79 KiB.

[9 Sep 10:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/82773

3048 Kristofer Pettersson	2009-09-09
      Bug#39253 Large query cache still freezes server after fix for bug #21074
      
      During a cache invalidation the server will be blocked on a mutex waiting for the
query cache to finish.
      This patch reduces the time for this wait by making hash delete operations more
efficient.
     @ sql/qc_hash.cc
        * Implemented new hash table based on chained hash entries and Jenkins hash
function.
         - The hash entry is stored in the Query cache blocks to avoid internal memory
allocations.
         - Delete is now a simple pointer operation instead of a linear search.
         - Hash search reorders hash entries by attempting to move the most accessed
entry first in the chain.
     @ sql/qc_hash.h
        * Implemented new hash table based on chained hash entries and Jenkins hash
function
     @ sql/sql_cache.cc
        * Implemented new hash table based on chained hash entries and Jenkins hash
function
         - Replaced free_query() with dettach() and free_query_internal()
     @ sql/sql_cache.h
        * Implemented new hash table based on chained hash entries and Jenkins hash
function
     @ sql/sql_plist.h
        * back port of double linked list from alazea tree.
[13 Nov 20:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/90392

3192 Kristofer Pettersson	2009-11-13
      Bug#39253 Large query cache still freezes server after fix for bug #21074
      
      When the query cache is fragmented, the size of
      the free block lists in the memory bins grow which
      causes the query cache invalidation to become slow.
      
      Even if this is countered by defragmentation the
      hash_delete() operation still creates a bottle necks
      which can cause the server to hang for a short period
      of time depending on the size of the cache.
      
      This patch introduce a limit on the time the qc can
      block with a lock on SELECTs.
      
      Other operations which causes a change in the table
      data will still be blocked.
     @ sql/sql_cache.cc
        * Introduced a timeout value for the qc lock when entering
send_result_to_client() and store_query() methods.
     @ sql/sql_cache.h
        * New signature for Query_cache::try_lock()
[20 Nov 13:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/91071

3191 Kristofer Pettersson	2009-11-20
      Bug#39253 Large query cache still freezes server after fix for bug #21074
      
      This patch introduce a limit on the time the query cache can
      block with a lock on SELECTs.
      
      Other operations which causes a change in the table
      data will still be blocked.
     @ sql/sql_cache.cc
        * Introduced a timeout value for the qc lock when entering
send_result_to_client()
        and store_query() methods.
     @ sql/sql_cache.h
        * New signature for Query_cache::try_lock()