Bug #28249 Query Cache returns wrong result with concurrent insert / certain lock
Submitted: 4 May 2007 18:01 Modified: 23 Jul 2007 22:23
Reporter: Martin Friebe (Gold Quality Contributor) (SCA)
Status: Closed
Category:Server Severity:S2 (Serious)
Version:5.0.42/5,1 OS:Any
Assigned to: Kristofer Pettersson Target Version:
Tags: wrong result, qcache, query cache, qc

[4 May 2007 18:01] Martin Friebe
Description:
The query cache returns invalid data in some cases.

The problem is to reproduce it, The below works for me (on 5.0.42, not on 4.1).

It does have to do something with the locking of the tables. And therfore reproducability
may depend on the internal table lock order (memory position of the structures), which can
not be forced.
It may or may not help to swap the creation order or the names of the tables....

If instead of using locks, a long running query is used and a concurrent insert happens,
then everything is correct.
I conclude that there is a race condition while the tables are opened.

You will need 3 connections to the server.

- connect 2: the lock on t2, will hold back the select.

However the select allready manages to hold on to t1.
The 1st select does only return 3 rows (once t2 is unlocked), therefore the select must
have opened and locked t1, and therefore the insert is a concurrent insert.

If the select does attempt to lock t2 before t1, then the below will not show the issue.
This can be noted, if the first select returns 4 rows.

- 
If theconcurrent insert has happened, then the 3 rows result goes into the query cache,
even so the table does now have 4 rows.

How to repeat:
set GLOBAL query_cache_type=1;   set GLOBAL query_cache_limit=10000;
set GLOBAL query_cache_min_res_unit=0; set GLOBAL query_cache_size= 100000;

flush tables;
create database new_db; use new_db;

drop table if exists t1, t2;
create table t1 (a int);
create table t2 (a int);
insert into t1 values (1),(2),(3);

# connect 2
use new_db;
lock table t2  write;

# connect 1
## this depends on the order locks are requested.
## this must 1st lock t1, before waiting for t2
## it will then cause a concurrent insert and 
## should return 3 rows
select *, (select count(*) from t2) from t1;

# connect 3
use new_db;
insert into t1 values (4);

# connect 2
unlock tables;

# connect 1
## this should ALL return 4 rows, as theinsert is long finished
select *, (select count(*) from t2) from t1;
reset query cache;
select *, (select count(*) from t2) from t1;

Suggested fix:
My best *guess* is (in sql_parse.cc "case SQLCOM_SELECT" line 2590ff

query_cache_store_query() is called immediadly after open_and_lock_tables().

But if we manage to open and lock one table inside open_and_lock_tables() and then have
to wait for another table, then the query is not yet in the qcache and will not be
invalidated by a concurrent update.

However, since the concurrent update is handled correctly (the 1st select returns nly 3
rows) the table handler must already have made a note of the original table size (and
therefore can ignore the added rows).

It should be possible to check any changes to this table size, imediatly *after*
inserting into the qcache, and if there are changes call the qcache_abort method for this
query.

Doing it after the insert still leaves a race condition, but now the result can only be
that a valid result is not cached, rather than an invalid result being cached. (And i
think this can be accepted, in favour of not needing an additional lock)
[4 May 2007 21:04] Miguel Solorzano
Thank you for the bug report.
[9 Jul 2007 10:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30517

ChangeSet@1.2489, 2007-07-09 10:09:31+02:00, thek@adventure.(none) +5 -0
  Bug#28249 Query Cache returns wrong result with concurrent insert / certain lock
  
  A race condition in the integration between MyISAM and the query cache code 
  caused the query cache to fail to invalidate itself on concurrently inserted
  data.
  
  This patch fix this problem by using the existing handler interface which, upon
  each statement cache attempt, compare the size of the table as viewed from the 
  cache writing thread and with any snap shot of the global table state. If the
  two sizes are different the global table size is unknown and the current
  statement can't be cached.
[11 Jul 2007 8:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30651

ChangeSet@1.2489, 2007-07-11 08:46:56+02:00, thek@adventure.(none) +5 -0
  Bug#28249 Query Cache returns wrong result with concurrent insert / certain lock
  
  A race condition in the integration between MyISAM and the query cache code 
  caused the query cache to fail to invalidate itself on concurrently inserted
  data.
  
  This patch fix this problem by using the existing handler interface which, upon
  each statement cache attempt, compare the size of the table as viewed from the 
  cache writing thread and with any snap shot of the global table state. If the
  two sizes are different the global table size is unknown and the current
  statement can't be cached.
[11 Jul 2007 15:52] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30690

ChangeSet@1.2489, 2007-07-11 15:52:23+02:00, thek@adventure.(none) +5 -0
  Bug#28249 Query Cache returns wrong result with concurrent insert / certain lock
  
  A race condition in the integration between MyISAM and the query cache code 
  caused the query cache to fail to invalidate itself on concurrently inserted
  data.
  
  This patch fix this problem by using the existing handler interface which, upon
  each statement cache attempt, compare the size of the table as viewed from the 
  cache writing thread and with any snap shot of the global table state. If the
  two sizes are different the global table size is unknown and the current
  statement can't be cached.
[17 Jul 2007 17:29] Bugs System
Pushed into 5.0.48
[17 Jul 2007 17:30] Bugs System
Pushed into 5.1.21-beta
[23 Jul 2007 22:23] Paul DuBois
Noted in 5.0.48, 5.1.21 changelogs.

A race condition in the interaction between MyISAM and the query
cache code caused the query cache not to invalidate itself for
concurrently inserted data.