Bug #28249 Query Cache returns wrong result with concurrent insert / certain lock
Submitted: 4 May 2007 16:01 Modified: 23 Jul 2007 20:23
Reporter: Martin Friebe (Gold Quality Contributor) (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.42/5,1 OS:Any
Assigned to: Kristofer Pettersson CPU Architecture:Any
Tags: qc, qcache, query cache, wrong result

[4 May 2007 16:01] Martin Friebe
Description:
The query cache returns invalid data in some cases.

The problem is to reproduce it, The below works for me (on 5.0.42, not on 4.1).

It does have to do something with the locking of the tables. And therfore reproducability may depend on the internal table lock order (memory position of the structures), which can not be forced.
It may or may not help to swap the creation order or the names of the tables....

If instead of using locks, a long running query is used and a concurrent insert happens, then everything is correct.
I conclude that there is a race condition while the tables are opened.

You will need 3 connections to the server.

- connect 2: the lock on t2, will hold back the select.

However the select allready manages to hold on to t1.
The 1st select does only return 3 rows (once t2 is unlocked), therefore the select must have opened and locked t1, and therefore the insert is a concurrent insert.

If the select does attempt to lock t2 before t1, then the below will not show the issue. This can be noted, if the first select returns 4 rows.

- 
If theconcurrent insert has happened, then the 3 rows result goes into the query cache, even so the table does now have 4 rows.

How to repeat:
set GLOBAL query_cache_type=1;   set GLOBAL query_cache_limit=10000;
set GLOBAL query_cache_min_res_unit=0; set GLOBAL query_cache_size= 100000;

flush tables;
create database new_db; use new_db;

drop table if exists t1, t2;
create table t1 (a int);
create table t2 (a int);
insert into t1 values (1),(2),(3);

# connect 2
use new_db;
lock table t2  write;

# connect 1
## this depends on the order locks are requested.
## this must 1st lock t1, before waiting for t2
## it will then cause a concurrent insert and 
## should return 3 rows
select *, (select count(*) from t2) from t1;

# connect 3
use new_db;
insert into t1 values (4);

# connect 2
unlock tables;

# connect 1
## this should ALL return 4 rows, as theinsert is long finished
select *, (select count(*) from t2) from t1;
reset query cache;
select *, (select count(*) from t2) from t1;

Suggested fix:
My best *guess* is (in sql_parse.cc "case SQLCOM_SELECT" line 2590ff

query_cache_store_query() is called immediadly after open_and_lock_tables().

But if we manage to open and lock one table inside open_and_lock_tables() and then have to wait for another table, then the query is not yet in the qcache and will not be invalidated by a concurrent update.

However, since the concurrent update is handled correctly (the 1st select returns nly 3 rows) the table handler must already have made a note of the original table size (and therefore can ignore the added rows).

It should be possible to check any changes to this table size, imediatly *after* inserting into the qcache, and if there are changes call the qcache_abort method for this query.

Doing it after the insert still leaves a race condition, but now the result can only be that a valid result is not cached, rather than an invalid result being cached. (And i think this can be accepted, in favour of not needing an additional lock)
[4 May 2007 19:04] MySQL Verification Team
Thank you for the bug report.
[9 Jul 2007 8:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30517

ChangeSet@1.2489, 2007-07-09 10:09:31+02:00, thek@adventure.(none) +5 -0
  Bug#28249 Query Cache returns wrong result with concurrent insert / certain lock
  
  A race condition in the integration between MyISAM and the query cache code 
  caused the query cache to fail to invalidate itself on concurrently inserted
  data.
  
  This patch fix this problem by using the existing handler interface which, upon
  each statement cache attempt, compare the size of the table as viewed from the 
  cache writing thread and with any snap shot of the global table state. If the
  two sizes are different the global table size is unknown and the current
  statement can't be cached.
[11 Jul 2007 6:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30651

ChangeSet@1.2489, 2007-07-11 08:46:56+02:00, thek@adventure.(none) +5 -0
  Bug#28249 Query Cache returns wrong result with concurrent insert / certain lock
  
  A race condition in the integration between MyISAM and the query cache code 
  caused the query cache to fail to invalidate itself on concurrently inserted
  data.
  
  This patch fix this problem by using the existing handler interface which, upon
  each statement cache attempt, compare the size of the table as viewed from the 
  cache writing thread and with any snap shot of the global table state. If the
  two sizes are different the global table size is unknown and the current
  statement can't be cached.
[11 Jul 2007 13:52] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30690

ChangeSet@1.2489, 2007-07-11 15:52:23+02:00, thek@adventure.(none) +5 -0
  Bug#28249 Query Cache returns wrong result with concurrent insert / certain lock
  
  A race condition in the integration between MyISAM and the query cache code 
  caused the query cache to fail to invalidate itself on concurrently inserted
  data.
  
  This patch fix this problem by using the existing handler interface which, upon
  each statement cache attempt, compare the size of the table as viewed from the 
  cache writing thread and with any snap shot of the global table state. If the
  two sizes are different the global table size is unknown and the current
  statement can't be cached.
[17 Jul 2007 15:29] Bugs System
Pushed into 5.0.48
[17 Jul 2007 15:30] Bugs System
Pushed into 5.1.21-beta
[23 Jul 2007 20:23] Paul DuBois
Noted in 5.0.48, 5.1.21 changelogs.

A race condition in the interaction between MyISAM and the query
cache code caused the query cache not to invalidate itself for
concurrently inserted data.