Bug #39249 Maria: query cache returns out-of-date results
Submitted: 4 Sep 2008 15:25 Modified: 26 May 2010 17:50
Reporter: Guilhem Bichot Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Maria storage engine Severity:S3 (Non-critical)
Version:5.1-maria, 6.0-bk OS:Linux
Assigned to: CPU Architecture:Any

[4 Sep 2008 15:25] Guilhem Bichot
Description:
This problem happens when running:
 ./mtr --mem query_cache --mysqld=--default-storage-engine=maria:
@@ -1641,7 +1641,6 @@
 1	0
 2	0
 3	0
-4	0
 reset query cache;
 select *, (select count(*) from t2) from t1;
 a	(select count(*) from t2)

so don't forget to run this test with the above mtr options before closing the bug.
Note that the relevant test part starts with 
Bug#28249 Query Cache returns wrong result with concurrent insert/ certain lock
so maybe we wrongly merged the fix of BUG#28249 from MyISAM to Maria??

How to repeat:
Here is a testcase extracted from query_cache.test:
--echo Bug#28249 Query Cache returns wrong result with concurrent insert/ certain lock
connect (user1,localhost,root,,test,,);
connect (user2,localhost,root,,test,,);
connect (user3,localhost,root,,test,,);

connection user1;

set GLOBAL query_cache_type=1;
set GLOBAL query_cache_limit=10000;
set GLOBAL query_cache_min_res_unit=0;
set GLOBAL query_cache_size= 100000;

flush tables;
--disable_warnings
drop table if exists t1, t2;
--enable_warnings
create table t1 (a int);
create table t2 (a int);
insert into t1 values (1),(2),(3);
connection user2;
--echo Locking table T2 with a write lock.
lock table t2 write;

connection user1;
--echo Select blocked by write lock.
--send select *, (select count(*) from t2) from t1;
--echo Sleeing is ok, because selecting should be done very fast.
sleep 5;

connection user3;
--echo Inserting into table T1.
insert into t1 values (4);

connection user2;
--echo Unlocking the tables.
unlock tables;

connection user1;
--echo Collecting result from previously blocked select.
# 
# Since the lock ordering rule in thr_multi_lock depends on
# pointer values, from execution to execution we might have
# different lock order, and therefore, sometimes lock t1 and block
# on t2, and sometimes block on t2 right away. In the second case,
# the following insert succeeds, and only then this select can
# proceed, and we actually test nothing, as the very first select
# returns 4 rows right away.
# It's fine to have a test case that covers the problematic area
# at least once in a while.
# We, however, need to disable the result log here to make the
# test repeatable.
--disable_result_log
--reap
--enable_result_log
--echo Next select should contain 4 rows, as the insert is long finished.
sleep 5;
select *, (select count(*) from t2) from t1;
select *, (select count(*) from t2) from t1;
select *, (select count(*) from t2) from t1;
reset query cache;
select *, (select count(*) from t2) from t1;

I see that the 3 SELECTs before "reset query cache" return (1,0),(2,0),(3,0), which is wrong: they should additionally show (4,0) like the last SELECT does.
Indeed: note that these 3 SELECTs run 5 seconds after the INSERT returned ok (INSERT returned ok => it committed => it unlocked table => 5 seconds after that, its effects should certainly be visible to "user1").
Note that "user1" is not using LOCK TABLES, so it's not a question of "user1" normally seeing table as it was at LOCK TABLES.
Query cache probably keeps too old data.
[13 Oct 2008 13:06] Georgi Kodinov
I've seen this while testing in 6.0-bugteam : 
main.ps_not_windows            [ pass ]             21
main.query_cache               [ fail ]

--- /home/kgeorge/mysql/bzr/merge-6.0-bugteam/mysql-test/r/query_cache.result  2008-10-13 13:27:43.000000000 +0300
+++ /home/kgeorge/mysql/bzr/merge-6.0-bugteam/mysql-test/r/query_cache.reject  2008-10-13 16:04:57.000000000 +0300
@@ -1585,14 +1585,12 @@
 1      0
 2      0
 3      0
-4      0
 reset query cache;
 select *, (select count(*) from t2) from t1;
 a      (select count(*) from t2)
 1      0
 2      0
 3      0
-4      0
 drop table t1,t2;
 #
 # Bug#25132 disabled query cache: Qcache_free_blocks = 1

mysqltest: Result content mismatch

Stopping All Servers
Restoring snapshot of databases
Saving core.18412
Saving core.18440
Saving core.18531
Saving core.18558
Resuming Tests

main.query_cache_debug         [ skipped ]   Test needs 'big-test' opt
[15 Oct 2008 9:24] Kristofer Pettersson
This is what I think is the situation:

Under open_and_lock_tables, in execute_sqlcom_select the query cache is asked to attach a result set writer intercepting the current statement. Before this is allowed a call to ha_myisam::register_query_cache_table is issued to assert that the current statement can be cached.

In this test case either a concurrent insert or a regular insert has happened depending on the order locks were acquired on t1, t2. In the case of concurrent insert, 'register_query_cache_table' should return false to indicate that the statement shouldn't be cached. In the second case, insert operation should have been visible to the select operation waiting on the write lock and the result should have been the same.

Could you verify that the code in ha_myisam::register_query_cache_table is correct for getting a consistent snap shot on the state of myisam which can be used in this context?

If the error on the other hand happens on a regular insert with a select hanging on lock tables, then maybe the qc api has a more serious flaw, but I don't think this is the case.

Do you agree?
[21 Nov 2008 18:38] Guilhem Bichot
In BUG#40179, query_cache.test fails in default configuration (i.e. using MyISAM, not Maria), with exactly the same output (same missing line at line 1641):

8085: @@ -1641,7 +1641,6 @@
38086:  1	0
38087:  2	0
38088:  3	0
38089: -4	0
38090:  reset query cache;
38091:  select *, (select count(*) from t2) from t1;
38092:  a	(select count(*) from t2)

as in the Description of the present BUG#39249.
So I'm changing category to Server:QueryCache, marking it as duplicate of BUG#40179 and de-assigning it from Sanja.
[1 Apr 2009 9:39] Guilhem Bichot
http://bugs.mysql.com/bug.php?id=41098 and
http://bugs.mysql.com/bug.php?id=40179
have been fixed, but only for MyISAM. The bugfix revision
davi.arnaut@sun.com-20090219210935-9vilvcisyyieffxl has been merged into 5.1-maria but not propagated to storage/maria as it was not trivial to do (in Maria there are several ma*get_status() functions unlike MyISAM).
So this bug still exists (at least I can repeat the problem of the reduced test case). We should propagate Davi's fix, and see if it fixes all testcases (including row_format=page|dynamic).