Bug #51017 Batched key access gives wrong results for SELECT with LEFT JOIN, GROUP BY
Submitted: 9 Feb 2010 9:07 Modified: 13 Feb 2010 19:12
Reporter: Guilhem Bichot Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:6.0-codebase-bugfixing OS:Any
Assigned to: Evgeny Potemkin CPU Architecture:Any
Tags: BKA, join_cache_level, optimizer_switch

[9 Feb 2010 9:07] Guilhem Bichot
Description:
revision-id: tor.didriksen@sun.com-20100208113127-g8wvxcpc2dppmxxd

create table t1 (libname varchar(21) not null, city text, primary key (libname));
create table t2 (isbn varchar(21) not null, author text, title text, primary key (isbn));
create table t3 (isbn varchar(21) not null, libname varchar(21) not null, quantity int ,primary key (isbn,libname));
insert into t2 values ('001','Daffy','A duck''s life');
insert into t2 values ('002','Bugs','A rabbit\'s life');
insert into t2 values ('003','Cowboy','Life on the range');
insert into t2 values ('000','Anonymous','Wanna buy this book?');
insert into t2 values ('004','Best Seller','One Heckuva book');
insert into t2 values ('005','EveryoneBuys','This very book');
insert into t2 values ('006','San Fran','It is a san fran lifestyle');
insert into t2 values ('007','BerkAuthor','Cool.Berkley.the.book');
insert into t3 values('000','New York Public Libra','1');
insert into t3 values('001','New York Public Libra','2');
insert into t3 values('002','New York Public Libra','3');
insert into t3 values('003','New York Public Libra','4');
insert into t3 values('004','New York Public Libra','5');
insert into t3 values('005','New York Public Libra','6');
insert into t3 values('006','San Fransisco Public','5');
insert into t3 values('007','Berkeley Public1','3');
insert into t3 values('007','Berkeley Public2','3');
insert into t3 values('001','NYC Lib','8');
insert into t1 values ('New York Public Libra','New York');
insert into t1 values ('San Fransisco Public','San Fran');
insert into t1 values ('Berkeley Public1','Berkeley');
insert into t1 values ('Berkeley Public2','Berkeley');
insert into t1 values ('NYC Lib','New York');
select t2.isbn,city,t1.libname,count(t1.libname) as a from t3 left join t1 on t3.libname=t1.libname left join t2 on t3.isbn=t2.isbn group by city,t1.libname;
select t2.isbn,city,t1.libname,count(distinct t1.libname) as a from t3 left join t1 on t3.libname=t1.libname left join t2 on t3.isbn=t2.isbn group by city having count(distinct t1.libname) > 1;
select t2.isbn,city,t1.libname,count(distinct t1.libname) as a from t3 left join t1 on t3.libname=t1.libname left join t2 on t3.isbn=t2.isbn group by city having count(distinct concat(t1.libname,'a')) > 1;
select t2.isbn,city,@bar:=t1.libname,count(distinct t1.libname) as a
from t3 left join t1 on t3.libname=t1.libname left join t2
on t3.isbn=t2.isbn group by city having count(distinct
t1.libname) > 1;
select t2.isbn,city,concat(@bar:=t1.libname),count(distinct t1.libname) as a
from t3 left join t1 on t3.libname=t1.libname left join t2
on t3.isbn=t2.isbn group by city having count(distinct
t1.libname) > 1;
drop table t1,t2,t3;

Diff in the output of SELECTs between running with defaults (correct results) and running with --optimizer_join_cache_level=X where X>=7:

@@ -28,29 +28,29 @@
 isbn	city	libname	a
 007	Berkeley	Berkeley Public1	1
 007	Berkeley	Berkeley Public2	1
-000	New York	New York Public Libra	6
+001	New York	New York Public Libra	6

and similar differences afterwards.

Goes away with optimizer_join_cache_level<=6.
Bug found by running main.count_distinct with --optimizer_join_cache_level=8.

How to repeat:
see description
[9 Feb 2010 9:36] Sveta Smirnova
Thank you for the report.

Verified as described.
[13 Feb 2010 19:12] Evgeny Potemkin
The difference is in the 'isbn' field. It's a hidden field as it isn't under an aggregate function, nor it's present in the GROUP BY clause. Any particular value of such field isn't guaranteed (our manual states this). Since access method is changed the result value could change also.
Adding this field to GROUP BY list makes server to produce same result with and without BKA enabled.