Bug #41204 BKA returns errouneous result form some queries and some join_cache_levels
Submitted: 3 Dec 2008 15:04 Modified: 22 Nov 2010 0:32
Reporter: Bernt Marius Johnsen Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:mysql-6.0-bka-preview OS:Linux
Assigned to: Igor Babaev CPU Architecture:Any

[3 Dec 2008 15:04] Bernt Marius Johnsen
Description:
Running with DBT3 scale factor 10 and MyISAM the following query:

select n_name, ps_partkey, sum(ps_supplycost * ps_availqty) as value1
  from partsupp, nation
    where n_regionkey in 
            (select r_regionkey from region where r_name='ASIA') and
          (ps_suppkey, n_nationkey) in 
             (select s_suppkey, s_nationkey from supplier where s_acctbal > 5000)
group by ps_partkey
  having sum(ps_supplycost * ps_availqty) >
         (
           select sum(ps_supplycost * ps_availqty) * 0.0001/10
             from partsupp, nation
               where n_regionkey in 
                       (select r_regionkey from region where r_name='ASIA') and
                     (ps_suppkey, n_nationkey) in 
                       (select s_suppkey, s_nationkey from supplier 
                          where s_acctbal > 5000)
         )
order by value1 desc
limit 5;

With join_cache_level=default and join_buffer_size=default (actually for all values in the range 0-4) I get:

+-----------+------------+-------------+
| n_name    | ps_partkey | value1      |
+-----------+------------+-------------+
| INDIA     |     670669 | 24112798.53 | 
| CHINA     |     782578 | 23558222.76 | 
| INDONESIA |    1146607 | 22886320.50 | 
| INDIA     |    1584214 | 22418222.84 | 
| INDONESIA |     234575 | 22160293.87 | 
+-----------+------------+-------------+

With join_cache_levels above 4 I start to get different results.

E.g. join_cache_level=5 and join_buffer_size=default gives

+-----------+------------+-------------+
| n_name    | ps_partkey | value1      |
+-----------+------------+-------------+
| INDIA     |     670669 | 24112798.53 | 
| CHINA     |     782578 | 23558222.76 | 
| INDONESIA |    1146607 | 22886320.50 | 
| INDONESIA |    1584214 | 22418222.84 | 
| INDONESIA |     234575 | 22160293.87 | 
+-----------+------------+-------------+

While join_cache_level=5 and join_buffer_size=1024*1024*64 gives

+-----------+------------+-------------+
| n_name    | ps_partkey | value1      |
+-----------+------------+-------------+
| INDIA     |     670669 | 24112798.53 | 
| CHINA     |     782578 | 23558222.76 | 
| INDONESIA |    1146607 | 22886320.50 | 
| INDONESIA |    1584214 | 22418222.84 | 
| VIETNAM   |     234575 | 22160293.87 | 
+-----------+------------+-------------+

How to repeat:
Establish DBT3 scale factor 10 with MyISAM and run the given query with different values for join_cache_level and join_buffer_size
[3 Dec 2008 15:14] Bernt Marius Johnsen
The query-plans

Attachment: bka-qp.txt (text/plain), 9.04 KiB.

[4 Dec 2008 6:42] Valeriy Kravchuk
Thank you for a problem report. Can you, please, upload a dump of tables used or just *.frm, *.MYD and *.MYI files (compressed) for them.
[4 Dec 2008 6:45] Igor Babaev
According to the SQL standard this query is not a valid query with a GROUP BY clause. It is detected as an invalid query in the strict mode for GROUP BY queries (set sql_mode=ONLY_FULL_GROUP_BY).
The query is not valid because it uses n_name in the select list which is not included in GROUP_BY list.
As we may have different value of n_name for the same group the result set depends on the algorithm we use. Different algorithms generate the rows that are passed for sorting performed by the GROUP_BY operation in a different order.
To make the query valid with the result set independent on the employed algorithm it's enough to include n_name into the GROUP_BY list.
[4 Dec 2008 11:49] Bernt Marius Johnsen
By adding n_name to the GROUP by clause, the query behaved correctly for join_cache_level=5 and also for 7 & 8. But for join_cache_level=6 and join_buffer_size=1024 it is still wrong, and suspicously fast:

mysql> set join_cache_level=0;
Query OK, 0 rows affected (0.00 sec)

mysql> select n_name, ps_partkey, sum(ps_supplycost * ps_availqty) as value1   from partsupp, nation     where n_regionkey in              (select r_regionkey from region where r_name='ASIA') and           (ps_suppkey, n_nationkey) in               (select s_suppkey, s_nationkey from supplier where s_acctbal > 5000) group by n_name, ps_partkey   having sum(ps_supplycost * ps_availqty) >          (            select sum(ps_supplycost * ps_availqty) * 0.0001/10              from partsupp, nation                where n_regionkey in                         (select r_regionkey from region where r_name='ASIA') and                      (ps_suppkey, n_nationkey) in                         (select s_suppkey, s_nationkey from supplier                            where s_acctbal > 5000)          ) order by value1 desc limit 5;
+-----------+------------+-------------+
| n_name    | ps_partkey | value1      |
+-----------+------------+-------------+
| INDONESIA |     900681 | 19682309.49 | 
| CHINA     |     519225 | 18978987.84 | 
| JAPAN     |    1951485 | 18962220.53 | 
| CHINA     |    1473516 | 18942970.00 | 
| JAPAN     |     178220 | 18604710.90 | 
+-----------+------------+-------------+
5 rows in set (1 min 21.16 sec)

mysql> set join_cache_level=6;
Query OK, 0 rows affected (0.00 sec)

mysql> set join_buffer_size=1024;
Query OK, 0 rows affected (0.00 sec)

mysql> select n_name, ps_partkey, sum(ps_supplycost * ps_availqty) as value1   from partsupp, nation     where n_regionkey in              (select r_regionkey from region where r_name='ASIA') and           (ps_suppkey, n_nationkey) in               (select s_suppkey, s_nationkey from supplier where s_acctbal > 5000) group by n_name, ps_partkey   having sum(ps_supplycost * ps_availqty) >          (            select sum(ps_supplycost * ps_availqty) * 0.0001/10              from partsupp, nation                where n_regionkey in                         (select r_regionkey from region where r_name='ASIA') and                      (ps_suppkey, n_nationkey) in                         (select s_suppkey, s_nationkey from supplier                            where s_acctbal > 5000)          ) order by value1 desc limit 5;
+---------+------------+------------+
| n_name  | ps_partkey | value1     |
+---------+------------+------------+
| VIETNAM |     549154 | 9866208.80 | 
| VIETNAM |      24119 | 9569399.58 | 
| VIETNAM |     897421 | 9486971.32 | 
| VIETNAM |      48927 | 9426352.34 | 
| VIETNAM |     349971 | 9335125.80 | 
+---------+------------+------------+
5 rows in set (0.16 sec)

mysql> set join_buffer_size=1024*128;
Query OK, 0 rows affected (0.00 sec)

mysql> select n_name, ps_partkey, sum(ps_supplycost * ps_availqty) as value1   from partsupp, nation     where n_regionkey in              (select r_regionkey from region where r_name='ASIA') and           (ps_suppkey, n_nationkey) in               (select s_suppkey, s_nationkey from supplier where s_acctbal > 5000) group by n_name, ps_partkey   having sum(ps_supplycost * ps_availqty) >          (            select sum(ps_supplycost * ps_availqty) * 0.0001/10              from partsupp, nation                where n_regionkey in                         (select r_regionkey from region where r_name='ASIA') and                      (ps_suppkey, n_nationkey) in                         (select s_suppkey, s_nationkey from supplier                            where s_acctbal > 5000)          ) order by value1 desc limit 5;
+-----------+------------+-------------+
| n_name    | ps_partkey | value1      |
+-----------+------------+-------------+
| INDONESIA |     900681 | 19682309.49 | 
| CHINA     |     519225 | 18978987.84 | 
| JAPAN     |    1951485 | 18962220.53 | 
| CHINA     |    1473516 | 18942970.00 | 
| JAPAN     |     178220 | 18604710.90 | 
+-----------+------------+-------------+
5 rows in set (1 min 28.00 sec)
[7 Dec 2008 20:27] Igor Babaev
In debug version we an assertion abort instead of wrong results.

The bug can be reproduced with a simple test case as follows:

mysql> CREATE TABLE t1 (a int);
Query OK, 0 rows affected (0.07 sec)

mysql>
mysql> INSERT INTO t1 VALUES (0);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO t1(a) SELECT a FROM t1;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t1(a) SELECT a FROM t1;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t1(a) SELECT a FROM t1;
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t1(a) SELECT a FROM t1;
Query OK, 8 rows affected (0.00 sec)
Records: 8  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t1(a) SELECT a FROM t1;
Query OK, 16 rows affected (0.00 sec)
Records: 16  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t1(a) SELECT a FROM t1;
Query OK, 32 rows affected (0.00 sec)
Records: 32  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t1(a) SELECT a FROM t1;
Query OK, 64 rows affected (0.00 sec)
Records: 64  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t1(a) SELECT a FROM t1;
Query OK, 128 rows affected (0.01 sec)
Records: 128  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t1(a) SELECT a FROM t1;
Query OK, 256 rows affected (0.00 sec)
Records: 256  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t1(a) SELECT a FROM t1;
Query OK, 512 rows affected (0.01 sec)
Records: 512  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t1(a) SELECT a FROM t1;
Query OK, 1024 rows affected (0.01 sec)
Records: 1024  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t1 VALUES (20000), (10000);
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql>
mysql> CREATE TABLE t2 (pk int AUTO_INCREMENT PRIMARY KEY, b int, c int, INDEX idx(b));
Query OK, 0 rows affected (0.02 sec)

mysql> INSERT INTO t2(b,c) VALUES (10000, 3), (20000, 7), (20000, 1), (10000, 9), (20000, 5);
Query OK, 5 rows affected (0.01 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t2(b,c) SELECT b,c FROM t2;
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t2(b,c) SELECT b,c FROM t2;
Query OK, 10 rows affected (0.00 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t2(b,c) SELECT b,c FROM t2;
Query OK, 20 rows affected (0.00 sec)
Records: 20  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t2(b,c) SELECT b,c FROM t2;
Query OK, 40 rows affected (0.01 sec)
Records: 40  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t2(b,c) SELECT b,c FROM t2;
Query OK, 80 rows affected (0.01 sec)
Records: 80  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t2(b,c) SELECT b,c FROM t2;
Query OK, 160 rows affected (0.02 sec)
Records: 160  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t2(b,c) SELECT b,c FROM t2;
Query OK, 320 rows affected (0.04 sec)
Records: 320  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t2(b,c) SELECT b,c FROM t2;
Query OK, 640 rows affected (0.09 sec)
Records: 640  Duplicates: 0  Warnings: 0

mysql>
mysql> ANALYZE TABLE t1,t2;
+---------+---------+----------+----------+
| Table   | Op      | Msg_type | Msg_text |
+---------+---------+----------+----------+
| test.t1 | analyze | status   | OK       |
| test.t2 | analyze | status   | OK       |
+---------+---------+----------+----------+
2 rows in set (0.00 sec)

mysql>
mysql> set join_cache_level=6;
Query OK, 0 rows affected (0.00 sec)

mysql> set join_buffer_size=1024;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql> EXPLAIN SELECT AVG(c) FROM t1,t2 WHERE t1.a=t2.b;
+----+-------------+-------+------+---------------+------+---------+-----------+------+-------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref       | rows | Extra             |
+----+-------------+-------+------+---------------+------+---------+-----------+------+-------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL      | 2050 |                   |
|  1 | SIMPLE      | t2    | ref  | idx           | idx  | 5       | test.t1.a |  640 | Using join buffer |
+----+-------------+-------+------+---------------+------+---------+-----------+------+-------------------+
2 rows in set (0.01 sec)

mysql> SELECT AVG(c) FROM t1,t2 WHERE t1.a=t2.b;
ERROR 2013 (HY000): Lost connection to MySQL server during query
[7 Dec 2008 20:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/60837

2669 Igor Babaev	2008-12-07
      Fixed bug #41204.
      An assertion abort occurred for a join query when a small size of 
      the join buffer was set and the value of record_per_key for the
      index used for a ref access with this join buffer was big enough.
[14 Dec 2008 11:07] Bugs System
Pushed into 6.0.8-alpha  (revid:igor@mysql.com-20081207204424-5zlp32e02cr3ophk) (version source revid:igor@mysql.com-20081207204424-5zlp32e02cr3ophk) (pib:5)
[14 Jan 2009 0:34] Paul DuBois
Correction: This is pushed into 6.0.9.
[14 Jan 2009 1:05] Paul DuBois
Noted in 6.0.9 changelog.

An assertion failure occurred for a join query when a small size of
the join buffer was set and the value of record_per_key for the index
used for a ref access with this join buffer was big enough.
[16 Aug 2010 6:42] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:27] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[22 Nov 2010 0:32] Paul DuBois
Noted in 5.6.1 changelog.
[23 Nov 2010 2:16] Paul DuBois
Correction: No 5.6.1 changelog entry. Bug does not appear in any released 5.6.x version.