Bug #25798 Incorrect results with the index merge union access algorithm and InnoDB tables
Submitted: 23 Jan 2007 19:19 Modified: 9 Jul 2007 0:58
Reporter: Didier Neisen Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.33, 5.0bk, 5.1bk OS:Linux (Linux)
Assigned to: Igor Babaev CPU Architecture:Any
Tags: innodb, Optimizer

[23 Jan 2007 19:19] Didier Neisen
Description:
The index merge union access algorithm could produce wrong results with InnoDB tables. This bug disappeared with the MyISAM storage engine.

The same query could produce differents results on MySQL 4.X and 5.0 (see the query below).

I did the test with MySQL 5.0.22, 5.0.26 and 5.0.33. They are all affected by this bug.

How to repeat:
CREATE TABLE `test` (
  `id` int(11) NOT NULL auto_increment,
  `b` int(11) NOT NULL,
  `c` datetime NOT NULL,
  PRIMARY KEY  (`id`),
  KEY `b` (`b`),
  KEY `c` (`c`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

I use the following perl script to load the database :

#! /usr/bin/perl

use strict;

foreach my $i (1..50000) {
    print "INSERT INTO test (b,c) VALUES ($i, '2007-01-01');\n";
}

foreach my $i (50001..100000) {
    print "INSERT INTO test (b,c) VALUES ($i, '2007-01-02');\n";
}

foreach my $i (100001..150000) {
    print "INSERT INTO test (b,c) VALUES ($i, '2007-01-03');\n";
}

* Without the index merge union access algorithm

mysql> EXPLAIN SELECT SQL_NO_CACHE COUNT(*) FROM test WHERE (c >= '2007-01-02' AND c <= '2007-01-03') OR b >= 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
         type: ALL
possible_keys: b,c
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 150533
        Extra: Using where
1 row in set (0.00 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM test WHERE (c >= '2007-01-02' AND c <= '2007-01-03') OR b >= 1;
+----------+
| COUNT(*) |
+----------+
|   150000 |
+----------+
1 row in set (0.50 sec)

* With the index merge union access algorithm

mysql> EXPLAIN SELECT SQL_NO_CACHE COUNT(*) FROM test force index (b,c) WHERE (c >= '2007-01-02' AND c <= '2007-01-03') OR b >= 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
         type: index_merge
possible_keys: b,c
          key: c,b
      key_len: 8,4
          ref: NULL
         rows: 150532
        Extra: Using sort_union(c,b); Using where
1 row in set (0.09 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM test force index (b,c) WHERE (c >= '2007-01-02' AND c <= '2007-01-03') OR b >= 1;
+----------+
| COUNT(*) |
+----------+
|   250000 |
+----------+
1 row in set (4.17 sec)

I use the 'force index' hint in order to force the optimizer to choose the sort_union algorithm. With some workload this algorithm could be choosed without the need to use the 'force index' hint.

When i switch to MyISAM this bug disappeared.

mysql> ALTER TABLE test ENGINE=MyISAM;
Query OK, 150000 rows affected (4.04 sec)
Records: 150000  Duplicates: 0  Warnings: 0

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM test force index (b,c) WHERE (c >= '2007-01-02' AND c <= '2007-01-03') OR b >= 1;
+----------+
| COUNT(*) |
+----------+
|   150000 |
+----------+
1 row in set (2.32 sec)

Suggested fix:
I don't know if this is an InnoDB or optimizer bug.
[26 Jan 2007 12:05] Hartmut Holzgraefe
mysqltest test case

Attachment: bug25798.tgz (application/x-gtar, text), 983 bytes.

[22 May 2007 16:11] Heikki Tuuri
I assign this difficult bug to Inaam. The task is to find out if it is the MySQL SQL interpreter that makes the error, or InnoDB.
--Heikki
[4 Jun 2007 22:42] Inaam Rana
This is a bug in MySQL code. The merge_buffers() [file: sql/filesort.cc] uses a heap of buffers for merging. This heap is initialized as:

1068   if (init_queue(&queue, (uint) (Tb-Fb)+1, offsetof(BUFFPEK,key), 0,
1069                  (queue_compare) (cmp= get_ptr_compare(sort_length)),
1070                  (void*) &sort_length))

The bug is that MySQL uses standard binary string comparison function. Note that, for example, in InnoDB the value of ref field in the handler is the primary key value in MySQL format. This implies that in case of integer data, ref will hold the integer value in little endian format. Binary string comparisons won't work properly on that data.
A proposed solution is to use handler::cmp_ref() function.
[1 Jul 2007 22:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30046

ChangeSet@1.2513, 2007-07-01 15:33:28-07:00, igor@olga.mysql.com +5 -0
  Fixed bug #25798.
  This bug may manifest itself not only with the queries for which
  the index-merge access method is chosen. It also may display
  itself for queries with DISTINCT.
  
  The bug was in how the Unique::get method used the merge_buffers
  function. To compare elements in the the queue employed by
  merge_buffers() it must use the buffpek_compare function rather
  than the function for binary comparison.
[8 Jul 2007 17:28] Bugs System
Pushed into 5.1.21-beta
[8 Jul 2007 17:30] Bugs System
Pushed into 5.0.46
[9 Jul 2007 0:58] Paul DuBois
Noted in 5.0.46, 5.1.21 changelogs.

The index merge union access algorithm could produce incorrect
results with InnoDB tables. The problem could also occur for queries
that used DISTINCT.