MySQL Bugs: #25798: Incorrect results with the index merge union access algorithm and InnoDB tables

Bug #25798	Incorrect results with the index merge union access algorithm and InnoDB tables
Submitted:	23 Jan 2007 19:19	Modified:	9 Jul 2007 0:58
Reporter:	Didier Neisen	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server	Severity:	S2 (Serious)
Version:	5.0.33, 5.0bk, 5.1bk	OS:	Linux (Linux)
Assigned to:	Igor Babaev	CPU Architecture:	Any
Tags:	innodb, Optimizer

Description:
The index merge union access algorithm could produce wrong results with InnoDB tables. This bug disappeared with the MyISAM storage engine.

The same query could produce differents results on MySQL 4.X and 5.0 (see the query below).

I did the test with MySQL 5.0.22, 5.0.26 and 5.0.33. They are all affected by this bug.

How to repeat:
CREATE TABLE `test` (
  `id` int(11) NOT NULL auto_increment,
  `b` int(11) NOT NULL,
  `c` datetime NOT NULL,
  PRIMARY KEY  (`id`),
  KEY `b` (`b`),
  KEY `c` (`c`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

I use the following perl script to load the database :

#! /usr/bin/perl

use strict;

foreach my $i (1..50000) {
    print "INSERT INTO test (b,c) VALUES ($i, '2007-01-01');\n";
}

foreach my $i (50001..100000) {
    print "INSERT INTO test (b,c) VALUES ($i, '2007-01-02');\n";
}

foreach my $i (100001..150000) {
    print "INSERT INTO test (b,c) VALUES ($i, '2007-01-03');\n";
}

* Without the index merge union access algorithm

mysql> EXPLAIN SELECT SQL_NO_CACHE COUNT(*) FROM test WHERE (c >= '2007-01-02' AND c <= '2007-01-03') OR b >= 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
         type: ALL
possible_keys: b,c
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 150533
        Extra: Using where
1 row in set (0.00 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM test WHERE (c >= '2007-01-02' AND c <= '2007-01-03') OR b >= 1;
+----------+
| COUNT(*) |
+----------+
|   150000 |
+----------+
1 row in set (0.50 sec)

* With the index merge union access algorithm

mysql> EXPLAIN SELECT SQL_NO_CACHE COUNT(*) FROM test force index (b,c) WHERE (c >= '2007-01-02' AND c <= '2007-01-03') OR b >= 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
         type: index_merge
possible_keys: b,c
          key: c,b
      key_len: 8,4
          ref: NULL
         rows: 150532
        Extra: Using sort_union(c,b); Using where
1 row in set (0.09 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM test force index (b,c) WHERE (c >= '2007-01-02' AND c <= '2007-01-03') OR b >= 1;
+----------+
| COUNT(*) |
+----------+
|   250000 |
+----------+
1 row in set (4.17 sec)

I use the 'force index' hint in order to force the optimizer to choose the sort_union algorithm. With some workload this algorithm could be choosed without the need to use the 'force index' hint.

When i switch to MyISAM this bug disappeared.

mysql> ALTER TABLE test ENGINE=MyISAM;
Query OK, 150000 rows affected (4.04 sec)
Records: 150000  Duplicates: 0  Warnings: 0

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM test force index (b,c) WHERE (c >= '2007-01-02' AND c <= '2007-01-03') OR b >= 1;
+----------+
| COUNT(*) |
+----------+
|   150000 |
+----------+
1 row in set (2.32 sec)

Suggested fix:
I don't know if this is an InnoDB or optimizer bug.

mysqltest test case

Attachment: bug25798.tgz (application/x-gtar, text), 983 bytes.

I assign this difficult bug to Inaam. The task is to find out if it is the MySQL SQL interpreter that makes the error, or InnoDB.
--Heikki

This is a bug in MySQL code. The merge_buffers() [file: sql/filesort.cc] uses a heap of buffers for merging. This heap is initialized as:

1068   if (init_queue(&queue, (uint) (Tb-Fb)+1, offsetof(BUFFPEK,key), 0,
1069                  (queue_compare) (cmp= get_ptr_compare(sort_length)),
1070                  (void*) &sort_length))

The bug is that MySQL uses standard binary string comparison function. Note that, for example, in InnoDB the value of ref field in the handler is the primary key value in MySQL format. This implies that in case of integer data, ref will hold the integer value in little endian format. Binary string comparisons won't work properly on that data.
A proposed solution is to use handler::cmp_ref() function.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30046

ChangeSet@1.2513, 2007-07-01 15:33:28-07:00, igor@olga.mysql.com +5 -0
  Fixed bug #25798.
  This bug may manifest itself not only with the queries for which
  the index-merge access method is chosen. It also may display
  itself for queries with DISTINCT.
  
  The bug was in how the Unique::get method used the merge_buffers
  function. To compare elements in the the queue employed by
  merge_buffers() it must use the buffpek_compare function rather
  than the function for binary comparison.

Pushed into 5.1.21-beta

Pushed into 5.0.46

Noted in 5.0.46, 5.1.21 changelogs.

The index merge union access algorithm could produce incorrect
results with InnoDB tables. The problem could also occur for queries
that used DISTINCT.