Description:
When the table is bigger and the query needs to merge multiple chunk and
at the same time needs to remove the duplicate keys, then the result is
wrong because the duplicate keys can't be removed. Because the function
key_is_greater_than is wrong for the varchar or blob.
If the function cmp_varlen_keys returns true, it means that the key1 is
less than the key2. Otherwise, it means that the key1 is greater than or
equal with the key2. So if the key_is_greater_than for varlen_keys returns
true, it means that the key1 is greater than or equal with the key2. Then
it's wrong using !mcl.key_is_greater_than to determine if the key is equal
with the key2 in the function merge_buffers.
How to repeat:
CREATE TABLE t2 (
auto int not null auto_increment,
fld1 int(6) unsigned zerofill DEFAULT '000000' NOT NULL,
companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
fld3 char(30) DEFAULT '' NOT NULL,
fld4 char(35) DEFAULT '' NOT NULL,
fld5 char(35) DEFAULT '' NOT NULL,
fld6 char(4) DEFAULT '' NOT NULL,
UNIQUE fld1 (fld1),
KEY fld3 (fld3),
PRIMARY KEY (auto)
);
INSERT INTO t2 VALUES (1,000001,00,'Omaha','teethe','neat','');
INSERT INTO t2 VALUES (2,011401,37,'breaking','dreaded','Steinberg','W');
INSERT INTO t2 VALUES (3,011402,37,'Romans','scholastics','jarring','');
INSERT INTO t2 VALUES (4,011403,37,'intercepted','audiology','tinily','');
INSERT INTO t2 VALUES (5,011501,37,'bewilderingly','wallet','balled','');
INSERT INTO t2 VALUES (6,011701,37,'astound','parters','persist','W');
INSERT INTO t2 VALUES (7,011702,37,'admonishing','eschew','attainments','');
INSERT INTO t2 VALUES (8,011703,37,'sumac','quitter','fanatic','');
INSERT INTO t2 VALUES (9,012001,37,'flanking','neat','measures','FAS');
....
+SET @@sort_buffer_size = 32768;
+select count(*) from (select distinct fld3,repeat("a",length(fld3)*5),count(*) from t2 group by companynr,fld3 order by fld3) t;
+count(*)
+1188
+SET @@sort_buffer_size = 327680;
+select count(*) from (select distinct fld3,repeat("a",length(fld3)*5),count(*) from t2 group by companynr,fld3 order by fld3) t;
+count(*)
+1171
Suggested fix:
--- a/sql/filesort.cc
+++ b/sql/filesort.cc
@@ -1776,8 +1776,8 @@ struct Merge_chunk_greater {
if (m_len) return memcmp(key1, key2, m_len) > 0;
if (m_param)
- return !cmp_varlen_keys(m_param->local_sortorder, m_param->use_hash, key1,
- key2);
+ return cmp_varlen_keys(m_param->local_sortorder, m_param->use_hash, key2,
+ key1);
Description: When the table is bigger and the query needs to merge multiple chunk and at the same time needs to remove the duplicate keys, then the result is wrong because the duplicate keys can't be removed. Because the function key_is_greater_than is wrong for the varchar or blob. If the function cmp_varlen_keys returns true, it means that the key1 is less than the key2. Otherwise, it means that the key1 is greater than or equal with the key2. So if the key_is_greater_than for varlen_keys returns true, it means that the key1 is greater than or equal with the key2. Then it's wrong using !mcl.key_is_greater_than to determine if the key is equal with the key2 in the function merge_buffers. How to repeat: CREATE TABLE t2 ( auto int not null auto_increment, fld1 int(6) unsigned zerofill DEFAULT '000000' NOT NULL, companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL, fld3 char(30) DEFAULT '' NOT NULL, fld4 char(35) DEFAULT '' NOT NULL, fld5 char(35) DEFAULT '' NOT NULL, fld6 char(4) DEFAULT '' NOT NULL, UNIQUE fld1 (fld1), KEY fld3 (fld3), PRIMARY KEY (auto) ); INSERT INTO t2 VALUES (1,000001,00,'Omaha','teethe','neat',''); INSERT INTO t2 VALUES (2,011401,37,'breaking','dreaded','Steinberg','W'); INSERT INTO t2 VALUES (3,011402,37,'Romans','scholastics','jarring',''); INSERT INTO t2 VALUES (4,011403,37,'intercepted','audiology','tinily',''); INSERT INTO t2 VALUES (5,011501,37,'bewilderingly','wallet','balled',''); INSERT INTO t2 VALUES (6,011701,37,'astound','parters','persist','W'); INSERT INTO t2 VALUES (7,011702,37,'admonishing','eschew','attainments',''); INSERT INTO t2 VALUES (8,011703,37,'sumac','quitter','fanatic',''); INSERT INTO t2 VALUES (9,012001,37,'flanking','neat','measures','FAS'); .... +SET @@sort_buffer_size = 32768; +select count(*) from (select distinct fld3,repeat("a",length(fld3)*5),count(*) from t2 group by companynr,fld3 order by fld3) t; +count(*) +1188 +SET @@sort_buffer_size = 327680; +select count(*) from (select distinct fld3,repeat("a",length(fld3)*5),count(*) from t2 group by companynr,fld3 order by fld3) t; +count(*) +1171 Suggested fix: --- a/sql/filesort.cc +++ b/sql/filesort.cc @@ -1776,8 +1776,8 @@ struct Merge_chunk_greater { if (m_len) return memcmp(key1, key2, m_len) > 0; if (m_param) - return !cmp_varlen_keys(m_param->local_sortorder, m_param->use_hash, key1, - key2); + return cmp_varlen_keys(m_param->local_sortorder, m_param->use_hash, key2, + key1);