Bug #48447 Character sets: COLLATE clause mishandled
Submitted: 30 Oct 2009 21:58 Modified: 13 Nov 2009 2:05
Reporter: Peter Gulutzan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.0, 5.1.41, 5.5 OS:Linux (SUSE 11.1 64-bit)
Assigned to: Mikael Ronström CPU Architecture:Any

[30 Oct 2009 21:58] Peter Gulutzan
Description:
Following a literal, the COLLATE clause is mishandled if an index exists.
I get different results depending whether I use an index or not.

This is related to
Bug#48161 "Partitions: search fails with non-default collation"
where Mikael explains that it seems to be an optimizer problem.
But I have initially given it a "Character sets" category because
that's where the effect is.

It is admittedly true that the more correct-looking statement
select * from t1 where a collate latin1_bin > 'B';
produces what I'd expect even if there's an index, but the manual
gives examples showing the COLLATE clause on either side.
http://dev.mysql.com/doc/refman/5.4/en/charset-collate.html

How to repeat:
drop table if exists t1;
set names latin1;
create table t1 (a char(1) character set latin1 collate latin1_general_ci);
insert into t1 values ('a'),('A'),('b'),('B'),('c'),('C');
SELECT * from t1 where a > 'B' collate latin1_bin;
create index i on t1 (a);
SELECT * from t1 where a > 'B' collate latin1_bin;

/*
You'll see that the first SELECT's result set is different from the
second SELECT's result set. This affects Mikael's attempt to get
the correct value with partitioning.

Alexander Barkov tried this script with Betony and 5.0 and verified the
problem in both.

*/
[30 Oct 2009 22:27] Peter Laursen
I cannot reproduce with 5.0.87 and 5.0.40 servers on Windows.  I get 

a     
------
a     
b     
c     
C     

in all cases.  Maybe a latin1 issue with Linux?
[31 Oct 2009 8:33] Valeriy Kravchuk
Verified on Mac OS X:

77-52-222-60:5.1 openxs$ bin/mysql test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.41-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> drop table if exists t1;
Query OK, 0 rows affected (0.01 sec)

mysql> set names latin1;
Query OK, 0 rows affected (0.00 sec)

mysql> create table t1 (a char(1) character set latin1 collate latin1_general_ci);
Query OK, 0 rows affected (0.08 sec)

mysql> insert into t1 values ('a'),('A'),('b'),('B'),('c'),('C');
Query OK, 6 rows affected (0.00 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> SELECT * from t1 where a > 'B' collate latin1_bin;
+------+
| a    |
+------+
| a    |
| b    |
| c    |
| C    |
+------+
4 rows in set (0.00 sec)

mysql> create index i on t1 (a);
Query OK, 6 rows affected (0.19 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> SELECT * from t1 where a > 'B' collate latin1_bin;
+------+
| a    |
+------+
| c    |
| C    |
+------+
2 rows in set (0.00 sec)
[5 Nov 2009 14:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/89482

2924 Mikael Ronstrom	2009-11-05
      BUG#48447, BUG#48161, fixed a regression from fix of BUG#6045, where binary collations can use indexes/partition pruning for cases using equality conditions, however it cannot be used for any other condition like <, >, <=, >=, <>, also added test case for verification of BUG#47774 in this patch
[5 Nov 2009 14:47] Mikael Ronström
Same fix that fixes BUG#48161, also fixes this one,
patch has test case for both this bug and this bug in it.
[6 Nov 2009 6:48] Alexander Barkov
Hi Mikael, the patch http://lists.mysql.com/commits/89482 looks 
generally fine for me.

Before you push, can you please move this part of the test:

+# Test case from BUG#48447 with some extension
+create table t1 (a varchar(1) character set latin1 collate latin1_general_ci);
+insert into t1 values ('A'),('a'),('B'),('b'),('C'),('c');
+select * from t1 where a > 'B' collate latin1_bin;
+select * from t1 where a <> 'B' collate latin1_bin;
+create index i on t1 (a);
+select * from t1 where a > 'B' collate latin1_bin;
+select * from t1 where a <> 'B' collate latin1_bin;

into ctype_collate.test, after the block starting with this comment:

#
# Test that optimizer doesn't use indexes with wrong collation
#
[6 Nov 2009 10:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/89569

2925 Mikael Ronstrom	2009-11-06
      Review fixes for BUG#48161 and BUG#48447
[6 Nov 2009 22:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/89682

2915 Mikael Ronstrom	2009-11-06 [merge]
      Merge BUG#48161, BUG#48447
[11 Nov 2009 6:48] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091110093407-rw5g8dys2baqkt67) (version source revid:alik@sun.com-20091110083317-h00x61ugz9fxhdod) (merge vers: 6.0.14-alpha) (pib:13)
[12 Nov 2009 8:15] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091110093229-0bh5hix780cyeicl) (version source revid:alik@sun.com-20091110083426-bm3am5445pfrrci9) (merge vers: 5.5.0-beta) (pib:13)
[12 Nov 2009 18:27] Paul DuBois
Per Mikael, should be Documenting status.
[13 Nov 2009 2:05] Paul DuBois
Noted in 5.5.0, 6.0.14 changelogs.

Following a literal, the COLLATE clause was mishandled such that
different results can be produced depending whether an index is used.