Bug #9509 Optimizer: wrong result after AND with latin1_german2_ci comparisons
Submitted: 31 Mar 2005 2:05 Modified: 21 Sep 2006 2:02
Reporter: Peter Gulutzan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:5.0.4-beta-debug OS:Linux (SUSE 9.2)
Assigned to: Alexander Barkov CPU Architecture:Any

[31 Mar 2005 2:05] Peter Gulutzan
Description:
When I AND a TRUE expression with a TRUE expression, the result should be TRUE.
But when one of the expressions is "LENGTH(opd)=1" and the other expression is
"opd=two-character-literal", MySQL decides it's an impossible WHERE clause.
(I can see this with EXPLAIN.) So the result is wrong when I use latin1_german2_ci.

How to repeat:
mysql> create table ti (s1 char(5) character set latin1 collate latin1_german2_ci);
Query OK, 0 rows affected (0.01 sec)

mysql> insert into ti values (0xf6) /* this is o-umlaut */;
Query OK, 1 row affected (0.00 sec)

mysql> select * from ti where length(s1)=1;
+------+
| s1   |
+------+
| ö   |
+------+
1 row in set (0.00 sec)

mysql> select * from ti where s1='oe';
+------+
| s1   |
+------+
| ö   |
+------+
1 row in set (0.00 sec)

mysql> select * from ti where length(s1)=1 and s1='oe';
Empty set (0.00 sec)
[17 Feb 2006 14:57] Peter Gulutzan
I get wrong results with Lithuanian and Czech too.
I do not claim it's always the same bug, but I hope
that the fixer of bug#9509 will check these too.

mysql> create table tli (s1 char(5) character set
utf8 collate utf8_lithuanian_ci);
Query OK, 0 rows affected (0.01 sec)

mysql> insert into tli values ('I'),('K'),('Y');
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

/* although 'Y' < 'K' in Lithuanian, I get nothing when
I look for a 'Y' */

mysql> select * from tli where s1 < 'K' and s1 = 'Y';
Empty set (0.01 sec)

mysql> create table tcz (s1 char(5) character set utf8 collate
utf8_czech_ci);
Query OK, 0 rows affected (0.00 sec)

mysql> insert into tcz values
('c'),('d'),('h'),('ch'),('CH'),('cH'),('Ch'),('i');
Query OK, 8 rows affected (0.00 sec)
Records: 8  Duplicates: 0  Warnings: 0

/* although 'CH' > 'd', and 'CH' = 'CH',
I get nothing when I use both conditions, ANDed. */

mysql> select * from tcz where s1 > 'd' and s1 = 'CH';
Empty set (0.00 sec)

/* although 'cH' <> 'ch', and s1 = 'cH',
I get nothing when I use both conditions, ANDed. */

mysql> select * from tcz where s1 = 'cH' and s1 <> 'ch';
Empty set (0.00 sec)
[20 Apr 2006 10:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/5199
[16 Aug 2006 11:22] Alexander Barkov
Pushed into 4.1.22

TODO: merge to 5.0 and 5.1
[2 Sep 2006 9:14] Timothy Smith
Merged to 5.0 (will be in 5.0.25).

TODO: merge to 5.1
[7 Sep 2006 0:10] Jon Stephens
This isn't a showstopper, so I've set the status to patch pending until the fix is merged to 5.1.
[13 Sep 2006 7:40] Timothy Smith
Pushed to 5.1.12.
[21 Sep 2006 2:02] Paul DuBois
Noted in 4.1.22, 5.0.25, 5.1.12 changelogs.

The optimizer could produce an incorrect result after AND with
collations such as latin1_german2_ci, utf8_czech_ci, and
utf8_lithianian_ci.