Bug #18206 first key part using prefix of utf8 varchar field yeilds empty set sometimes
Submitted: 14 Mar 2006 1:48 Modified: 7 Jul 2006 20:31
Reporter: Timothy Smith Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S1 (Critical)
Version:4.1.19, 5.0.18 OS:Linux (linux,freebsd,any)
Assigned to: Igor Babaev CPU Architecture:Any

[14 Mar 2006 1:48] Timothy Smith
Description:
This query produces (correctly) a single row:

select date_created from jen_test where owner_id = 'dbe48e59-6ff8-4f04-a085-d5814984ed69' ;

This query (wrongly) produces the empty set:

select max(date_created) from jen_test where owner_id = 'dbe48e59-6ff8-4f04-a085-d5814984ed69' ;

I got the same results with MIN() instead of MAX(); using SUM()
produces the correct result, though.

When the empty set is produced, the following warning is given:

| Warning | 1265 | Data truncated for column 'OWNER_ID' at row 1 |

Why is OWNER_ID truncated?

See HOW TO REPEAT for full test case.

How to repeat:

This was tested on current 4.1.19-bk code, and a recent 5.0.18 build, on FreeBSD and Linux.

drop table if exists jen_test;

CREATE TABLE jen_test ( `ID` varchar(40) NOT NULL default '', `OWNER_ID` varchar(40) NOT NULL default '', `DATE_CREATED` datetime default NULL, PRIMARY KEY (`ID`), KEY `posting_date_created` (`DATE_CREATED`), KEY `posting_owner_id_date_created_idx` (`OWNER_ID`(8),`DATE_CREATED`) ) engine=MyISAM DEFAULT CHARSET=utf8 ;

insert into jen_test values ('c8cee030-ea96-4aa7-8119-894b19bbf162', 'dbe48e59-6ff8-4f04-a085-d5814984ed69', '2005-07-26 20:25:32' ) ;

select date_created from jen_test where owner_id = 'dbe48e59-6ff8-4f04-a085-d5814984ed69' ;

select max(date_created) from jen_test where owner_id = 'dbe48e59-6ff8-4f04-a085-d5814984ed69' ;

show warnings;

select version();

I get the following results:

mysql> drop table if exists jen_test;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE jen_test ( `ID` varchar(40) NOT NULL default '', `OWNER_ID` varchar(40) NOT NULL default '', `DATE_CREATED` datetime default NULL, PRIMARY KEY (`ID`), KEY `posting_date_created` (`DATE_CREATED`), KEY `posting_owner_id_date_created_idx` (`OWNER_ID`(8),`DATE_CREATED`) ) engine=MyISAM DEFAULT CHARSET=utf8 ;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into jen_test values ('c8cee030-ea96-4aa7-8119-894b19bbf162', 'dbe48e59-6ff8-4f04-a085-d5814984ed69', '2005-07-26 20:25:32' ) ;
Query OK, 1 row affected (0.00 sec)

mysql> select date_created from jen_test where owner_id = 'dbe48e59-6ff8-4f04-a085-d5814984ed69' ;
+---------------------+
| date_created        |
+---------------------+
| 2005-07-26 20:25:32 |
+---------------------+
1 row in set (0.00 sec)

mysql> select max(date_created) from jen_test where owner_id = 'dbe48e59-6ff8-4f04-a085-d5814984ed69' ;
Empty set, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+-----------------------------------------------+
| Level   | Code | Message                                       |
+---------+------+-----------------------------------------------+
| Warning | 1265 | Data truncated for column 'OWNER_ID' at row 1 |
+---------+------+-----------------------------------------------+
1 row in set (0.00 sec)

mysql> select version();
+------------------+
| version()        |
+------------------+
| 4.1.19-debug-log |
+------------------+
1 row in set (0.00 sec)

Suggested fix:

When I change the CREATE table to index the entire OWNER_ID column, then it does not fail:

mysql> CREATE TABLE jen_test ( `ID` varchar(40) NOT NULL default '', `OWNER_ID` varchar(40) NOT NULL default '', `DATE_CREATED` datetime default NULL, PRIMARY KEY (`ID`), KEY `posting_date_created` (`DATE_CREATED`), KEY `posting_owner_id_date_created_idx` (`OWNER_ID`,`DATE_CREATED`) ) engine=MyISAM DEFAULT CHARSET=utf8 ;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into jen_test values ('c8cee030-ea96-4aa7-8119-894b19bbf162', 'dbe48e59-6ff8-4f04-a085-d5814984ed69', '2005-07-26 20:25:32' ) ;
Query OK, 1 row affected (0.00 sec)

mysql> select date_created from jen_test where owner_id = 'dbe48e59-6ff8-4f04-a085-d5814984ed69' ;
+---------------------+
| date_created        |
+---------------------+
| 2005-07-26 20:25:32 |
+---------------------+
1 row in set (0.00 sec)

mysql> select max(date_created) from jen_test where owner_id = 'dbe48e59-6ff8-4f04-a085-d5814984ed69' ;
+---------------------+
| max(date_created)   |
+---------------------+
| 2005-07-26 20:25:32 |
+---------------------+
1 row in set (0.00 sec)

Likewise, the following CREATE shows *correct* behavior (1 row returned from the query):

CREATE TABLE jen_test ( `ID` varchar(40) NOT NULL default '', `OWNER_ID` varchar(40) NOT NULL default '', `DATE_CREATED` datetime default NULL, PRIMARY KEY (`ID`), KEY `posting_date_created` (`DATE_CREATED`), KEY `posting_owner_id_date_created_idx` (`OWNER_ID`(8)) ) engine=MyISAM DEFAULT CHARSET=utf8 ;

So the problem appears to be only with a *prefix* of an *initial key part*, on a UTF-8 column.

HOWEVER, I *do* still see a warning when using a latin1 column instead of UTF-8; it produces the correct result, but still gives the same warning:

mysql> CREATE TABLE jen_test ( `ID` varchar(40) NOT NULL default '', `OWNER_ID` varchar(40) NOT NULL default '', `DATE_CREATED` datetime default NULL, PRIMARY KEY (`ID`), KEY `posting_date_created` (`DATE_CREATED`), KEY `posting_owner_id_date_created_idx` (`OWNER_ID`(8),`DATE_CREATED`) ) engine=MyISAM;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into jen_test values ('c8cee030-ea96-4aa7-8119-894b19bbf162', 'dbe48e59-6ff8-4f04-a085-d5814984ed69', '2005-07-26 20:25:32' ) ;
Query OK, 1 row affected (0.00 sec)

mysql> select max(date_created) from jen_test where owner_id = 'dbe48e59-6ff8-4f04-a085-d5814984ed69' ;
+---------------------+
| max(date_created)   |
+---------------------+
| 2005-07-26 20:25:32 |
+---------------------+
1 row in set, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+-----------------------------------------------+
| Level   | Code | Message                                       |
+---------+------+-----------------------------------------------+
| Warning | 1265 | Data truncated for column 'OWNER_ID' at row 1 |
+---------+------+-----------------------------------------------+
1 row in set (0.00 sec)

mysql> show create table jen_test;
+----------+--------------------------------------------------------------------
| Table    | Create Table                                                       
+----------+--------------------------------------------------------------------
| jen_test | CREATE TABLE `jen_test` (
  `ID` varchar(40) NOT NULL default '',
  `OWNER_ID` varchar(40) NOT NULL default '',
  `DATE_CREATED` datetime default NULL,
  PRIMARY KEY  (`ID`),
  KEY `posting_date_created` (`DATE_CREATED`),
  KEY `posting_owner_id_date_created_idx` (`OWNER_ID`(8),`DATE_CREATED`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
[31 May 2006 2:54] Igor Babaev
There are two problems here.
The first one can be reproduced without using partial index:

mysql> CREATE TABLE t1 (id int PRIMARY KEY, b char(3), INDEX
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t1 VALUES (1,'xx'), (2,'aa');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM t1;
+----+------+
| id | b    |
+----+------+
|  1 | xx   |
|  2 | aa   |
+----+------+
2 rows in set (0.00 sec)

mysql> SELECT MAX(b) FROM t1 WHERE b < 'ppppp';
+--------+
| MAX(b) |
+--------+
| aa     |
+--------+
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1265 | Data truncated for column 'b' at row 1 |
+---------+------+----------------------------------------+
1 row in set (0.01 sec)

mysql> SELECT MAX(b) FROM t1 WHERE b < 'pp';
+--------+
| MAX(b) |
+--------+
| aa     |
+--------+
1 row in set (0.00 sec)

As we can see we have an unexpected warning when the key in the comparison predicate of the WHERE clause that set a condition for field b is longer than the length of this field.

The second problem can be observed when we use a partial index:

mysql> CREATE TABLE t1 (id int PRIMARY KEY, b char(16), INDEX(b(4)));
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t1 VALUES (1, 'xxxxbbbb'), (2, 'xxxxaaaa');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT MAX(b) FROM t1;
+----------+
| MAX(b)   |
+----------+
| xxxxaaaa |
+----------+
1 row in set (0.00 sec)

The fact is that MIN/MAX optimization is applied here:

mysql> EXPLAIN SELECT MAX(b) FROM t1;
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL |    NULL | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.00 sec)

Yet this optimization cannot be applied to a partial index.
[31 May 2006 2:55] Igor Babaev
There are two problems here.
The first one can be reproduced without using partial index:

mysql> CREATE TABLE t1 (id int PRIMARY KEY, b char(3), INDEX
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t1 VALUES (1,'xx'), (2,'aa');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM t1;
+----+------+
| id | b    |
+----+------+
|  1 | xx   |
|  2 | aa   |
+----+------+
2 rows in set (0.00 sec)

mysql> SELECT MAX(b) FROM t1 WHERE b < 'ppppp';
+--------+
| MAX(b) |
+--------+
| aa     |
+--------+
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1265 | Data truncated for column 'b' at row 1 |
+---------+------+----------------------------------------+
1 row in set (0.01 sec)

mysql> SELECT MAX(b) FROM t1 WHERE b < 'pp';
+--------+
| MAX(b) |
+--------+
| aa     |
+--------+
1 row in set (0.00 sec)

As we can see we have an unexpected warning when the key in the comparison predicate of the WHERE clause that set a condition for field b is longer than the length of this field.

The second problem can be observed when we use a partial index:

mysql> CREATE TABLE t1 (id int PRIMARY KEY, b char(16), INDEX(b(4)));
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t1 VALUES (1, 'xxxxbbbb'), (2, 'xxxxaaaa');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT MAX(b) FROM t1;
+----------+
| MAX(b)   |
+----------+
| xxxxaaaa |
+----------+
1 row in set (0.00 sec)

The fact is that MIN/MAX optimization is applied here:

mysql> EXPLAIN SELECT MAX(b) FROM t1;
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL |    NULL | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.00 sec)

Yet this optimization cannot be applied to a partial index.
[2 Jun 2006 21:15] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7232
[14 Jun 2006 5:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7597
[19 Jun 2006 0:12] Evgeny Potemkin
Fixed in 4.1.21, 5.0.23, 5.1.12
[20 Jun 2006 10:59] Evgeny Potemkin
The bug report revealed two problems related to min/max optimization:
1. If the length of a constant key used in a SARGable condition for
for the MIN/MAX fields is greater than the length of the field an 
unwanted warning on key truncation is issued;
2. If MIN/MAX optimization is applied to a partial index, like INDEX(b(4))
than can lead to returning a wrong result set.
[7 Jul 2006 20:31] Mike Hillyer
Documented in 4.0.21, 5.0.23, 5.1.12 changelogs.