Bug #44785 @parameters in where clause suboptimal in recent versions of mysql
Submitted: 11 May 2009 12:20 Modified: 26 Jun 2009 12:27
Reporter: Simon Moore Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.1.34, 5.1.37-bzr OS:Linux (Cetnos 5.3)
Assigned to: Georgi Kodinov CPU Architecture:Any
Tags: @parameters, explain, functions, Optimizer, performance

[11 May 2009 12:20] Simon Moore
Description:
Using @parameters to a where clause in 5.1.34 and 5.1.33 behaves differently (see explains in example) than in 5.1.31 and significantly negatively effects performance by performing suboptimal queries.

Invesitgaton was carried out due to poor performance of functions in 5.1.33 and problem was isolated further to being the use of parameters (neccesary in functions).

This effects paremeters used in a where clause, where substituted via @parameters or as parameters to a function.

Both using parameters and hardcoded values in a where clause should produce the same result in an explain and take the same time to execute. (You will need a lots more rows in the table to see the performance difference).

How to repeat:

Example.

CREATE TABLE `remote_users` (
  `site_id` int(10) unsigned NOT NULL DEFAULT '1',
  `user_id` int(11) NOT NULL DEFAULT '0',
  `remote_site_id` varchar(20) DEFAULT NULL,
  `remote_user_id` varchar(20) DEFAULT NULL,
  `valid` smallint(6) NOT NULL DEFAULT '0',
  PRIMARY KEY (`site_id`,`user_id`),
  UNIQUE KEY `allofthem` (`site_id`,`user_id`,`remote_site_id`,`remote_user_id`),
  UNIQUE KEY `remote` (`remote_site_id`,`remote_user_id`),
  KEY `remote_site_user` (`site_id`,`remote_site_id`,`remote_user_id`)
) ENGINE=MyISAM;

INSERT INTO remote_users VALUES (135,47619,GB-LUN,90810581,1);
set @my_remote_site_id="GB-LUN"; set @my_remote_user_id="90810581";

EXPLAIN SELECT * FROM remote_users WHERE remote_site_id="GB-LUN" AND remote_user_id="90810581";
+----+-------------+--------------+-------+---------------+--------+---------+-------------+------+-------+
| id | select_type | table        | type  | possible_keys | key    | key_len | ref         | rows | Extra |
+----+-------------+--------------+-------+---------------+--------+---------+-------------+------+-------+
|  1 | SIMPLE      | remote_users | const | remote        | remote | 46      | const,const |    1 |       |
+----+-------------+--------------+-------+---------------+--------+---------+-------------+------+-------+

EXPLAIN SELECT * FROM remote_users WHERE remote_site_id=@my_remote_site_id AND remote_user_id=@my_remote_user_id;
+----+-------------+--------------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+--------------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | remote_users | ALL  | NULL          | NULL | NULL    | NULL | 3473681 | Using where |
+----+-------------+--------------+------+---------------+------+---------+------+---------+-------------+

 set @my_remote_site_id="GB-LUN"; set @my_remote_user_id="90810581B";
 
 EXPLAIN select * FROM remote_users WHERE remote_site_id="GB-LUN" AND remote_user_id="90810581B";
 +----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
 | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                               |
 +----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
 |  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Impossible WHERE noticed after reading const tables |
 +----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+

EXPLAIN SELECT * FROM remote_users WHERE remote_site_id=@my_remote_site_id AND remote_user_id=@my_remote_user_id;
+----+-------------+--------------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+--------------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | remote_users | ALL  | NULL          | NULL | NULL    | NULL | 3473681 | Using where |
+----+-------------+--------------+------+---------------+------+---------+------+---------+-------------+

Suggested fix:
Both using parameters and hardcoded values in a where clause should produce the same result in an explain and take the same time to execute.
[11 May 2009 13:04] Simon Moore
This may in fact be an issue with collation.

Table is using latin1
DB is
set collation_connection="utf8_unicode_ci"; set collation_database="utf8_unicode_ci";set collation_server="utf8_unicode_ci";
Changing to
set collation_connection="latin1_swedish_ci"; set collation_database="latin1_swedish_ci";set collation_server="latin1_swedish_ci";

and reruning test results in same results. (Obviously I need to permanently change everything to be in the same collation to work around this performance issue)

No longer sure if this is a bug or intended behaviour.  

Usability wise there do seem to be issues getting everything into the same collation etc and realising when they are not there may be issues. 

Perhaps issue a warning when collations of the variaous bits relevant to a particular query do not match up?
[11 May 2009 14:37] Valeriy Kravchuk
I think this is expected behavior. When you compare latin1 column to Unicode user variable, Unicode "wins" (see http://dev.mysql.com/doc/refman/5.1/en/charset-collate-tricky.html for the details), and server has to apply implicit character set conversion to the column, thus preventing the use of index. It is not a bug.

As for warning in cases like that (in some SQL modes at least), I think this is a reasonable feature request. Do you agree?
[11 Jun 2009 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[17 Jun 2009 9:33] Simon Moore
Yes I agree
[17 Jun 2009 10:06] Valeriy Kravchuk
Feature request is: give warnings when index can not be used because column must be converted to a diiferent colation.
[25 Jun 2009 9:44] Valeriy Kravchuk
Simpler test case that show how collation for @parameter can change the plan (assuming lsatin1 is used by default):

create table bug44785(c1 int auto_increment primary key, c2 varchar(20), key(c2));
insert into bug44785(c2) values('a');
insert into bug44785(c2) select c2 from bug44785;
insert into bug44785(c2) select c2 from bug44785;
insert into bug44785(c2) select c2 from bug44785;
insert into bug44785(c2) select c2 from bug44785;
insert into bug44785(c2) select c2 from bug44785;
insert into bug44785(c2) select c2 from bug44785;
insert into bug44785(c2) select c2 from bug44785;
insert into bug44785(c2) select c2 from bug44785;
insert into bug44785(c2) select c2 from bug44785;
insert into bug44785(c2) select c2 from bug44785;
insert into bug44785(c2) select c2 from bug44785;

insert into bug44785(c2) values('b');
explain select * from bug44785 where c2 = 'b'\G
set @b='b';
explain select * from bug44785 where c2 = @b\G
select collation(@b);

set collation_connection="utf8_unicode_ci"; 
set collation_database="utf8_unicode_ci"; 
set collation_server="utf8_unicode_ci";

set @b='b';
select collation(@b);
explain select * from bug44785 where c2 = @b\G
explain select * from bug44785 where c2 = 'b'\G
select collation('b');

I've got the following results:

openxs@suse:/home2/openxs/dbs/5.1> bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.1.37-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create table bug44785(c1 int auto_increment primary key, c2 varchar(20), key(c2));
Query OK, 0 rows affected (0.03 sec)

mysql> insert into bug44785(c2) values('a');
Query OK, 1 row affected (0.01 sec)

mysql> insert into bug44785(c2) select c2 from bug44785;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> insert into bug44785(c2) select c2 from bug44785;
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into bug44785(c2) select c2 from bug44785;
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> insert into bug44785(c2) select c2 from bug44785;
Query OK, 8 rows affected (0.00 sec)
Records: 8  Duplicates: 0  Warnings: 0

mysql> insert into bug44785(c2) select c2 from bug44785;
Query OK, 16 rows affected (0.00 sec)
Records: 16  Duplicates: 0  Warnings: 0

mysql> insert into bug44785(c2) select c2 from bug44785;
Query OK, 32 rows affected (0.01 sec)
Records: 32  Duplicates: 0  Warnings: 0

mysql> insert into bug44785(c2) select c2 from bug44785;
Query OK, 64 rows affected (0.01 sec)
Records: 64  Duplicates: 0  Warnings: 0

mysql> insert into bug44785(c2) select c2 from bug44785;
Query OK, 128 rows affected (0.03 sec)
Records: 128  Duplicates: 0  Warnings: 0

mysql> insert into bug44785(c2) select c2 from bug44785;
Query OK, 256 rows affected (0.08 sec)
Records: 256  Duplicates: 0  Warnings: 0

mysql> insert into bug44785(c2) select c2 from bug44785;
Query OK, 512 rows affected (0.13 sec)
Records: 512  Duplicates: 0  Warnings: 0

mysql> insert into bug44785(c2) select c2 from bug44785;
Query OK, 1024 rows affected (0.22 sec)
Records: 1024  Duplicates: 0  Warnings: 0

mysql> insert into bug44785(c2) values('b');
Query OK, 1 row affected (0.00 sec)

mysql> explain select * from bug44785 where c2 = 'b'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: bug44785
         type: ref
possible_keys: c2
          key: c2
      key_len: 23
          ref: const
         rows: 2
        Extra: Using where
1 row in set (0.01 sec)

mysql> set @b='b';
Query OK, 0 rows affected (0.00 sec)

mysql> explain select * from bug44785 where c2 = @b\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: bug44785
         type: ref
possible_keys: c2
          key: c2
      key_len: 23
          ref: const
         rows: 2
        Extra: Using where
1 row in set (0.00 sec)

mysql> select collation(@b);
+-------------------+
| collation(@b)     |
+-------------------+
| latin1_swedish_ci |
+-------------------+
1 row in set (0.00 sec)

mysql> set collation_connection="utf8_unicode_ci"; set collation_database="utf8_unicode_ci"; set collation_server="utf8_unicode_ci";
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

mysql> set @b='b';
Query OK, 0 rows affected (0.00 sec)

mysql> select collation(@b);
+-----------------+
| collation(@b)   |
+-----------------+
| utf8_unicode_ci |
+-----------------+
1 row in set (0.00 sec)

mysql> explain select * from bug44785 where c2 = @b\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: bug44785
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2049
        Extra: Using where
1 row in set (0.00 sec)

mysql> explain select * from bug44785 where c2 = 'b'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: bug44785
         type: ref
possible_keys: c2
          key: c2
      key_len: 23
          ref: const
         rows: 2
        Extra: Using where
1 row in set (0.00 sec)

mysql> select collation('b');
+-----------------+
| collation('b')  |
+-----------------+
| utf8_unicode_ci |
+-----------------+
1 row in set (0.00 sec)

mysql> show create table bug44785\G
*************************** 1. row ***************************
       Table: bug44785
Create Table: CREATE TABLE `bug44785` (
  `c1` int(11) NOT NULL AUTO_INCREMENT,
  `c2` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`c1`),
  KEY `c2` (`c2`)
) ENGINE=MyISAM AUTO_INCREMENT=2050 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

So, while collation is the same for both 'b' and @b, but the plans arte ddifferent.
[26 Jun 2009 12:27] Georgi Kodinov
I think this is not a regression bug.
The behavior is very well documented and seems logical to me.
I've tried with 5.1.31 and I don't see a difference in behavior with 5.1-bk.

As for the warning message : I don't think it's a good idea. Why should we give warnings for working as documented ?
Eventually we could always convert the literal to the character set of the column and give out "not found" on conversion error (since you'll never find e.g. a cyrillic utf-8 in a set of latin1 strings anyway).