Bug #115687 Index Not Used in Stored Proc when input param is varchar and db col is nvarchar
Submitted: 25 Jul 2024 4:39 Modified: 5 Aug 2024 10:04
Reporter: Scott Williams Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S4 (Feature request)
Version:9.0 OS:Any
Assigned to: CPU Architecture:Any

[25 Jul 2024 4:39] Scott Williams
Description:
When you create a table using DDL with nvarchar in MySQL 8 it creates a table with the default charset for nvarchar. If you create a stored procedure and delcare an input parameter with varchar, and then use that parameter in the where clause of a query, MySQL gets confused and does not use the appropriate index. Let's say you have create a user table with username being nvarchar.  You then create a unique index on username.  You then create a stored procedure that takes username as an input parameter and the stored proc user that input parameter in a a query ... update usertable set X to Y where username = storedProcParam.  The query works, but it does not use the username index because the MySQL engine somehow gets confused because the input param is varchar and not nvarchar.  If you create the same stored procedure and you declare the input param as nvarchar it works correctly.  If the stored procedure with the varchar / nvarchar mismatch threw an error, that would be acceptable. However accepting the input and working without error but doing a full table scan and ignoring the index on the 1 field in the where clause is unacceptable.  In mysql 5.7 this was a non issue.

How to repeat:
1. create a table and declare a column as nvarchar
2. create a unique index constraint on that column
3. create a stored procedure that take "varchar" as the input parameter
4. The stored procedure should have a query with a where clause where you would expect the index from step 2 to be used.

Suggested fix:
Ideally a varchar input to a stored procedure should be transparently converted to work with nvarchar indexes.  If that is not possible, then the engine should throw an error stating that varchar and nvarchar are incompatible.  

It is unacceptable to take varchar input and for the query to work, but not use the appropriate index.
[25 Jul 2024 9:32] MySQL Verification Team
Hi Mr. Williams,

Thank you for your bug report.

However, let us inform you that this is a forum for the reports with a fully repeatable test cases. Each test case should consist of the set of SQL statements that always lead to the behaviour that is reported. It does not have to be a bug.

Also, please use the latest release, which in this case is 8.0.39.

Can't repeat.
[25 Jul 2024 16:02] Scott Williams
I added steps to repeat?  Did you try the steps?
[25 Jul 2024 16:29] Scott Williams
Here are the steps to reproduce.

create table userTest (
	username Nvarchar(50);
);
ALTER TABLE userTest ADD CONSTRAINT UC_USER_USERNAME UNIQUE (username);
insert into userTest(username) values ('Scott');

DELIMITER $$
CREATE PROCEDURE `sp_test`(In userNameParam varchar(100))
    SQL SECURITY INVOKER
BEGIN 
        Explain select * from userTest where username =  userNameParam;
                
END$$
DELIMITER ;

call sp_test('Scott');

The expected output is the explain should choose the UC_User_username constraint, but it doesnt
[25 Jul 2024 16:29] Scott Williams
Here are the steps to reproduce.

create table userTest (
	username Nvarchar(50);
);
ALTER TABLE userTest ADD CONSTRAINT UC_USER_USERNAME UNIQUE (username);
insert into userTest(username) values ('Scott');

DELIMITER $$
CREATE PROCEDURE `sp_test`(In userNameParam varchar(100))
    SQL SECURITY INVOKER
BEGIN 
        Explain select * from userTest where username =  userNameParam;
                
END$$
DELIMITER ;

call sp_test('Scott');

The expected output is the explain should choose the UC_User_username constraint, but it doesnt
[25 Jul 2024 18:00] Scott Williams
It looks like this bug started with 8.0.39

when these collations became the default:
mysql> show variables like '%collat%';
+-------------------------------+--------------------+
| Variable_name                 | Value              |
+-------------------------------+--------------------+
| collation_connection          | utf8mb4_general_ci |
| collation_database            | utf8mb4_0900_ai_ci |
| collation_server              | utf8mb4_0900_ai_ci |
| default_collation_for_utf8mb4 | utf8mb4_0900_ai_ci |
+-------------------------------+--------------------+
[26 Jul 2024 9:48] MySQL Verification Team
Hi Mr. Williams,

We truly do not see what is a bug there.

For the nvarchar, character set has been upgraded to UTF8MB_4.

You have added a constraint of uniqueness and inserted only one row.

Hence, there is no proof that constraint is not working.

Not a bug.
[26 Jul 2024 12:37] Nathan Williams
Hi MySQL team,

We can replicate the issue in 8.0.39 with the optimizer not choosing the index on the unique key UC_USERTEST_USERNAME when using a stored procedure with a varchar input, even though the field is declared as a varchar. this was observed in a table with ~150,000 rows, which is how it was noticed, due to the performance impact, although it is replicable also with a single row as in the test case below.

to replicate:
```
create database usersTest;
use usersTest;

CREATE TABLE `userTest` (
  `id` varchar(36) CHARACTER SET utf8 NOT NULL,
  `username` varchar(100) CHARACTER SET utf8 NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `UC_USERTEST_USERNAME` (`username`),
  `status` varchar(50) NOT NULL
);

insert into userTest(`id`, `username`, `status`) values ('2a812430-608e-423f-8828-952b9c94b6d4', 'dummy', 'Locked');

DELIMITER $$
CREATE DEFINER=`root`@`%` PROCEDURE `sp_varcharTest`(In userNameParam varchar(100))
    SQL SECURITY INVOKER
BEGIN
              EXPLAIN update userTest set status = 'Active' where username = userNameParam;
END$$
DELIMITER ;

DELIMITER $$
CREATE DEFINER=`root`@`%` PROCEDURE `sp_nvarcharTest`(In userNameParam nvarchar(100))
    SQL SECURITY INVOKER
BEGIN
              EXPLAIN update userTest set status = 'Active' where username = userNameParam;
END$$
DELIMITER ;

call sp_varcharTest('dummy');

call sp_nvarcharTest('dummy');
```

example:
```
mysql> create database usersTest;
Query OK, 1 row affected (0.01 sec)

mysql> use usersTest;
Database changed
mysql>
mysql> CREATE TABLE `userTest` (
    ->   `id` varchar(36) CHARACTER SET utf8 NOT NULL,
    ->   `username` varchar(100) CHARACTER SET utf8 NOT NULL,
    ->   PRIMARY KEY (`id`),
    ->   UNIQUE KEY `UC_USERTEST_USERNAME` (`username`),
    ->   `status` varchar(50) NOT NULL
    -> );

insert into userTest(`id`, `username`, `status`) values ('2a812430-608e-423f-8828-952b9c94b6d4', 'dummy', 'Locked');

DELIMITER $$
CREATE DEFINER=`root`@`%` PROCEDURE `sp_varcharTest`(In userNameParam varchar(100))
    SQL SECURITY INVOKER
BEGIN
              EXPLAIN update userTest set status = 'Active' where username = userNameParam;
END$$
DELIMITER ;

DELIMITER $$
CREATE DEFINER=`root`@`%` PROCEDURE `sp_nvarcharTest`(In userNameParam nvarchar(100))
    SQL SECURITY INVOKER
BEGIN
              EXPLAIN update userTest set status = 'Active' where username = userNameParam;
END$$
DELIMITER ;

call sp_varcharTest('dummy');

call sp_nvarcharTest('dummy');Query OK, 0 rows affected, 2 warnings (0.03 sec)

mysql>
mysql> insert into userTest(`id`, `username`, `status`) values ('2a812430-608e-423f-8828-952b9c94b6d4', 'dummy', 'Locked');
Query OK, 1 row affected (0.00 sec)

mysql>
mysql> DELIMITER $$
mysql> CREATE DEFINER=`root`@`%` PROCEDURE `sp_varcharTest`(In userNameParam varchar(100))
    ->     SQL SECURITY INVOKER
    -> BEGIN
    ->               EXPLAIN update userTest set status = 'Active' where username = userNameParam;
    -> END$$
Query OK, 0 rows affected (0.00 sec)

mysql> DELIMITER ;
mysql>
mysql> DELIMITER $$
mysql> CREATE DEFINER=`root`@`%` PROCEDURE `sp_nvarcharTest`(In userNameParam nvarchar(100))
    ->     SQL SECURITY INVOKER
    -> BEGIN
    ->               EXPLAIN update userTest set status = 'Active' where username = userNameParam;
    -> END$$
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> DELIMITER ;
mysql>
mysql> call sp_varcharTest('dummy');
+----+-------------+----------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table    | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | UPDATE      | userTest | NULL       | index | NULL          | PRIMARY | 110     | NULL |    1 |   100.00 | Using where |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected, 2 warnings (0.00 sec)

mysql>
mysql> call sp_nvarcharTest('dummy');
+----+-------------+----------+------------+-------+----------------------+----------------------+---------+-------+------+----------+-------------+
| id | select_type | table    | partitions | type  | possible_keys        | key                  | key_len | ref   | rows | filtered | Extra       |
+----+-------------+----------+------------+-------+----------------------+----------------------+---------+-------+------+----------+-------------+
|  1 | UPDATE      | userTest | NULL       | range | UC_USERTEST_USERNAME | UC_USERTEST_USERNAME | 302     | const |    1 |   100.00 | Using where |
+----+-------------+----------+------------+-------+----------------------+----------------------+---------+-------+------+----------+-------------+
1 row in set (0.01 sec)

Query OK, 0 rows affected, 1 warning (0.01 sec)
```

as you can see it does not choose the index unless specifying the input as nvarchar.
[26 Jul 2024 12:39] Nathan Williams
replication case

Attachment: replication.sql (application/octet-stream, text), 940 bytes.

[26 Jul 2024 13:20] MySQL Verification Team
Hi Mr. Williams,

This is not a bug.

You are comparing two strings with different collations .......

Not a bug.
[26 Jul 2024 16:41] Scott Williams
I think it is a bug that these 2 statements work inconsistently.

1:  select * from users where username = 'Scott'
vs
2: call spGetUser('Scott')

In both cases the string 'Scott' is ultimately used in a where class.

It should be irrelevant if the input parameter is declared as varchar or nvarchar.  In either case 'Scott' should be evaluated against the index.  No other database system has this flaw, and mysql 5.7 didn't have this flaw.
[29 Jul 2024 9:54] MySQL Verification Team
Hi Mr. Williams,

NVARCHAR and VARCHAR are not the same.

Not a bug.
[29 Jul 2024 22:24] Scott Williams
nvarchar and varchar are different but the value 'Scott' is the same.  varhar is a subset of nvarchar.  A varchar value works fine in a where clause, and given that it works fine it should use the appropriate index.  It is ridiculous that the explain of these 2 queries would be different, both have the exact same value 'Scott' in the where clause.

-- drop procedure sp_test;
DELIMITER $$
CREATE DEFINER=`root`@`%` PROCEDURE `sp_test`()
    SQL SECURITY INVOKER
BEGIN 
	declare userNameParam1 varchar(100);
        declare userNameParam2 nvarchar(100);
        select 'Scott' into userNameParam1;
        select 'Scott' into userNameParam2;
        Explain select * from userTest where username =  userNameParam1;
        Explain select * from userTest where username =  userNameParam2;
                
END$$
DELIMITER ;
SELECT * FROM custom_wings.userTest;

call sp_test();
[30 Jul 2024 9:22] MySQL Verification Team
Hi Mr. Scott,

MySQL is strictly following SQL and Unicode standards on this issue and we can not break those.

But, you have so many other solutions for your problem, like introducers .......

Also,  you do not  really need Unicode for English strings ....
[30 Jul 2024 17:07] Scott Williams
In mysql 5.7 this stored proc returns the same explain statement for both.  In 8.039 it changed, and now one uses the index and one doesn't.

your query planner and logic for choosing an index or not, is not part of sql standard.  That is specific to mysql's implementation.  

Both statements return the same results and work, but one does not choose the correct index.  This is a bug with your query planner.
[31 Jul 2024 10:09] MySQL Verification Team
Hi Mr. Williams,

When we wrote that MySQL is following standards on this issue, was exactly what we meant.

MySQL is following standards on the issues of the character sets and collations. And those have changed between 5.7 and 8.0.

MySQL 5.7 was based on utfmb3, while 8.0 is based on utfmb4 and hence a difference.

Not a bug.
[5 Aug 2024 10:04] MySQL Verification Team
Hi Mr. Williams,

This is to inform you that we decided to verify your report as a feature request that could be implemented in the version 9.0 or higher.

Verified as a feature request.