Bug #64935 LINQ to SQL's StartsWith() and Contains() generate slow LOCATE() instead of LIKE
Submitted: 10 Apr 2012 22:53 Modified: 19 Jul 2012 23:02
Reporter: Alex K. Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / NET Severity:S3 (Non-critical)
Version:6.5.4 OS:Windows (7 x64)
Assigned to: Fernando Gonzalez.Sanchez CPU Architecture:Any
Tags: .net, connector, linq, string, wildcard

[10 Apr 2012 22:53] Alex K.
Description:
SELECT * FROM Users WHERE Firstname LIKE 'Mi%';

Will return all users with a first name starting with "Mi" (Michael, Mike, Michelangelo, ...).
What's even more important is the way "LIKE 'Mi%'" works: if the column "Firstname" has an index set, the above statement will make use of that index and NOT do a full table scan (with >1,000,000 users this is a no go for e.g. an autocomplete-function).

However, when using MySQL's .NET Connector for LINQ to SQL the following LINQ:

var results = (from user in DB.Users where user.Firstname.StartsWith("Mi") select user.Firstname).ToList();

will NOT generate the above statement but the following:

SELECT `Extent1`.`Firstname` FROM `Users` AS `Extent1` WHERE (LOCATE('Mi', `Extent1`.`Firstname`)) = 1

Which WILL result in a full table scan because MySQL LOCATE() does not make use of the index. Instead returns the location of the substring within a string, which we didn't ask for.

This is a serious (!) performance bottleneck and currently there's no way to use LINQ for this.

The same logic applies for:
var results = (from user in DB.Users where user.Firstname.Contains("Mi") select user.Firstname).ToList();

This also generates a LOCATE() instead of a "LIKE '%Mi%'".
Of course, in this case neither will make use of the index. But nevertheless, we're interested if "Mi" is a substring (thus Contains() returns bool) and not the position of the substring within the column (that's what LOCATE() does and probably should do for IndexOf() but not Contains()).

How to repeat:
Do a LINQ with a StartsWith() on an indexed VARCHAR-column like in the example above:

var results = (from user in DB.Users where user.Firstname.StartsWith("Mi") select user.Firstname).ToList();
[19 Jul 2012 23:02] John Russell
Added to changelog for 6.5.5: 

The LINQ to SQL data provider for MySQL was generating inefficient
code for the StartsWith() and Contains() methods, calling the MySQL
LOCATE() function rather than using a LIKE operator with a %
wildcard. The fix causes both methods to use the LIKE syntax,
although only StartsWith() gains a substantial performance
improvement. Queries involving the StartsWith() method can now take
advantage of an index on the corresponding column.
[17 Mar 2014 10:46] Carl Partridge
I believe that there has been a regression here, this seems to be a problem when using Connector/Net 6.8.3; should I file a new bug or can this be re-opened?
[17 Mar 2014 11:04] Carl Partridge
I have filed a new bug report here: http://bugs.mysql.com/bug.php?id=72058