Bug #86187 WHERE IN not mapped with Entity Framework Core
Submitted: 4 May 2017 19:30 Modified: 31 May 2017 13:57
Reporter: Maxime Beaudry Email Updates:
Status: Verified Impact on me:
None 
Category:Connector / NET Severity:S3 (Non-critical)
Version:7.0.7-m61 OS:Windows
Assigned to: CPU Architecture:Any

[4 May 2017 19:30] Maxime Beaudry
Description:
Using MySql.Data.EntityFrameworkCore 7.0.7-m61, when I write a LINQ query that intends to generates "WHERE IN" on a nullable column, it's pretty easy to write code that results in pulling all rows out of the database and then applying the filter in memory.  

Look at the example below. You will see that one of the query pulls all the rows of the table in memory and applies the filter in memory.

It's not the first time that I get in a situation where MySql.Data.EntityFrameworkCore fails to analyze the expression tree that I code. As a result, it pulls way too much data into memory and leads to very poor performance. 

==> Would there be a way to be notified (warning, exception, message) when the EF adapter fails to parse the expression tree and decides to apply a filter in memory? This could greatly help in minimizing the number of issues in production.

How to repeat:
** Note that the code is available at https://github.com/mabead/Bug.MySqlEntityFrameworkCoreWhereIn

1) Run the following script to setup the bd: 
--------------------------------------------

CREATE SCHEMA bugdemo;
USE bugdemo;

CREATE TABLE Demos (
  `Key`						INT NOT NULL AUTO_INCREMENT,
  Id						INT,
   PRIMARY KEY (`Key`)
);

INSERT INTO Demos (Id) VALUES (1),(2),(3),(5),(8);

SELECT * FROM Demos;
SELECT * FROM Demos WHERE Id IN (2, 5, 200);

2) Compile and run the following code:
--------------------------------------

using System;
using System.Linq;
using Microsoft.EntityFrameworkCore;
using MySQL.Data.EntityFrameworkCore.Extensions;

namespace MyConsole
{
    public class Demo
    {
        public int Key { get; set; }
        public int? Id { get; set; }
    }

    public class Context : DbContext
    {
        public Context(DbContextOptions<Context> options) : base(options)
        {
        }

        public DbSet<Demo> Demos { get; set; }

    }

    class Program
    {
        static void Main(string[] args)
        {
            var builder = new DbContextOptionsBuilder<Context>()
                .UseMySQL("server=localhost;user=root;password=root;Database=bugdemo;port=3306;SslMode=None;default command timeout=1800");

            var context = new Context(builder.Options);

            var alldemos = context.Demos.ToList();
            Console.WriteLine($"table contains {alldemos.Count} rows");

            // Will generate the following query:
            //
            //   SELECT `x`.`Id`, `x`.`Key` FROM `Demos` AS `x` WHERE `x`.`Id` IN (2, 5, 200)
            //
            var nullableIds = new int?[] { 2, 5, 200 };
            var whereInResult1 = context.Demos.Where(x => nullableIds.Contains(x.Id)).ToList();
            Console.WriteLine($"where in 1 returns {whereInResult1.Count} rows");

            // Will generate the following query that lacks the 'WHERE IN':
            //
            //   SELECT `x`.`Id`, `x`.`Key` FROM `Demos` AS `x`
            //
            var nonNullableIds = new int[] { 2, 5, 200 };
            var whereInResult2 = context.Demos.Where(x => nonNullableIds.Contains(x.Id.Value)).ToList();
            Console.WriteLine($"where in 2 returns {whereInResult2.Count} rows");
        }
    }
}
 

3) You will see that the second query applies the filter in memory and not in the database.

Suggested fix:

Issue a warning that can be enabled / disabled through the standard warning mechanism when a filter is applied in memory. Ex:

            var builder = new DbContextOptionsBuilder<DistributionContext>()
                .UseInMemoryDatabase("foo" + Guid.NewGuid())
                .ConfigureWarnings(w => w.Ignore(InMemoryEventId.TransactionIgnoredWarning));
[8 May 2017 9:39] Chiranjeevi Battula
Hello  Maxime Beaudry,

Thank you for the bug report.
I could not repeat the issue at our end using with Visual Studio 2015, Connector/NET 7.0.7-m61 version.
Could you please provide repeatable test case (exact steps/sample project, screenshot etc. - please make it as private if you prefer) to confirm this issue at our end?

Thanks,
Chiranjeevi.
[8 May 2017 9:39] Chiranjeevi Battula
Screenshot

Attachment: 86187.png (image/png, text), 99.44 KiB.

[8 May 2017 13:07] Maxime Beaudry
Maybe I didn't explain properly. There is no functional problem. There is a performance problem. All the queries return the expected object. But if you enable the MySql general logs, you will see the following queries:

==> this is generated by context.Demos.ToList();

		    4 Query	SELECT `d`.`Id`, `d`.`Key`
FROM `Demos` AS `d`

==> this is generated by context.Demos.Where(x => nullableIds.Contains(x.Id)).ToList()
==> It generates the SQL that I want.

170508  9:04:50	    4 Init DB	bugdemo
		    4 Query	SELECT `x`.`Id`, `x`.`Key`
FROM `Demos` AS `x`
WHERE `x`.`Id` IN (2, 5, 200)

==> This is generated by context.Demos.Where(x => nonNullableIds.Contains(x.Id.Value)).ToList()
==> It generates a sub-optimal query since the filtering is done in memory. This is where I would 
==> like to have the MySql driver generate a warning / error.
170508  9:04:55	    4 Init DB	bugdemo
		    4 Query	SELECT `x`.`Id`, `x`.`Key`
FROM `Demos` AS `x`

Can you confirm that you see these entries in your MySql general logs?
[9 May 2017 5:32] Chiranjeevi Battula
Hello  Maxime Beaudry,

Thank you for the feedback.
Verified this behavior on Visual Studio 2015 (C#.Net) and Connector/NET 7.0.7-m61 version.

Thanks,
Chiranjeevi.
[9 May 2017 5:34] Chiranjeevi Battula
170509  6:34:16	    5 Connect	root@localhost on test
		    5 Query	SHOW VARIABLES
		    5 Query	SELECT TIMEDIFF(NOW(), UTC_TIMESTAMP())
		    5 Query	SHOW COLLATION
		    5 Query	SET character_set_results=NULL
		    5 Init DB	test
		    5 Query	SELECT `d`.`Id`, `d`.`Key`
FROM `Demos` AS `d`
170509  6:34:19	    5 Init DB	test
		    5 Query	SELECT `x`.`Id`, `x`.`Key`
FROM `Demos` AS `x`
WHERE `x`.`Id` IN (2, 5, 200)
170509  6:34:21	    4 Query	SHOW GLOBAL STATUS
170509  6:34:24	    5 Init DB	test
		    5 Query	SELECT `x`.`Id`, `x`.`Key`
FROM `Demos` AS `x`
170509  6:34:32	    5 Quit
[31 May 2017 13:57] Maxime Beaudry
Any update?