Bug #70828 UNION syntax missing required parentheses
Submitted: 6 Nov 2013 4:14 Modified: 27 Feb 2014 0:08
Reporter: Dustin Bortner Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / NET Severity:S1 (Critical)
Version:6.7.4.1 OS:Windows
Assigned to: Fernando Gonzalez.Sanchez CPU Architecture:Any

[6 Nov 2013 4:14] Dustin Bortner
Description:
MySQL queries produced by the LINQ Concat() or Union() operators are missing parentheses around the individual select statements.  This causes incorrect semantics, such as when LIMIT is applied to an individual select statement in the UNION, because if the LIMIT is applied to the last select statement and parentheses are not used to limit its scope, the MySQL parser applies it to the entire UNION result instead.  MySQL Manual (versions 5.0 - 5.7) section 13.2.9.4. UNION Syntax states:

"To apply ORDER BY or LIMIT to an individual SELECT, place the clause inside the parentheses that enclose the SELECT. ... To use an ORDER BY or LIMIT clause to sort or limit the entire UNION result, parenthesize the individual SELECT statements and place the ORDER BY or LIMIT after the last one."

It also provides examples showing parentheses around the individual select statements in the UNION.  Without these parentheses, the semantics of LINQ is not obeyed.

How to repeat:
We have the following C# example code:

var query1 = Context.SocialNetworks;
var query2 = query1.Take(0).Concat(query1);
var query3 = query1.Concat(query1.Take(0));

where Context is an Entity Framework 5.0 DbContext and SocialNetworks is a table entity. query2 returns as many rows as are in the SocialNetwork table, while query3 returns zero rows.  Since we save simply switched the order of the relations in the Concat(), we should see the same number of rows returned from both queries.  The correct semantics is that the Take() should be applied before the Concat() is applied, so this query should return all rows in the SocialNetworks table.  

We can see in the following SQL emitted by these queries, query2 happens to work correctly because the SQL parses the LIMIT as belonging to the first select statement only.  But, the LIMIT for query3 is parsed as applied to the entire UNION result. 

##### query2:

SELECT 
    `UnionAll1`.`Id` AS `C1`, `UnionAll1`.`Name` AS `C2`
FROM
    (SELECT 
        `Id`, `Name`
    FROM
        `SocialNetworks`
    LIMIT 0 UNION ALL SELECT 
        `Extent2`.`Id`, `Extent2`.`Name`
    FROM
        `SocialNetworks` AS `Extent2`) AS `UnionAll1`

##### query3:

SELECT 
    `UnionAll1`.`Id` AS `C1`, `UnionAll1`.`Name` AS `C2`
FROM
    (SELECT 
        `Extent1`.`Id`, `Extent1`.`Name`
    FROM
        `SocialNetworks` AS `Extent1` UNION ALL SELECT 
        `Id`, `Name`
    FROM
        `SocialNetworks`
    LIMIT 0) AS `UnionAll1`

Suggested fix:
Always surround the individual select statements of a UNION or UNION ALL with parentheses. Then we are guaranteed to obey the correct semantics when LIMIT is involved.
[10 Feb 2014 22:36] Fernando Gonzalez.Sanchez
Posted by developer:
 
With the fix, the limit is wrapped in parenthesis with the nearest query (so its semantics do not apply to the whole union query).

This fix will appear in Connector/NET 6.6.7, 6.7.5 & 6.8.4.
[27 Feb 2014 0:08] Daniel So
Added the following entry into the Connector/Net 6.8.4, 6.7.5, and 6.6.7 changelogs:

"When Connector/Net's SQL generator emitted code for the LINQ Union() or Concat() operator, parentheses were not applied around the individual SELECT statements. That could cause a change of meaning for the query when a Take() operator (thus a LIMIT clause in the SQL code) was applied to the last SELECT statement. With this fix, parentheses were put around individual SELECT statements, so that the LIMIT clause will not be applied to the entire result of the UNION operation."