Bug #24484 | Aggregate function used in column list subquery gives erroneous error | ||
---|---|---|---|
Submitted: | 21 Nov 2006 20:27 | Modified: | 7 Apr 2007 19:10 |
Reporter: | Harrison Fisk | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.0.28 | OS: | MacOS (Mac OS X) |
Assigned to: | Georgi Kodinov | CPU Architecture: | Any |
Tags: | bfsm_2006_12_07 |
[21 Nov 2006 20:27]
Harrison Fisk
[2 Feb 2007 17:30]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/19260 ChangeSet@1.2400, 2007-02-02 19:29:03+02:00, gkodinov@macbook.gmz +5 -0 Bug #24484: Aggregate function used in column list subquery gives erroneous error Most of the aggregate functions depend on the number of rows they process, even if their arguments are constants (e.g. SUM(1) cannot be calculated without knowing the number of rows that it is called for). There are however two exceptions : MIN and MAX. These depend only on their arguments : so if they are constant, them MIN and MAX can be considered constants. In that respect MIN and MAX are just as any other scalar SQL function. However they were erroneously treated as the other aggregates by the optimizer. Fixed by introducing in Item_sum_hybrid (the superclass of MIN and MAX) the same calculation for used_tables as for Item_func. One other flaw was revealed and fixed in the process : references were not calling the recalculation method for used_tables of their targets.
[6 Feb 2007 14:29]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/19398 ChangeSet@1.2400, 2007-02-06 16:29:05+02:00, gkodinov@macbook.gmz +6 -0 Bug #24484: Most of the aggregate functions depend on the number of rows they process, even if their arguments are constants (e.g. SUM(1) cannot be calculated without knowing the number of rows that it is called for). There are however two exceptions : MIN and MAX. These depend only on their arguments : so if they are constant, them MIN and MAX can be considered constants. In that respect MIN and MAX are just as any other scalar SQL function. However they were erroneously treated as the other aggregates by the optimizer. Fixed by introducing in Item_sum_hybrid (the superclass of MIN and MAX) the same calculation for used_tables as for Item_func. One other flaw was revealed and fixed in the process : references were not calling the recalculation method for used_tables of their targets. Also used_tables() for other aggregates now excludes the special "non-table" bits (like OUTER_REF_TABLE_BIT for example).
[8 Feb 2007 16:02]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/19570 ChangeSet@1.2400, 2007-02-08 18:02:10+02:00, gkodinov@macbook.gmz +6 -0 Bug #24484: To correctly decide which predicates can be evaluated with a given table the optimizer must know the exact set of tables that a predicate depends on. If that mask is too wide (refer to non-existing tables) the optimizer can erroneously skip a predicate. One such case of wrong table usage mask were the aggregate functions. The have a all-1 mask (meaning depend on all tables, including non-existent ones). Fixed by making a real used_tables mask for the aggregates. The mask is constructed in the following way : 1. OR the table dependency masks of all the arguments of the aggregate. 2. If all the arguments of the function are from the local name resolution context; it's not a MAX/MIN and it is evaluated in the same name resolution context where it is referenced all the tables from that name resolution context are OR-ed to the dependency mask. This is to denote that an aggregate function depends on the number of rows it processes. 3. Handle correctly the case of an aggregate function optimization (such that the aggregate function can be pre-calculated and made a constant). Made sure that an aggregate function is never a constant (unless subject of a specific optimization and pre-calculation). One other flaw was revealed and fixed in the process : references were not calling the recalculation method for used_tables of their targets.
[23 Feb 2007 10:32]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/20446 ChangeSet@1.2400, 2007-02-23 12:31:35+02:00, gkodinov@macbook.gmz +6 -0 Bug #24484: To correctly decide which predicates can be evaluated with a given table the optimizer must know the exact set of tables that a predicate depends on. If that mask is too wide (refer to non-existing tables) the optimizer can erroneously skip a predicate. One such case of wrong table usage mask were the aggregate functions. The have a all-1 mask (meaning depend on all tables, including non-existent ones). Fixed by making a real used_tables mask for the aggregates. The mask is constructed in the following way : 1. OR the table dependency masks of all the arguments of the aggregate. 2. If all the arguments of the function are from the local name resolution context; it's not a MAX/MIN and it is evaluated in the same name resolution context where it is referenced all the tables from that name resolution context are OR-ed to the dependency mask. This is to denote that an aggregate function depends on the number of rows it processes. 3. Handle correctly the case of an aggregate function optimization (such that the aggregate function can be pre-calculated and made a constant). Made sure that an aggregate function is never a constant (unless subject of a specific optimization and pre-calculation). One other flaw was revealed and fixed in the process : references were not calling the recalculation method for used_tables of their targets.
[9 Mar 2007 17:42]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/21625 ChangeSet@1.2400, 2007-03-09 17:57:51+02:00, gkodinov@macbook.gmz +12 -0 Bug #24484: To correctly decide which predicates can be evaluated with a given table the optimizer must know the exact set of tables that a predicate depends on. If that mask is too wide (refer to non-existing tables) the optimizer can erroneously skip a predicate. One such case of wrong table usage mask were the aggregate functions. The have a all-1 mask (meaning depend on all tables, including non-existent ones). Fixed by making a real used_tables mask for the aggregates. The mask is constructed in the following way : 1. OR the table dependency masks of all the arguments of the aggregate. 2. If all the arguments of the function are from the local name resolution context; it's not a MAX/MIN and it is evaluated in the same name resolution context where it is referenced all the tables from that name resolution context are OR-ed to the dependency mask. This is to denote that an aggregate function depends on the number of rows it processes. 3. Handle correctly the case of an aggregate function optimization (such that the aggregate function can be pre-calculated and made a constant). Made sure that an aggregate function is never a constant (unless subject of a specific optimization and pre-calculation). One other flaw was revealed and fixed in the process : references were not calling the recalculation method for used_tables of their targets.
[13 Mar 2007 16:42]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/21822 ChangeSet@1.2400, 2007-03-13 16:44:19+02:00, gkodinov@macbook.gmz +10 -0 Bug #24484: To correctly decide which predicates can be evaluated with a given table the optimizer must know the exact set of tables that a predicate depends on. If that mask is too wide (refer to non-existing tables) the optimizer can erroneously skip a predicate. One such case of wrong table usage mask were the aggregate functions. The have a all-1 mask (meaning depend on all tables, including non-existent ones). Fixed by making a real used_tables mask for the aggregates. The mask is constructed in the following way : 1. OR the table dependency masks of all the arguments of the aggregate. 2. If all the arguments of the function are from the local name resolution context and it is evaluated in the same name resolution context where it is referenced all the tables from that name resolution context are OR-ed to the dependency mask. This is to denote that an aggregate function depends on the number of rows it processes. 3. Handle correctly the case of an aggregate function optimization (such that the aggregate function can be pre-calculated and made a constant). Made sure that an aggregate function is never a constant (unless subject of a specific optimization and pre-calculation). One other flaw was revealed and fixed in the process : references were not calling the recalculation method for used_tables of their targets.
[14 Mar 2007 17:13]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/21909 ChangeSet@1.2400, 2007-03-14 15:52:27+02:00, gkodinov@macbook.gmz +9 -0 Bug #24484: To correctly decide which predicates can be evaluated with a given table the optimizer must know the exact set of tables that a predicate depends on. If that mask is too wide (refer to non-existing tables) the optimizer can erroneously skip a predicate. One such case of wrong table usage mask were the aggregate functions. The have a all-1 mask (meaning depend on all tables, including non-existent ones). Fixed by making a real used_tables mask for the aggregates. The mask is constructed in the following way : 1. OR the table dependency masks of all the arguments of the aggregate. 2. If all the arguments of the function are from the local name resolution context and it is evaluated in the same name resolution context where it is referenced all the tables from that name resolution context are OR-ed to the dependency mask. This is to denote that an aggregate function depends on the number of rows it processes. 3. Handle correctly the case of an aggregate function optimization (such that the aggregate function can be pre-calculated and made a constant). Made sure that an aggregate function is never a constant (unless subject of a specific optimization and pre-calculation). One other flaw was revealed and fixed in the process : references were not calling the recalculation method for used_tables of their targets.
[16 Mar 2007 8:27]
Georgi Kodinov
Moving back to in-progress due to merge problems.
[20 Mar 2007 17:46]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/22395 ChangeSet@1.2400, 2007-03-20 19:46:02+02:00, gkodinov@macbook.local +10 -0 Bug #24484: To correctly decide which predicates can be evaluated with a given table the optimizer must know the exact set of tables that a predicate depends on. If that mask is too wide (refer to non-existing tables) the optimizer can erroneously skip a predicate. One such case of wrong table usage mask were the aggregate functions. The have a all-1 mask (meaning depend on all tables, including non-existent ones). Fixed by making a real used_tables mask for the aggregates. The mask is constructed in the following way : 1. OR the table dependency masks of all the arguments of the aggregate. 2. If all the arguments of the function are from the local name resolution context and it is evaluated in the same name resolution context where it is referenced all the tables from that name resolution context are OR-ed to the dependency mask. This is to denote that an aggregate function depends on the number of rows it processes. 3. Handle correctly the case of an aggregate function optimization (such that the aggregate function can be pre-calculated and made a constant). Made sure that an aggregate function is never a constant (unless subject of a specific optimization and pre-calculation). One other flaw was revealed and fixed in the process : references were not calling the recalculation method for used_tables of their targets.
[23 Mar 2007 13:59]
Alexey Botchkov
Pushed in 5.0.40 and 5.1.18
[7 Apr 2007 19:10]
Paul DuBois
Noted in 5.0.40, 5.1.18 changelogs. A problem in handling of aggregate functions in subqueries caused predicates containing aggregate functions to be ignored during query execution.