Bug #30237 | Performance regression in boolean expressions | ||
---|---|---|---|
Submitted: | 3 Aug 2007 22:10 | Modified: | 9 Oct 2007 17:07 |
Reporter: | Marc ALFF | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Parser | Severity: | S5 (Performance) |
Version: | 5.0 | OS: | Any |
Assigned to: | Marc ALFF | CPU Architecture: | Any |
[3 Aug 2007 22:10]
Marc ALFF
[8 Aug 2007 21:48]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/32275 ChangeSet@1.2577, 2007-08-08 15:48:01-06:00, malff@weblab.(none) +5 -0 Bug#30237 (Performance regression in boolean expressions) NOT A FULL PATCH -- tests scripts used with mysqlslap (only available in 5.1) For reference only, not to push.
[8 Aug 2007 21:54]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/32276 ChangeSet@1.2491, 2007-08-08 15:54:52-06:00, malff@weblab.(none) +2 -0 Bug#30237 (Performance regression in boolean expressions) This is a performance bug, related to the parsing or 'OR' and 'AND' boolean expressions. Let N be the number of expressions involved in a OR (respectively AND). When N=1 For example, "select 1" involve only 1 term: there is no OR operator. In 4.0 and 4.1, parsing expressions not involving OR had no overhead. In 5.0, parsing adds some overhead, with Select->expr_list. With this patch, the overhead introduced in 5.0 has been removed, so that performances for N=1 should be identical to the 4.0 performances, which are optimal (there is no code executed at all) The overhead in 5.0 was in fact affecting significantly some operations. For example, loading 1 Million rows into a table with INSERTs, for a table that has 100 columns, leads to parsing 100 Millions of expressions, which means that the overhead related to Select->expr_list is executed 100 Million times ... Considering that N=1 is by far the most probable expression, this case should be optimal. When N=2 For example, "select a OR b" involves 2 terms in the OR operator. In 4.0 and 4.1, parsing expressions involving 2 terms created 1 Item_cond_or node, which is the expected result. In 5.0, parsing these expression also produced 1 node, but with some extra overhead related to Select->expr_list : creating 1 list in Select->expr_list and another in Item_cond::list is inefficient. With this patch, the overhead introduced in 5.0 has been removed so that performances for N=2 should be identical to the 4.0 performances. Note that the memory allocation uses the new (thd->mem_root) syntax directly. The cost of "is_cond_or" is estimated to be neglectable: the real problem of the performance degradation comes from unneeded memory allocations. When N>=3 For example, "select a OR b OR c ...", which involves 3 or more terms. In 4.0 and 4.1, the parser had no significant cost overhead, but produced an Item tree which is difficult to evaluate / optimize during runtime. In 5.0, the parser produces a better Item tree, using the Item_cond constructor that accepts a list of children directly, but at an extra cost related to Select->expr_list. With this patch, the code is implemented to take the best of the two implementations: - there is no overhead with Select->expr_list - the Item tree generated is optimized and flattened. This is achieved by adding children nodes into the Item tree directly, with Item_cond::add(), which avoids the need for temporary lists and memory allocation Note that this patch also provide an extra optimization, that the previous code in 5.0 did not provide: expressions are flattened in the Item tree, based on what the expression already parsed is, and not based on the order in which rules are reduced. For example : "(a OR b) OR c", "a OR (b OR c)" would both be represented with 2 Item_cond_or nodes before this patch, and with 1 node only with this patch. The logic used is based on the mathematical properties of the OR operator (it's associative), and produces a simpler tree. Performance tests: Tests have been executed for N=1, that show an improvement. When executing repeatedly INSERT <100 values> into <blackhole table>, the timing is as follows (time mysqlslap ...) See the bug report for 30327 for the scripts used, these are not part of the patch. 100,000 inserts of 100 values (10 Million expressions) Before real 0m57.767s user 0m3.088s sys 0m3.188s After real 0m50.722s user 0m3.332s sys 0m3.352s 1,000,000 inserts of 100 values (100 Million expressions) Before real 9m14.565s user 0m33.390s sys 0m35.194s After real 8m30.286s user 0m33.410s sys 0m35.438s Performance measures for N=2 or N>=3 are not available. The code is expected to be faster (per code analysis). Review note: The grammar contains recursive rules around <expr>, which causes shift/reduce conflicts. The rules added for this patch are conflict free (verified independently), but existing ambiguities cause new S/R conflicts to be inferred, as is currently the case for all rules below <expr> (see sql_yacc.output). This patch is safe for the grammar. The grammar should be cleaned up independently of this change.
[21 Aug 2007 18:04]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/32825 ChangeSet@1.2491, 2007-08-21 12:03:44-06:00, malff@weblab.(none) +4 -0 Bug#30237 (Performance regression in boolean expressions) This is a performance bug, related to the parsing or 'OR' and 'AND' boolean expressions. Let N be the number of expressions involved in a OR (respectively AND). When N=1 For example, "select 1" involve only 1 term: there is no OR operator. In 4.0 and 4.1, parsing expressions not involving OR had no overhead. In 5.0, parsing adds some overhead, with Select->expr_list. With this patch, the overhead introduced in 5.0 has been removed, so that performances for N=1 should be identical to the 4.0 performances, which are optimal (there is no code executed at all) The overhead in 5.0 was in fact affecting significantly some operations. For example, loading 1 Million rows into a table with INSERTs, for a table that has 100 columns, leads to parsing 100 Millions of expressions, which means that the overhead related to Select->expr_list is executed 100 Million times ... Considering that N=1 is by far the most probable expression, this case should be optimal. When N=2 For example, "select a OR b" involves 2 terms in the OR operator. In 4.0 and 4.1, parsing expressions involving 2 terms created 1 Item_cond_or node, which is the expected result. In 5.0, parsing these expression also produced 1 node, but with some extra overhead related to Select->expr_list : creating 1 list in Select->expr_list and another in Item_cond::list is inefficient. With this patch, the overhead introduced in 5.0 has been removed so that performances for N=2 should be identical to the 4.0 performances. Note that the memory allocation uses the new (thd->mem_root) syntax directly. The cost of "is_cond_or" is estimated to be neglectable: the real problem of the performance degradation comes from unneeded memory allocations. When N>=3 For example, "select a OR b OR c ...", which involves 3 or more terms. In 4.0 and 4.1, the parser had no significant cost overhead, but produced an Item tree which is difficult to evaluate / optimize during runtime. In 5.0, the parser produces a better Item tree, using the Item_cond constructor that accepts a list of children directly, but at an extra cost related to Select->expr_list. With this patch, the code is implemented to take the best of the two implementations: - there is no overhead with Select->expr_list - the Item tree generated is optimized and flattened. This is achieved by adding children nodes into the Item tree directly, with Item_cond::add(), which avoids the need for temporary lists and memory allocation Note that this patch also provide an extra optimization, that the previous code in 5.0 did not provide: expressions are flattened in the Item tree, based on what the expression already parsed is, and not based on the order in which rules are reduced. For example : "(a OR b) OR c", "a OR (b OR c)" would both be represented with 2 Item_cond_or nodes before this patch, and with 1 node only with this patch. The logic used is based on the mathematical properties of the OR operator (it's associative), and produces a simpler tree. Performance tests: Tests have been executed for N=1, that show an improvement. When executing repeatedly INSERT <100 values> into <blackhole table>, the timing is as follows (time mysqlslap ...) See the bug report for 30327 for the scripts used, these are not part of the patch. 100,000 inserts of 100 values (10 Million expressions) Before real 0m57.767s user 0m3.088s sys 0m3.188s After real 0m50.722s user 0m3.332s sys 0m3.352s 1,000,000 inserts of 100 values (100 Million expressions) Before real 9m14.565s user 0m33.390s sys 0m35.194s After real 8m30.286s user 0m33.410s sys 0m35.438s Performance measures for N=2 or N>=3 are not available. The code is expected to be faster (per code analysis). The precedence of INTERVAL_SYM expr has been explicitely defined, to resolve shift/reduce conflicts in the grammar around interval_expr.
[22 Aug 2007 16:35]
Konstantin Osipov
Reviewed over email.
[22 Aug 2007 17:44]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/32902 ChangeSet@1.2491, 2007-08-22 11:04:10-06:00, malff@weblab.(none) +4 -0 Bug#30237 (Performance regression in boolean expressions) This is a performance bug, related to the parsing or 'OR' and 'AND' boolean expressions. Let N be the number of expressions involved in a OR (respectively AND). When N=1 For example, "select 1" involve only 1 term: there is no OR operator. In 4.0 and 4.1, parsing expressions not involving OR had no overhead. In 5.0, parsing adds some overhead, with Select->expr_list. With this patch, the overhead introduced in 5.0 has been removed, so that performances for N=1 should be identical to the 4.0 performances, which are optimal (there is no code executed at all) The overhead in 5.0 was in fact affecting significantly some operations. For example, loading 1 Million rows into a table with INSERTs, for a table that has 100 columns, leads to parsing 100 Millions of expressions, which means that the overhead related to Select->expr_list is executed 100 Million times ... Considering that N=1 is by far the most probable expression, this case should be optimal. When N=2 For example, "select a OR b" involves 2 terms in the OR operator. In 4.0 and 4.1, parsing expressions involving 2 terms created 1 Item_cond_or node, which is the expected result. In 5.0, parsing these expression also produced 1 node, but with some extra overhead related to Select->expr_list : creating 1 list in Select->expr_list and another in Item_cond::list is inefficient. With this patch, the overhead introduced in 5.0 has been removed so that performances for N=2 should be identical to the 4.0 performances. Note that the memory allocation uses the new (thd->mem_root) syntax directly. The cost of "is_cond_or" is estimated to be neglectable: the real problem of the performance degradation comes from unneeded memory allocations. When N>=3 For example, "select a OR b OR c ...", which involves 3 or more terms. In 4.0 and 4.1, the parser had no significant cost overhead, but produced an Item tree which is difficult to evaluate / optimize during runtime. In 5.0, the parser produces a better Item tree, using the Item_cond constructor that accepts a list of children directly, but at an extra cost related to Select->expr_list. With this patch, the code is implemented to take the best of the two implementations: - there is no overhead with Select->expr_list - the Item tree generated is optimized and flattened. This is achieved by adding children nodes into the Item tree directly, with Item_cond::add(), which avoids the need for temporary lists and memory allocation Note that this patch also provide an extra optimization, that the previous code in 5.0 did not provide: expressions are flattened in the Item tree, based on what the expression already parsed is, and not based on the order in which rules are reduced. For example : "(a OR b) OR c", "a OR (b OR c)" would both be represented with 2 Item_cond_or nodes before this patch, and with 1 node only with this patch. The logic used is based on the mathematical properties of the OR operator (it's associative), and produces a simpler tree. Performance tests: Tests have been executed for N=1, that show an improvement. When executing repeatedly INSERT <100 values> into <blackhole table>, the timing is as follows (time mysqlslap ...) See the bug report for 30327 for the scripts used, these are not part of the patch. 100,000 inserts of 100 values (10 Million expressions) Before real 0m57.767s user 0m3.088s sys 0m3.188s After real 0m50.722s user 0m3.332s sys 0m3.352s 1,000,000 inserts of 100 values (100 Million expressions) Before real 9m14.565s user 0m33.390s sys 0m35.194s After real 8m30.286s user 0m33.410s sys 0m35.438s Performance measures for N=2 or N>=3 are not available. The code is expected to be faster (per code analysis). The precedence of INTERVAL_SYM expr has been explicitely defined, to resolve shift/reduce conflicts in the grammar around interval_expr.
[22 Aug 2007 17:44]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/32903 ChangeSet@1.2491, 2007-08-22 11:05:35-06:00, malff@weblab.(none) +4 -0 Bug#30237 (Performance regression in boolean expressions) This is a performance bug, related to the parsing or 'OR' and 'AND' boolean expressions. Let N be the number of expressions involved in a OR (respectively AND). When N=1 For example, "select 1" involve only 1 term: there is no OR operator. In 4.0 and 4.1, parsing expressions not involving OR had no overhead. In 5.0, parsing adds some overhead, with Select->expr_list. With this patch, the overhead introduced in 5.0 has been removed, so that performances for N=1 should be identical to the 4.0 performances, which are optimal (there is no code executed at all) The overhead in 5.0 was in fact affecting significantly some operations. For example, loading 1 Million rows into a table with INSERTs, for a table that has 100 columns, leads to parsing 100 Millions of expressions, which means that the overhead related to Select->expr_list is executed 100 Million times ... Considering that N=1 is by far the most probable expression, this case should be optimal. When N=2 For example, "select a OR b" involves 2 terms in the OR operator. In 4.0 and 4.1, parsing expressions involving 2 terms created 1 Item_cond_or node, which is the expected result. In 5.0, parsing these expression also produced 1 node, but with some extra overhead related to Select->expr_list : creating 1 list in Select->expr_list and another in Item_cond::list is inefficient. With this patch, the overhead introduced in 5.0 has been removed so that performances for N=2 should be identical to the 4.0 performances. Note that the memory allocation uses the new (thd->mem_root) syntax directly. The cost of "is_cond_or" is estimated to be neglectable: the real problem of the performance degradation comes from unneeded memory allocations. When N>=3 For example, "select a OR b OR c ...", which involves 3 or more terms. In 4.0 and 4.1, the parser had no significant cost overhead, but produced an Item tree which is difficult to evaluate / optimize during runtime. In 5.0, the parser produces a better Item tree, using the Item_cond constructor that accepts a list of children directly, but at an extra cost related to Select->expr_list. With this patch, the code is implemented to take the best of the two implementations: - there is no overhead with Select->expr_list - the Item tree generated is optimized and flattened. This is achieved by adding children nodes into the Item tree directly, with Item_cond::add(), which avoids the need for temporary lists and memory allocation Note that this patch also provide an extra optimization, that the previous code in 5.0 did not provide: expressions are flattened in the Item tree, based on what the expression already parsed is, and not based on the order in which rules are reduced. For example : "(a OR b) OR c", "a OR (b OR c)" would both be represented with 2 Item_cond_or nodes before this patch, and with 1 node only with this patch. The logic used is based on the mathematical properties of the OR operator (it's associative), and produces a simpler tree.
[7 Sep 2007 8:09]
Bugs System
Pushed into 5.1.23-beta
[7 Sep 2007 8:10]
Bugs System
Pushed into 5.0.50
[9 Oct 2007 17:07]
Paul DuBois
Noted in 5.0.50, 5.1.23 changelogs. Server parser performance was improved for boolean expressions.