Bug #30625 Performance, reduce depth for expressions
Submitted: 24 Aug 2007 15:28 Modified: 9 Oct 2007 17:05
Reporter: Marc Alff Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Parser Severity:S5 (Performance)
Version:5.0, 5.1 OS:Any
Assigned to: Marc Alff CPU Architecture:Any

[24 Aug 2007 15:28] Marc Alff
Description:
As seen during the investigation of performance issues with bug#29921,
the 5.0 code in the parser grammar contain long chains of reduces:

Reduces for an expression:
NUM_literal: NUM
literal: NUM_literal
simple_expr: literal
factor: simple_expr
term: factor
value_expr: term
bit_factor: value_expr
bit_term: bit_factor
bit_expr: bit_term
predicate: bit_expr
bool_pri: predicate
bool_test: bool_pri
bool_factor: bool_test
(reduce action @71): { Select->expr_list.push_front(new List<Item>); }
bool_and_expr: /* empty */
bool_term: bool_factor bool_and_expr
(reduce action @70): { Select->expr_list.push_front(new List<Item>); }
bool_or_expr: /* empty */
expr: bool_term bool_or_expr
--> 19 reduces

This causes the bison generated code to loop internally in the MYSQLparse()
function, and negatively affects performances.

How to repeat:
Analysis / Gprof

Suggested fix:
Reduce the depth of rules needed to reduce an expression
[28 Aug 2007 16:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/33237

ChangeSet@1.2502, 2007-08-28 10:14:35-06:00, malff@weblab.(none) +3 -0
  Bug#30625 (Performance, reduce depth for expressions)
  
  This is a performance bug, affecting in particular the bison generated code
  for the parser.
  
  Prior to this fix, the grammar used a long chain of reduces to parse an
  expression, like:
    bit_expr -> bit_term
    bit_term -> bit_factor
    bit_factor -> value_expr
    value_expr -> term
    term -> factor
  etc
  
  This chain of reduces cause the internal state automaton in the generated
  parser to execute more state transitions and more reduces, so that the
  generated MySQLParse() function would spend a lot of time looping to execute
  all the grammar reductions.
  
  With this patch, the grammar has been reorganized so that rules are more
  "flat", limiting the depth of reduces needed to parse <expr>.
  
  Tests have been written to enforce that relative priorities and properties
  of operators have not changed while changing the grammar.
  
  For a test benchmark (1 Million INSERT of 100 values),
  the time spent inside MYSQLParse() according to gprof is:
  
  5.0.48:
  127.13 seconds
  
  5.0.48 + fix for bug 30237:
  97.10 seconds
  
  5.0.48 + fix for bug 30237 + fix for this bug:
  68.96 seconds
  
  yyparse() in 4.1.24:
  48.48 seconds
[28 Aug 2007 17:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/33242

ChangeSet@1.2502, 2007-08-28 11:16:03-06:00, malff@weblab.(none) +3 -0
  Bug#30625 (Performance, reduce depth for expressions)
  
  This is a performance bug, affecting in particular the bison generated code
  for the parser.
  
  Prior to this fix, the grammar used a long chain of reduces to parse an
  expression, like:
    bit_expr -> bit_term
    bit_term -> bit_factor
    bit_factor -> value_expr
    value_expr -> term
    term -> factor
  etc
  
  This chain of reduces cause the internal state automaton in the generated
  parser to execute more state transitions and more reduces, so that the
  generated MySQLParse() function would spend a lot of time looping to execute
  all the grammar reductions.
  
  With this patch, the grammar has been reorganized so that rules are more
  "flat", limiting the depth of reduces needed to parse <expr>.
  
  Tests have been written to enforce that relative priorities and properties
  of operators have not changed while changing the grammar.
  
  See the bug report for performance data.
[5 Sep 2007 0:34] Marc Alff
Performance update

  For a test benchmark (1 Million INSERT of 100 values),
  the time spent inside MYSQLParse() according to gprof is:
  
  5.0.48:
  127.13 seconds
  
  5.0.48 + fix for bug 30237:
  97.10 seconds
  
  5.0.48 + fix for bug 30237 + fix for this bug:
  68.96 seconds
  
  yyparse() in 4.1.24:
  48.48 seconds

So, with a regression of 127.13 - 48.48 = 78.65 seconds for this test
between 4.1 and 5.0, the regression is now reduced to 68.96 - 48.48 = 20.48
seconds.

With this patch combined with bug#30237, 75 percent of the performance
degradation in MYSQLparse() has been reclaimed.
[7 Sep 2007 8:09] Bugs System
Pushed into 5.1.23-beta
[7 Sep 2007 8:10] Bugs System
Pushed into 5.0.50
[9 Oct 2007 17:05] Paul Dubois
Noted in 5.0.50, 5.1.23 changelogs.

Server parser performance was improved for expression parsing by
lowering the number of state state transitions and reductions needed.