Bug #51057 Weak code in diff_tables.inc can lead to 100% CPU consumption
Submitted: 10 Feb 2010 11:44 Modified: 3 May 2010 20:00
Reporter: Matthias Leich Email Updates:
Status: Patch approved Impact on me:
None 
Category:MySQL Server: Tests Severity:S3 (Non-critical)
Version:5.0+ OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Tags: RQG

[10 Feb 2010 11:44] Matthias Leich
Description:
Script showing the bad property of diff_tables.inc:
---------------------------------------------------
# Scenario where one or both of the assigned
# tables does not exist
let $diff_table_1= does_not_exist.does_not_exist;
let $diff_table_2= does_not_exist.does_not_exist;

--source include/diff_tables.inc

=> 1 CPU core is 100% busy
   The testcase timeout (default 900 s) assigned
   to MTR will stop this state later.

Weak code from diff_tables.inc when
the assigned table does not exist
------------------------------------
  let $_diff_column_index=`SELECT MAX(ordinal_position)
                           FROM information_schema.columns
                           WHERE CONCAT(table_schema, '.', table_name) =
                                 '$_diff_table'`;
  <-- MAX(ordinal_position) is NULL.
      The content of $_diff_column_index is an
      empty string. (*)
  let $_diff_column_list=$_diff_column_index;
  dec $_diff_column_index;
  <-- Bad surprise:
      The the content of $_diff_column_index
      is now -1.    (*)
  while ($_diff_column_index) {
    let $_diff_column_list=$_diff_column_index, $_diff_column_list;
    dec $_diff_column_index;
  }
  <-- while ($_diff_column_index) is TRUE as long
      as $_diff_column_index <> 0, but we will never
      reach 0
(*) IMHO this is unfortunate design of mysqltest.
    - In case the query has NULL as result than
      the $variable should contain the string NULL.
    - dec $variable should throw an error and
      abort test execution in case the content of
      the $variable is an empty string or any
      string <> integer number.
We might get a similar bad effect in case the
information_schema related server code or the
optimizer is from whatever reason broken.

Where and how often is this behaviour of
diff_tables.inc fatal:
----------------------------------------
Regression tests:
   It is rather unlikely that some unfortunate
   modification of a test or the server causes
   that a table does not exist.
   From this point of view diff_tables.inc is
   a bit weak but this could maybe ignored.
RQG tests for replication with 1 thread which
show a difference between content of master
and slave at their end:
   The usual procedure is to convert such
   a test into a MTR test which sources
   diff_tables.inc at its end and to start
   the automatic simplification.
   Here the weakness becomes fatal.
   A test with an initial size of 100000 lines
   requires often between 100 - 300 
   simplification attempts and it is very likely
   that several of these attempts cause that
   some table does not exist. 

How to repeat:
See above

Suggested fix:
I suggest to fix in the following way:
1. Modify diff_tables.inc independent of what
   happens with mysqltest
   Reason:
      - Changes in mysqltest will not help in case
        the SELECT on the information_schema
        provides the wrong result 0.
      - It is unclear if (!impact on existing tests!)
        and when mysqltest will be improved.
2. Add a check which leads to abort of execution in
   case of SELECT '$_diff_column_index' <= 0
3. Maybe omit to fix this bug in 5.0
[10 Feb 2010 12:05] Matthias Leich
Proposed fix:
-------------
diff include/diff_tables.inc include/diff_tables_new.inc
96a97,105
>   # Avoid Bug#51057 Weak code in diff_tables.inc can lead to 100% CPU consumption
>   if (`SELECT '$_diff_column_index' <= 0`)
>   {
>      --echo # ERROR in include/diff_tables.inc
>      --echo # '\$_diff_column_index' =  '$_diff_column_index' <= 0
>      --echo # Maybe the table $_diff_table does not exist.
>      --echo # abort
>      exit;
>   }
[10 Feb 2010 13:10] Matthias Leich
Good hint by Philip:
--------------------
"--die" leads to nice output in window where the test
was started.

My test:
--------
"--die" replaces "exit;"
...
main.ml003                               [ fail ]
        Test ended at 2010-02-10 14:08:49

CURRENT_TEST: main.ml003
mysqltest: In included file "./include/diff_tables_new.inc": At line 104:

The result from queries just before the failure was:
Comparing tables does_not_exist.does_not_exist and does_not_exist.does_not_exist
# ERROR in include/diff_tables.inc
# '$_diff_column_index' =  '' <= 0
# Maybe the table does_not_exist.does_not_exist does not exist.
# abort
[12 Feb 2010 11:24] Matthias Leich
Additional weak part which should be fixed:
   Ensure that in case of replication master and
   slave connection have the same @@session.time_zone
   value when selecting form the table.
[30 Apr 2010 15:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107109

3362 Matthias Leich	2010-04-29
      Fix for Bug#51057 Weak code in diff_tables.inc can lead to 100% CPU consumption
[30 Apr 2010 15:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107112

3362 Matthias Leich	2010-04-30
      Fix for Bug#51057 Weak code in diff_tables.inc can lead to 100% CPU consumption
[30 Apr 2010 15:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107113

3362 Matthias Leich	2010-04-30
      Fix for Bug#51057 Weak code in diff_tables.inc can lead to 100% CPU consumption
[3 May 2010 13:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107200

3363 Matthias Leich	2010-05-03
      Fix for Bug#51057 Weak code in diff_tables.inc can lead to 100% CPU consumption
      1. Add check if the tables to be compared exist
      2. Add check which ensures that we never go into the while loop
         with a value which is obvious wrong or might cause endless looping.
         (Assume the select on the information_schema gives from whatever
         reason a dangerous value)
      3. Minor beautification
[3 May 2010 20:00] Patrick Crews
Tested with the provided test case as well as with existing MTR tests using diff_tables.  Patch fixes bug and has no discernible impact on current tests.
[10 May 2010 14:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107865

3377 Matthias Leich	2010-05-10 [merge]
      Merge fix for Bug#51057 into actual tree
[10 May 2010 14:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107871

3377 Matthias Leich	2010-05-10
      Fix for Bug#51057 Weak code in diff_tables.inc can lead to 100% CPU consumption
      1. Add check if the tables to be compared exist
      2. Add check which ensures that we never go into the while loop
         with a value which is obvious wrong or might cause endless looping.
         (Assume the select on the information_schema gives from whatever
         reason a dangerous value)
      3. Minor beautification
[20 May 2010 10:46] Matthias Leich
Patch is queued to mysql-pe.
[28 May 2010 6:19] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:alik@sun.com-20100524190409-5w4l7mje1wk1c90l) (merge vers: 6.0.14-alpha) (pib:16)