Bug #42415 UPDATE/DELETE with LIMIT clause unsafe for SBL even with ORDER BY PK clause
Submitted: 28 Jan 2009 12:06 Modified: 10 Jan 2013 14:23
Reporter: Vemund Østgaard Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.1.30, 5.1.46 OS:Any
Assigned to: Assigned Account CPU Architecture:Any

[28 Jan 2009 12:06] Vemund Østgaard
Description:
The documentation states that any UPDATE statement with LIMIT clause is considered unsafe since the order of the rows affected is not defined: http://dev.mysql.com/doc/refman/5.1/en/replication-features-limit.html

However, if "ORDER BY PK" is used, the order of rows is defined and such a statement could be logged in statement format without any warning. 

How to repeat:
mysql> CREATE TABLE t (a INT, b INT, c INT, PRIMARY KEY(a,b));
Query OK, 0 rows affected (0.02 sec)

mysql> CREATE SQL SECURITY INVOKER VIEW v (x,y) AS SELECT a, b FROM t;
Query OK, 0 rows affected (0.00 sec)

mysql> UPDATE v SET x=x+1 ORDER BY y,x LIMIT 1;
Query OK, 0 rows affected, 1 warning (0.00 sec)
Rows matched: 0  Changed: 0  Warnings: 0

mysql> show warnings;
+---------+------+---------------------------------------------------+
| Level   | Code | Message                                           |
+---------+------+---------------------------------------------------+
| Warning | 1592 | Statement is not safe to log in statement format. |
+---------+------+---------------------------------------------------+
1 row in set (0.00 sec)
[7 Feb 2009 21:18] Mark Plomer
Yes, and not only if the Primary Key field(s) are in ORDER BY, but also if PK is in WHERE clause, then an additional LIMIT does not make the statement insecure.
Some applications like phpMyAdmin adds a "LIMIT 1" to each statement event if the PK is in WHERE clause and generates a lot of warnings.
[6 May 2009 9:28] Lars Thalmann
On the replication architecture team meeting today, 
the following was decided:
   
   DECISION:
   - For BUG#42415, we will change the warning message and make sure
     the manual has correct text about a) what safe means, b) when the
     warning is issued.
   - After this, the bug will be closed as "Won't fix"
   
   DECISION:
   - The warning message "Statement is not safe to log in statement
     format." should be changed to "The statement may not be safe to
     log in statement format".
   - Document in the manual that:
     "Note that the decision on whether a statement is safe is
     heuristic. This means that the server will switch (if format is
     MIXED) to row-based logging for statements that it cannot
     guarantee that they are safe to log in statement-based format."
[7 May 2009 6:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/73556
[19 May 2009 8:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/74458
[20 May 2009 8:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/74544
[20 May 2009 10:30] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/74576

2895 He Zhenxing	2009-05-20
      Post fix of result file after push of BUG#42415
[28 May 2009 8:18] Bugs System
Pushed into 5.1.36 (revid:joro@sun.com-20090528073639-yohsb4q1jzg7ycws) (version source revid:matthias.leich@sun.com-20090520195829-2q0xhtjqt4gc8htf) (merge vers: 5.1.36) (pib:6)
[28 May 2009 14:33] Jon Stephens
Documented bugfix in the 5.1.36 changelog as follows:

        The warning "Statement is not safe to log in statement format",
        issued in situations when it cannot be determined whether a
        statement can be written reliably to the binary log, has been
        changed to "Statement may not be safe to log in statement 
        format".

Also updated "Binary Logging Formats" with info contained in comments from Lars and others.

Set NDI status, waiting for post-5.1 push.
[28 May 2009 15:39] Jon Stephens
Commit with the documentation changes is here:http://lists.mysql.com/commits/75164
[17 Jun 2009 19:25] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090616183122-chjzbaa30qopdra9) (version source revid:matthias.leich@sun.com-20090520192627-94166m3dcjv48a6g) (merge vers: 6.0.12-alpha) (pib:11)
[24 Jun 2009 15:41] Jon Stephens
Bugfix also documented in the 5.4.4 changelog/5.4 Manual. Closed.
[12 Aug 2009 22:06] Paul DuBois
Noted in 5.4.2 changelog because next 5.4 version will be 5.4.2 and not 5.4.4.
[12 Aug 2009 22:07] Paul DuBois
Noted in 5.4.2 changelog because next 5.4 version will be 5.4.2 and not 5.4.4.
[14 Aug 2009 22:55] Paul DuBois
Ignore previous comment about 5.4.2.
[14 Aug 2009 22:57] Paul DuBois
Ignore previous comment about 5.4.2.
[26 Aug 2009 13:46] Bugs System
Pushed into 5.1.37-ndb-7.0.8 (revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[26 Aug 2009 13:46] Bugs System
Pushed into 5.1.37-ndb-6.3.27 (revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (version source revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (merge vers: 5.1.37-ndb-6.3.27) (pib:11)
[26 Aug 2009 13:48] Bugs System
Pushed into 5.1.37-ndb-6.2.19 (revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (version source revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (merge vers: 5.1.37-ndb-6.2.19) (pib:11)
[27 Aug 2009 16:33] Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:magnus.blaudd@sun.com-20090827163030-6o3kk6r2oua159hr) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[7 Oct 2009 14:35] Paul DuBois
The 5.4 fix has been pushed to 5.4.2.
[19 May 2010 6:33] Roel Van de Paar
Could it be clarified why mysqld cannot determine that UPDATE/DELETE ... ORDER BY PK ... LIMIT ... is a safe statement?

We now have a customer that has "billions of messages" in their error log for simple statements like:

100517 17:28:12 [Warning] Statement may not be safe to log in statement format. Statement: DELETE FROM abc WHERE col1 = 1 LIMIT 1000
100517 17:28:12 [Warning] Statement may not be safe to log in statement format. Statement: DELETE FROM abc WHERE col1 = 1 LIMIT 1000
100517 17:28:12 [Warning] Statement may not be safe to log in statement format. Statement: DELETE FROM xyz WHERE col1 = 1 LIMIT 1000
100517 17:28:12 [Warning] Statement may not be safe to log in statement format. Statement: DELETE FROM xyz WHERE col1 = 1 LIMIT 1000

These statements are safe if an ORDER BY clause is added, and if the table has a PK. But, mysqld does not recognize this:

--------
mysql> SHOW CREATE TABLE id6\G
*************************** 1. row ***************************
       Table: id6
Create Table: CREATE TABLE `id6` (
  `id` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> DELETE FROM id6 ORDER BY id LIMIT 2;
Query OK, 2 rows affected, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+-------+------+-------------------------------------------------------+
| Level | Code | Message                                               |
+-------+------+-------------------------------------------------------+
| Note  | 1592 | Statement may not be safe to log in statement format. |
+-------+------+-------------------------------------------------------+
1 row in set (0.00 sec)
--------

We can tell users/customers to swap to MIXED mode, but the result is that instead of their log files being filled with many messages, now the disk will be filled with data due to RBL space requirements:

======
STATEMENT based logging:
delete from a limit 2000;
19/05/2010  04:30 PM               235 log-bin.000069

ROW based logging:
delete from a limit 2000;
19/05/2010  04:31 PM            37,374 log-bin.000075
======

Also, this needs to be re-evaluated against bug #42851 , bug #49392 and bug #46265
[19 May 2010 6:53] Roel Van de Paar
In summary: this seems to affect:
UPDATE ... SET  ... ORDER BY ... LIMIT ...; where ORDER BY is on PK
DELETE ... FROM ... ORDER BY ... LIMIT ...; where ORDER BY is on PK
UPDATE ... SET  ... WHERE    ... LIMIT ...; where WHERE is on PK
DELETE ... SET  ... WHERE    ... LIMIT ...; where WHERE is on PK
[19 May 2010 7:24] Roel Van de Paar
This is where it all started: bug #34768
[19 May 2010 7:31] Roel Van de Paar
Also, ORDER BY does not fix the issue at the moment, as discussed in this bug and others. Yet, the documentation still states it does: http://dev.mysql.com/doc/refman/5.1/en/replication-features-limit.html
[19 May 2010 7:54] Roel Van de Paar
See [19 May 9:53] comment on bug #42851 as well: we are eliminating both good and bad messages from the log with the log_warnings solution.
[19 May 2010 8:13] Roel Van de Paar
Summary: if statements are not marked correctly as safe or unsafe, the result is either a full error log with warnings (which cannot be turned off correctly because doing so would also eliminate "good" warnings, ref [19 May 9:54] comment) or a disk filling with unnecessary RBL data because RBL was incorrectly chosen, ref [19 May 8:33] comment.
[9 Jun 2010 12:10] Jon Stephens
Roel's comment from 19 May 9:31 is incorrect, as the page in question states:

    Currently, when using STATEMENT mode, warnings are issued for DML 
    statements containing LIMIT even when they also have an ORDER BY 
    clause (and so are made deterministic). This is a known issue. 

This has been in the documentation since February 2009, please refer http://lists.mysql.com/commits/67122.
[6 Jul 2010 1:15] Luis Soares
See also: BUG#42415.
[11 Jul 2010 8:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/113293

3418 He Zhenxing	2010-07-11
      Bug #42415 UPDATE/DELETE with LIMIT clause unsafe for SBL even with ORDER BY PK clause
      
      Before the patch, when using STATEMENT mode, unsafe warnings were
      issued for all DML statements (INSERT...SELECT, UPDATE, DELETE)
      with LIMIT clause due to the possobility of non-deterministic result
      order, even when they also had a ORDER BY primary_key clause. In
      which case the order would be deterministic.
      
      This patch fixed the problem by checking the ORDER BY clause of the
      statement, and do not issue the warning if the result is ordered by
      the primary key (thus deterministic)
[11 Jul 2010 13:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/113297

3418 He Zhenxing	2010-07-11
      Bug #42415 UPDATE/DELETE with LIMIT clause unsafe for SBL even with ORDER BY PK clause
      
      Before the patch, when using STATEMENT mode, unsafe warnings were
      issued for all DML statements (INSERT...SELECT, UPDATE, DELETE)
      with LIMIT clause due to the possobility of non-deterministic result
      order, even when they also had a ORDER BY primary_key clause. In
      which case the order would be deterministic.
      
      This patch fixed the problem by checking the ORDER BY clause of the
      statement, and do not issue the warning if the result is ordered by
      the primary key (thus deterministic).
[13 Jul 2010 14:53] Sven Sandberg
see also BUG#55211
[26 Jul 2010 16:47] Sven Sandberg
Note that the value of MAX_SORT_LENGTH affects whether a statement is unsafe or not. Specifically, if column 'pk' is a BLOB or TEXT and there is a primary key on the first N bytes of 'pk', and MAX_SORT_LENGTH < N, then INSERT...SELECT...ORDER BY primary_key...LIMIT is not safe to replicate because the entire primary key is not taken into account by the ORDER BY clause.

==== BEGIN TEST ====
--source include/have_binlog_format_statement.inc
--source include/master-slave.inc
CREATE TABLE t1 (a TEXT, PRIMARY KEY(a(100)));
CREATE TABLE t2 (a TEXT, PRIMARY KEY(a(100)));
SET SESSION MAX_SORT_LENGTH = 4;
INSERT INTO t1 VALUES ('text number 2'), ('text number 1');
SELECT * FROM t1 ORDER BY a LIMIT 1;
--sync_slave_with_master
DELETE FROM t1;
INSERT INTO t1 VALUES ('text number 1'), ('text number 2');
SET GLOBAL MAX_SORT_LENGTH = 4;
STOP SLAVE;
START SLAVE;
--connection master
INSERT INTO t2 SELECT * FROM t1 ORDER BY a LIMIT 1;
SELECT * FROM t2;
--sync_slave_with_master
SELECT * FROM t2;
==== END TEST ====

(An additional problem is that we currently don't replicate MAX_SORT_LENGTH, but that will be addressed in WL#5135.)
[15 Oct 2010 10:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/120819

3418 He Zhenxing	2010-10-15
      Bug #42415 UPDATE/DELETE with LIMIT clause unsafe for SBL even with ORDER BY PK clause
      
      Before the patch, when using STATEMENT mode, unsafe warnings were
      issued for all DML statements (INSERT...SELECT, UPDATE, DELETE)
      with LIMIT clause due to the possobility of non-deterministic result
      order, even when they also had a ORDER BY primary_key clause. In
      which case the order would be deterministic.
      
      This patch fixed the problem by checking the ORDER BY clause of the
      statement, and do not issue the warning if the result is ordered by
      the primary key (thus deterministic).
      ******
      BUG#42415 UPDATE/DELETE with LIMIT clause unsafe for SBL even with ORDER BY PK clause
      
      UPDATE/DELETE/INSERT..SELECT with LIMIT clause were considered unsafe
      unconditionally, even if there is an ORDER BY PK or WHERE condition
      that can guarantee the result to be deterministic.
      
      The problem is fixed by implementing a algorithm to do more elaborate
      analyzed on the nature of the query to determine whether the query
      will cause uncertainty for replication or not.
      
      The statement will not be considered unsafe for following cases:
       - single table UPDATE/DELETE/INSERT..SELECT with ORDER BY
         <non null unique key>
       - single table UPDATE/DELETE/INSERT..SELECT with WHERE clause
         that include <non null unique key> = <const value>, if it is
         a multi-part key, then it must be <keypart1>=<const1> AND
         <keypart2>=<const2> ..., and this condition is ANDed with
         other conditions if there is any.
       - single table INSERT..SELECT with WHERE clause
         that include some parts of the non null unique key compare to
         const values, and the ORDER BY clause includes all the other
         key pars of the same non null unique key. for example (a,b) is
         a non null unique key, then WHERE a=<const> ORDER b will make
         the query result deterministic.
       - for INSERT..SELECT ... JOIN ..., the join condition (ON,
         USING, or WHERE) must include equations between columns of
         non null unique key of tables from both side of the join.
         For example t1 JOIN t2 USING (a,b), if (a,b) is not a non null
         unique key for both t1 and t2, then the result will be non-
         deterministic. otherwise the result can be deterministic with
         appropirate WHERE/ORDER clauses, and in this case, the same
         rule for single table above applys. But there is a difference
         for INNER JOIN with OUTER JOIN, for OUTER JOIN, only one table
         of the two JOIN tables will be used when checking the WHERE/ORDER
         conditions, it's the left table for LEFT JOIN and the right one
         for RIGHT JOIN when checking the keys. On the other hand, for
         INNER JOIN, keys from both tables can be used when checking
         the conditions. For example:
           INSERT..SELECT * FROM t1 INNER JOIN t2 USING(pk) ORDER BY nnuk LIMIT 1;
         This can be safe if nnuk is a non null unique key of either
         t1 or t2. But if we change the INNER JOIN to LEFT JOIN or
         RIGHT JOIN, then nnuk must be a non null unique key key of
         t1 (LEFT JOIN) or t2 (RIGHT JOIN) respectively.
       - If JOIN are nested, the will be handled recursively from inner
         outside.
[15 Oct 2010 10:33] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/120821

3418 He Zhenxing	2010-10-15
      BUG#42415 UPDATE/DELETE with LIMIT clause unsafe for SBL even with ORDER BY PK clause
      
      UPDATE/DELETE/INSERT..SELECT with LIMIT clause were considered unsafe
      unconditionally, even if there is an ORDER BY PK or WHERE condition
      that can guarantee the result to be deterministic.
      
      The problem is fixed by implementing a algorithm to do more elaborate
      analyzed on the nature of the query to determine whether the query
      will cause uncertainty for replication or not.
      
      The statement will not be considered unsafe for following cases:
       - single table UPDATE/DELETE/INSERT..SELECT with ORDER BY
         <non null unique key>
       - single table UPDATE/DELETE/INSERT..SELECT with WHERE clause
         that include <non null unique key> = <const value>, if it is
         a multi-part key, then it must be <keypart1>=<const1> AND
         <keypart2>=<const2> ..., and this condition is ANDed with
         other conditions if there is any.
       - single table INSERT..SELECT with WHERE clause
         that include some parts of the non null unique key compare to
         const values, and the ORDER BY clause includes all the other
         key pars of the same non null unique key. for example (a,b) is
         a non null unique key, then WHERE a=<const> ORDER b will make
         the query result deterministic.
       - for INSERT..SELECT ... JOIN ..., the join condition (ON,
         USING, or WHERE) must include equations between columns of
         non null unique key of tables from both side of the join.
         For example t1 JOIN t2 USING (a,b), if (a,b) is not a non null
         unique key for both t1 and t2, then the result will be non-
         deterministic. otherwise the result can be deterministic with
         appropirate WHERE/ORDER clauses, and in this case, the same
         rule for single table above applys. But there is a difference
         for INNER JOIN with OUTER JOIN, for OUTER JOIN, only one table
         of the two JOIN tables will be used when checking the WHERE/ORDER
         conditions, it's the left table for LEFT JOIN and the right one
         for RIGHT JOIN when checking the keys. On the other hand, for
         INNER JOIN, keys from both tables can be used when checking
         the conditions. For example:
           INSERT..SELECT * FROM t1 INNER JOIN t2 USING(pk) ORDER BY nnuk LIMIT 1;
         This can be safe if nnuk is a non null unique key of either
         t1 or t2. But if we change the INNER JOIN to LEFT JOIN or
         RIGHT JOIN, then nnuk must be a non null unique key key of
         t1 (LEFT JOIN) or t2 (RIGHT JOIN) respectively.
       - If JOIN are nested, the will be handled recursively from inner
         outside.
      ******
      
      BUG#42415 UPDATE/DELETE with LIMIT clause unsafe for SBL even with ORDER BY PK clause
      
      UPDATE/DELETE/INSERT..SELECT with LIMIT clause were considered unsafe
      unconditionally, even if there is an ORDER BY PK or WHERE condition
      that can guarantee the result to be deterministic.
      
      The problem is fixed by implementing a algorithm to do more elaborate
      analyzed on the nature of the query to determine whether the query
      will cause uncertainty for replication or not.
      
      The statement will not be considered unsafe for following cases:
       - single table UPDATE/DELETE/INSERT..SELECT with ORDER BY
         <non null unique key>
       - single table UPDATE/DELETE/INSERT..SELECT with WHERE clause
         that include <non null unique key> = <const value>, if it is
         a multi-part key, then it must be <keypart1>=<const1> AND
         <keypart2>=<const2> ..., and this condition is ANDed with
         other conditions if there is any.
       - single table INSERT..SELECT with WHERE clause
         that include some parts of the non null unique key compare to
         const values, and the ORDER BY clause includes all the other
         key pars of the same non null unique key. for example (a,b) is
         a non null unique key, then WHERE a=<const> ORDER b will make
         the query result deterministic.
       - for INSERT..SELECT ... JOIN ..., the join condition (ON,
         USING, or WHERE) must include equations between columns of
         non null unique key of tables from both side of the join.
         For example t1 JOIN t2 USING (a,b), if (a,b) is not a non null
         unique key for both t1 and t2, then the result will be non-
         deterministic. otherwise the result can be deterministic with
         appropirate WHERE/ORDER clauses, and in this case, the same
         rule for single table above applys. But there is a difference
         for INNER JOIN with OUTER JOIN, for OUTER JOIN, only one table
         of the two JOIN tables will be used when checking the WHERE/ORDER
         conditions, it's the left table for LEFT JOIN and the right one
         for RIGHT JOIN when checking the keys. On the other hand, for
         INNER JOIN, keys from both tables can be used when checking
         the conditions. For example:
           INSERT..SELECT * FROM t1 INNER JOIN t2 USING(pk) ORDER BY nnuk LIMIT 1;
         This can be safe if nnuk is a non null unique key of either
         t1 or t2. But if we change the INNER JOIN to LEFT JOIN or
         RIGHT JOIN, then nnuk must be a non null unique key key of
         t1 (LEFT JOIN) or t2 (RIGHT JOIN) respectively.
       - If JOIN are nested, the will be handled recursively from inner
         outside.
[29 Oct 2010 15:06] Sven Sandberg
Zhenxing found that we can do better: if a table is deterministically ordered, then all columns that appear in some NNUK in that table are in fact O-columns. Moreover, if the table is deterministically ordered because some NNUK covers *only* U-columns, then *all* columns of the table are U-columns. So, e.g.:

t1(a, b, PK(a), PK(b))
t2(c, PK(c))
t1 JOIN t2 ON b = c ORDER BY a
This is deterministic: (1) a is an O-column because of ORDER BY. (2) t1 is deterministic because PK(a) contains only O-columns. (3) b is an O-column because t1 is deterministic and b is part of a key. (4) c is an O-column because of the ON clause. (5) t2 is deterministic because PK(c) contains only O-columns.

t3(d, e, PK(d))
t4(f, PK(f))
t3 JOIN t4 ON e = f WHERE d = 1
This is deterministic: (1) d is a U-column because of WHERE d = 1. (2) t1 is deterministic because PK(d) contains only U-columns. (3) e is a U-column because t1 has a key where all columns are U-columns. (4) f is a U-column because of the ON clause. (5) t2 is deterministic because PK(f) contains only U-columns.
[29 Oct 2010 15:20] Sven Sandberg
We can base an algorithm on marking columns, keys, and tables, and propagating markers according to rules.

Possible markers:
 M1. columns can be marked as O-columns and/or U-columns
 M2. keys can be marked as O-keys and/or U-keys
 M3. tables can be marked as O-table and/or U-tables

Rules. All rules are applied as many times as possible.
 R1. for each "WHERE col = const" or "ON col = const" clause, mark col
     as a U-column
 R2. for each "ORDER BY col" clause, mark col as an O-column
 R3. for each "ON col1 = col2" or "WHERE col1 = col2" clause, copy U-
     and O-marks from col1 to col2 and vice versa
 R4. for each key where all columns are O- or U-columns, mark the key
     as an O-key
 R5. for each key where all columns are U-columns, mark the key as a
     U-key
 R6. for each table where at least one key is an O-key, mark the table
     as an O-table
 R7. for each table where at least one key is a U-key, mark the table
     as a U-table
 R8. for each O-table, mark all columns that are part of some key
     O-columns
 R9. for each U-table, mark all columns as U-columns
R10. if all tables are O-tables or U-tables, then the statement is deterministic
[29 Oct 2010 15:45] Sven Sandberg
==== AND-OR graphs ====

An AND-OR graph is a graph where:
 - each node is either an AND-node or an OR-node
 - each node has a truth value (which can be changed): TRUE or FALSE
To "evaluate" an AND-OR graph means:
 - Take some node N.  If N is an OR-node where some successor is TRUE,
   or N is an AND-node where all successors are TRUE, then mark N as
   TRUE.  Repeat this as long as possible.

==== AND-OR graph for checking if table order is deterministic ====

From the markers M1-M3 and the rules R1-R10 defined abovec, we can derive an AND-OR graph.

Nodes:
1.  OR-nodes: C_ij^O  "if column j of table i is O-column or U-column"
2.  OR-nodes: C_ij^U  "if column j of table i is U-column"
3. AND-nodes: K_ik^O  "if key k of table i is O-key or U-key"
4. AND-nodes: K_ik^U  "if key k of table i is U-key"
5.  OR-nodes: T_i^O   "if table i is O-table or U-table"
6.  OR-nodes: T_i^U   "if table i is U-table"
7.  AND-node: ROOT    "statement is deterministic"

Edges:
a. C_ij^O --> K_ik^O, for all columns C_ij that are part of key K_ij,
                      except columns of CHAR type (etc) where the key is
                      on a prefix longer than MAX_SORT_LENGTH.
b. C_ij^U --> K_ik^O, for all columns C_ij that are part of key K_ij and
                      the column is of CHAR type (etc) and the key is
                      on a prefix longer than MAX_SORT_LENGTH
                      "if all columns in key are O with sufficient prefix or U,
                      then key is O or U"

c. C_ij^U --> K_ik^U, for all columns C_ij part of key K_ij
                      "if all columns in key are U, then key is U"

d. K_ik^O --> T_i^O,  for all tables and NNUKs
                      "if some key in the table is O, then the table is O"

e. K_ik^U --> T_i^U,  for all tables and keys
                      "if some key in the table is U, then the table is U"

f.  T_i^U --> C_ij^U, for all tables and all columns that are part of some key
                      "if table is U, then every column part of some key is U"

g.  T_i^U --> C_ij^O, for all tables and all columns that are part of some key
                      "if table is U, then every column in some key is U or O"

h.  T_i^O --> ROOT,   for all tables
                      "if all tables are O-tables or U-tables, then the ROOT is
                      deterministic"

For each "ON col1 = col2" or "WHERE col1 = col2" clause, add edges:
i. col1^O --> col2^O
j. col1^U --> col2^U
k. col2^O --> col1^O
l. col2^U --> col1^U

For each "WHERE col = const" or "ON col = const" clause:
 - mark col^U-node as TRUE
 - mark col^O-node as TRUE

For each "ORDER BY col" clause:
 - mark col a as TRUE

After we have evaluated this AND-OR graph, we can determine if the
statement is deterministic or not.  The statement is deterministic if
ROOT is TRUE.

==== Algorithm to evaluate AND-OR graph ====

INPUT:
    set of AND-nodes
    set of OR-nodes
    for each node, the successors are known
    for each AND-node, the number of predecessors is known
OUTPUT:
    SUCCESS if ROOT is TRUE, FAILURE if ROOT is FALSE

# Each node N has an integer attribute N.todo.  This counts the number
# of predecessors that must be set to TRUE before N is set to true.
for each AND-node N:
    N.todo := number of predecessors of N
for each OR-node N:
    N.todo := 1
S := empty set
for each node initially marked as TRUE:
    N.todo := 1
    add N to set S
while S != empty set:
    remove some N from S
    N.todo := N.todo - 1
    if N.todo == 0:
        if N == ROOT
            return SUCCESS
        for each successor M of N:
            if M.todo > 0:
                add M to S
return FAILURE

==== Complexity of algorithm ====

The first two loops are iterated at most once for each node.

For each iteration of the while loop, some N.todo is decreased for
some node N where N.todo>0.  Initially, sum of 'todo' over all nodes
is less than the number of edges in the graph.  Hence, the number of
iterations of the while loop is bounded by the number of edges in the
graph.

Each node is processed at most once by the for loop inside the while
loop.  So the total number of iterations taken in this for loop is
bounded by the number of edges of the graph.

Hence, the algorithm is O(number of edges + number of nodes).
[2 Nov 2010 4:04] Roel Van de Paar
See bug #50440 | bug #53079 | bug #48608 | bug #45677
[3 Nov 2010 11:15] Sven Sandberg
The above algorithm adds a node to S each time the 'todo' value needs to be decreased. This assumes S is a multi-set, i.e., nodes can appear more than once in S, and requires O(number of edges) space. We can instead add nodes to S only when the 'todo' value drops to 0. Then S does not need to be a multi-set and it requires only O(number of nodes) space. The algorithm for this is as follows:

for each AND-node N:
    N.todo := number of predecessors of N
for each OR-node N:
    N.todo := 1
S := empty set
for each node initially marked as TRUE:
    N.todo := 0
    add N to set S
while S != empty set:
    remove some N from S
    for each successor M of N:
        if M.todo > 0:
            M.todo := M.todo - 1
            if M.todo == 0:
                if M == ROOT
                    return SUCCESS
                add M to S
return FAILURE
[11 Oct 2012 21:42] Sheeri Cabral
It says a patch has been committed, but the "status" is "patch pending". Was this actually put in a version of MySQL? If so, which version?

I am interested in this fix, and put in my "vote" to get this fixed.
[16 Oct 2012 0:08] James Day
Sheeri, I don't see any sign of the later patches in the current source code for 5.6. The right person to answer this won't be around until next week, so there will probably be a delay in saying more. The suppression of large numbers of duplicate warnings via bug 51638 should at least help.
[10 Jan 2013 14:22] Erlend Dahl
Bug#67804 was marked as a duplicate.
[10 Jan 2013 14:23] Erlend Dahl
There is no patch pending for this.
[25 Feb 2013 3:12] Roel Van de Paar
Erlend, what happened with the fix in progress?