Bug #88006 Sudoku solver with recursive CTE never finishes
Submitted: 6 Oct 2017 4:44 Modified: 9 Oct 2017 12:13
Reporter: Vadim Tkachenko Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.3 OS:Any
Assigned to: CPU Architecture:Any

[6 Oct 2017 4:44] Vadim Tkachenko
Description:
I am using a Sudoku solver with recursive CTE adopted from SQLite version there:
https://sqlite.org/lang_with.html

The original query (for SQLite):

WITH RECURSIVE
  input(sud) AS (
    VALUES('53..7....6..195....98....6.8...6...34..8.3..17...2...6.6....28....419..5....8..79')
  ),
  digits(z, lp) AS (
    VALUES('1', 1)
    UNION ALL SELECT
    CAST(lp+1 AS TEXT), lp+1 FROM digits WHERE lp<9
  ),
  x(s, ind) AS (
    SELECT sud, instr(sud, '.') FROM input
    UNION ALL
    SELECT
      substr(s, 1, ind-1) || z || substr(s, ind+1),
      instr( substr(s, 1, ind-1) || z || substr(s, ind+1), '.' )
     FROM x, digits AS z
    WHERE ind>0
      AND NOT EXISTS (
            SELECT 1
              FROM digits AS lp
             WHERE z.z = substr(s, ((ind-1)/9)*9 + lp, 1)
                OR z.z = substr(s, ((ind-1)%9) + (lp-1)*9 + 1, 1)
                OR z.z = substr(s, (((ind-1)/3) % 3) * 3
                        + ((ind-1)/27) * 27 + lp
                        + ((lp-1) / 3) * 6, 1)
         )
  )
SELECT s FROM x WHERE ind=0;

And my adaptation for MySQL 8.0.3 is following:

WITH RECURSIVE
  input(sud) AS (
    SELECT '53..7....6..195....98....6.8...6...34..8.3..17...2...6.6....28....419..5....8..79'
  ),
  digits(z, lp) AS (
    SELECT '1', 1
    UNION ALL SELECT
    CAST(lp+1 AS CHAR), lp+1 FROM digits WHERE lp<9
  ),
  x(s, ind) AS (
    SELECT sud, instr(sud, '.') FROM input
    UNION ALL
    SELECT
      concat(substr(s, 1, ind-1) , z , substr(s, ind+1)),
      instr( concat(substr(s, 1, ind-1) ,z ,substr(s, ind+1)), '.' )
     FROM x, digits AS z
    WHERE ind>0
      AND NOT EXISTS (
            SELECT 1
              FROM digits AS lp
             WHERE z.z = substr(s, ((ind-1)/9)*9 + lp, 1)
                OR z.z = substr(s, ((ind-1)%9) + (lp-1)*9 + 1, 1)
                OR z.z = substr(s, (((ind-1)/3) % 3) * 3
                        + ((ind-1)/27) * 27 + lp
                        + ((lp-1) / 3) * 6, 1)
         )
  )
SELECT s FROM x WHERE ind=0;

The problem is that SQLite finishes its query in less then 1 second.

MySQL version is never finishes and I see that mysqld process eats more and more RSS memory.

So my guess there is something wrong with how MySQL executes that query.

How to repeat:
Run query for MySQL

Suggested fix:
Make query to execute within 1 sec
[6 Oct 2017 5:52] MySQL Verification Team
Hello Vadim,

Thank you for the report and feedback!

Regards,
Umesh
[6 Oct 2017 11:32] Peter Zaitsev
Hm,

I think especially unbound memory hog is a problem which can be used to cause DOS.

I thought there is a limit in MySQL 8 which prevents recursion from running forever
[7 Oct 2017 15:56] Guilhem Bichot
Posted by developer:
 
Hello Vadim.
Coincidentally, I had also made this Sudoku test when I developed CTEs :-)
The problem in your query is that it uses "/" when it rather wants "integer division"; compare:
mysql> select 4/3, 4 div 3;
+--------+---------+
| 4/3    | 4 div 3 |
+--------+---------+
| 1.3333 |       1 |
+--------+---------+
So, the WHERE clause in the subquery, which is supposed to mean "WHERE the Sudoku rules are not violated", allows way more layouts than it should, so the next step explores all those layouts, and adds even more wrong layouts, exponentially. For example iteration 7 takes 10 secs on my machine, iteration 8 takes 23 secs. The default limit of 1000 iterations is not reached here, machine crawls down before.

So we have here a query which wants to produce a lot of rows and does so starting with the first iterations, MySQL obeys... Against such subtly wrong queries, cte_max_recursion_depth's default may be too high; but using max_execution_time would have blocked the problem.
After changing the 4 "/" symbols to DIV, the query finishes in ~1 second (on my debug binary, probably much less on a release binary).
[7 Oct 2017 16:01] Guilhem Bichot
Posted by developer:
 
Complement to my previous post. Making an analogy: a recursive CTE is similar to a WHILE loop in a stored procedure, which would insert into a table; if the WHILE loop is written in a way where it produces more rows and has a logical flow in the testing of the loop condition, the table will grow until the machine gets a problem.
Does that make sense?
[7 Oct 2017 18:43] Vadim Tkachenko
Thanks Guilhem

I did not take into account that "/" behaves differently in SQLite and MySQL
[9 Oct 2017 12:13] Guilhem Bichot
Posted by developer:
 
Out of curiosity, I tried "select integer_column/integer_constant from table" on various DBMSs; some do integer division (e.g. MS SQL), others do decimal division (e.g. Oracle). If I understand the SQL standard correctly, the behaviour is implementation-defined anyway.