Bug #96538 Cannot use ROW_NUMBER() in recursive block of CTE
Submitted: 14 Aug 2019 18:32 Modified: 4 Sep 2019 9:30
Reporter: Anthony Marston Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: DML Severity:S4 (Feature request)
Version:8.0.17 OS:Windows (Windows 7)
Assigned to: CPU Architecture:Any
Tags: cte, ROW_NUMBER()

[14 Aug 2019 18:32] Anthony Marston
Description:
I am try to perform a parts explosion using a recursive CTE. I have a query which uses ROW_NUMBER() in both the anchor block and the recursive block, and this works perfectly with Oracle, SQL Server and PostgreSQL, but fails in MySQL with the message "Recursive CTE can container neither aggregate nor window functions in the recursive block". According to your manual this is disallowed in the SQL standard. If all these other database vendors allow it then I would suggest that your interpretation of the SQL standard needs to be re-examined.

How to repeat:
-- here is the table structure
CREATE TABLE IF NOT EXISTS `bill_of_materials` (
  `product_id_snr` varchar(40) NOT NULL,
  `product_id_jnr` varchar(40) NOT NULL,
  `quantity` decimal(18,9) unsigned NOT NULL DEFAULT '1.000000000',
  PRIMARY KEY (`product_id_snr`,`product_id_jnr`)
) ENGINE=InnoDB;

-- here is some data
INSERT INTO `bill_of_materials` (`product_id_snr`, `product_id_jnr`, `quantity`) VALUES ('SNAFU', 'B-2A_ASSY', 1);
INSERT INTO `bill_of_materials` (`product_id_snr`, `product_id_jnr`, `quantity`) VALUES ('SNAFU', 'TMJ_TEST', 1);
INSERT INTO `bill_of_materials` (`product_id_snr`, `product_id_jnr`, `quantity`) VALUES ('B-2A_ASSY', 'TMJ', 1);
INSERT INTO `bill_of_materials` (`product_id_snr`, `product_id_jnr`, `quantity`) VALUES ('B-2A_ASSY', 'TMJ_ASSY_B', 1);
INSERT INTO `bill_of_materials` (`product_id_snr`, `product_id_jnr`, `quantity`) VALUES ('TMJ', 'TMJ_CP', 1);
INSERT INTO `bill_of_materials` (`product_id_snr`, `product_id_jnr`, `quantity`) VALUES ('TMJ', 'TMJ_WAFER', 1);

-- here is my query which works in Oracle, SQL Server and PostgreSQL
USE `gmx_product`;
WITH RECURSIVE bom (sort_seq, level, product_id_snr, product_id_jnr, quantity)
AS ( SELECT CAST(CONCAT('/', LPAD(ROW_NUMBER() OVER (ORDER BY pc.product_id_snr ASC, pc.product_id_jnr ASC), 4, '0')) AS char(4000)) AS sort_seq
, 1 AS level
, pc.product_id_snr, pc.product_id_jnr, pc.quantity
FROM bill_of_materials AS pc
WHERE product_id_snr='SNAFU'
  UNION ALL
  SELECT CONCAT(bom.sort_seq, '/', LPAD(ROW_NUMBER() OVER (ORDER BY pc.product_id_snr ASC, pc.product_id_jnr ASC), 4, '0')) AS sort_seq
, level+1, pc.product_id_snr, pc.product_id_jnr, pc.quantity
FROM bill_of_materials AS pc
INNER JOIN bom ON (pc.product_id_snr  = bom.product_id_jnr)
)
SELECT bom.* FROM bom
ORDER BY sort_seq ASC 

Suggested fix:
Please implement the SQL standard to be compatible with the other database vendors.
[2 Sep 2019 12:36] MySQL Verification Team
Hi Mr. Marston,

Thank you for your bug report.

However, this is not a bug. That is a request for a new feature that is not planned for 8.0, AFAIK. If you wish me to verify it as a feature request, I can do it, but it will not be implemented soon.
[2 Sep 2019 13:03] Anthony Marston
Please consider this as a feature request.
[2 Sep 2019 14:43] Guilhem Bichot
Hello. Some complete information:
- having a window function in the recursive part of a recursive CTE raises an error in MySQL
- according to the official SQL standard (2016) it is necessary to raise an error. I cannot quote the standard (as it's copyright-ed material) but it's clear: after the SELECT word there shouldn't be any window function.
- so MySQL is compliant, and so this is not a bug.
- PostgreSQL accepts this query; and I can believe that some other databases systems do, too; it means they have an extension on top of the standard.
- But note, they have their own limitations too; for example:
https://docs.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-s...
(no subquery, no GROUP BY, no LEFT JOIN, etc; MySQL has some of these limitations too but not all)
- and there are differences in how they handle window functions in recursive CTEs:
https://blog.adamfurmanek.pl/2019/07/13/windowing-functions-in-recursive-cte/
(same query gives different results).
- I think there are reasons why the standard bans window functions in recursive CTEs: by doing this, it allows to have different algorithms:
  * first algorithm can be:
    for each row made at iteration N:
      take such row, from it, with a JOIN, do derive rows for iteration N+1, 
  * a second algorithm can be:
    take all rows made at iteration N together, from them, with a single JOIN, do derive all rows for iteration N+1.
If a DBMS uses the first or second algorithm, the window of rows will be different (many small windows in 1st, one big window in 2nd), and so the value of window functions will be different too.

Good news: it is possible to turn the query into a standard-compliant one. IIUC what you want is to build a "path string" like: '1/3/2/4' where each number means a level in the sub-tree (this is the 4th child of the 2nd child of the 3rd child of the 1st child of the root (root==SNAFU)). And numbers must be assigned in alphabetic order of siblings. The solution is to generate numbers before doing the recursive CTE; so I make a first CTE (a non-recursive one) with a PARTITION BY clause, which, for each item, gets all its siblings, and gives the item its ordering number among siblings. And I use this number in the recursive CTE:

WITH RECURSIVE 
bill_of_m_with_nr AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY pc.product_id_snr ORDER BY pc.product_id_snr ASC, pc.product_id_jnr ASC) AS child_nr FROM bill_of_materials pc),
bom (sort_seq, level, product_id_snr, product_id_jnr, quantity)
AS ( SELECT CAST(CONCAT('/', LPAD(pc.child_nr, 4, '0')) AS char(4000)) AS sort_seq
, 1 AS level
, pc.product_id_snr, pc.product_id_jnr, pc.quantity
FROM bill_of_m_with_nr AS pc
WHERE product_id_snr='SNAFU'
  UNION ALL
  SELECT CONCAT(bom.sort_seq, '/', LPAD(pc.child_nr, 4, '0')) AS sort_seq
, level+1, pc.product_id_snr, pc.product_id_jnr, pc.quantity
FROM bill_of_m_with_nr AS pc
INNER JOIN bom ON (pc.product_id_snr  = bom.product_id_jnr)
)
SELECT bom.* FROM bom
ORDER BY sort_seq ASC;

MySQL yields:
+-----------------+-------+----------------+----------------+-------------+
| sort_seq        | level | product_id_snr | product_id_jnr | quantity    |
+-----------------+-------+----------------+----------------+-------------+
| /0001           |     1 | SNAFU          | B-2A_ASSY      | 1.000000000 |
| /0001/0001      |     2 | B-2A_ASSY      | TMJ            | 1.000000000 |
| /0001/0001/0001 |     3 | TMJ            | TMJ_CP         | 1.000000000 |
| /0001/0001/0002 |     3 | TMJ            | TMJ_WAFER      | 1.000000000 |
| /0001/0002      |     2 | B-2A_ASSY      | TMJ_ASSY_B     | 1.000000000 |
| /0002           |     1 | SNAFU          | TMJ_TEST       | 1.000000000 |
+-----------------+-------+----------------+----------------+-------------+
This is the exact same result as what PG gives for your original query :-)

Hope this helps.
[3 Sep 2019 13:17] Anthony Marston
When you say you cannot quote from the SQL standard because it is copyrighted you are wrong. You CAN quote small sections but you are not allowed to publish the entire document.

While it is true that other DBMS vendors sometimes add their own extensions I do NOT believe that the authors of Postgres, Oracle and SQL Server all created exactly the same extension at exactly the same time. I believe that they implemented a correct interpretation of the standard while you did not. If you look at https://docs.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-s... you will see a list a list of items that are not allowed in the recursive part of a CTE, and windows expressions such as ROW_NUMBER() do not appear in that list. I challenge you to quote from any part of the SQL standard which says otherwise.

I have tried your alternative query, and although it works I find it terribly clunky and not intuitive at all. It is one of those things which should be published in your documentation so that others can achieve the results that they expect.
[3 Sep 2019 13:29] MySQL Verification Team
Hi,

I am closing this report as "Not a bug".

As my colleague Guilhem has stated, a type of query that you are asking us to implement goes against current SQL standard. 

Also, since we are not allowed to quote from the SQL standard, for copyright reason, we can not meet your challenge.

Not a bug.