Bug #110438 Problems with using the right outer join and bitcount functions simultaneously
Submitted: 20 Mar 2023 16:01 Modified: 27 Mar 2023 16:12
Reporter: Chenglin Tian Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:8.0.29 OS:Ubuntu
Assigned to: CPU Architecture:x86

[20 Mar 2023 16:01] Chenglin Tian
Description:
When I use the right outer join and bit count functions, as well as the not operator, to perform a combined query, the following problems occur:

SELECT ALL t2.c0 AS ref0 FROM t1,  t2 RIGHT  OUTER JOIN t0 ON  EXISTS (SELECT 1 wHERE FALSE) WHERE (NOT ((t2.c0 IS FALSE) NOT IN (BIT_COUNT(t1.c0)))) ;

We hope to get four pieces of data, but the actual return result is an empty set

How to repeat:
Table building statement:
CREATE TABLE IF NOT EXISTS t0(c0 LONGTEXT   STORAGE DISK COMMENT 'asdf'  COLUMN_FORMAT FIXED) ;
CREATE TABLE IF NOT EXISTS t1 LIKE t0;
CREATE TABLE IF NOT EXISTS t2(c0 DECIMAL ZEROFILL  COMMENT 'asdf'  COLUMN_FORMAT FIXED PRIMARY KEY UNIQUE STORAGE DISK) ;
select TABLE_NAME, ENGINE from information_schema.TABLES where table_schema = 'database10';
REPLACE INTO t0(c0) VALUES('j_{(~PAZ)');
TRUNCATE TABLE t0;
INSERT DELAYED INTO t0(c0) VALUES(NULL);
CHECK TABLE t2 ;
INSERT INTO t1(c0) VALUES('+纷O');
ALTER TABLE t0 STATS_PERSISTENT 0, COMPRESSION 'LZ4';
select TABLE_NAME, ENGINE from information_schema.TABLES where table_schema = 'database10';
REPLACE INTO t0(c0) VALUES(NULL);
select TABLE_NAME, ENGINE from information_schema.TABLES where table_schema = 'database10';
INSERT IGNORE INTO t0(c0) VALUES('');
select TABLE_NAME, ENGINE from information_schema.TABLES where table_schema = 'database10';
ALTER TABLE t1 COMPRESSION 'LZ4', INSERT_METHOD LAST, STATS_PERSISTENT DEFAULT, CHECKSUM 1;
REPLACE INTO t0(c0) VALUES(1373444939);
DELETE LOW_PRIORITY IGNORE FROM t0 WHERE t0.c0;
ALTER TABLE t1 ROW_FORMAT REDUNDANT, ALGORITHM DEFAULT, FORCE, INSERT_METHOD FIRST, ENABLE KEYS, COMPRESSION 'ZLIB', DELAY_KEY_WRITE 0;
INSERT INTO t1(c0) VALUES(1257022825), (365137223), (NULL);
CREATE INDEX i0 ON t2(c0 DESC);
select TABLE_NAME, ENGINE from information_schema.TABLES where table_schema = 'database10';
INSERT_METHOD NO, RENAME AS t0, PACK_KEYS DEFAULT, DISABLE KEYS;
ALTER TABLE t0 ;
REPLACE LOW_PRIORITY INTO t1(c0) VALUES(-1836894879);
REPLACE DELAYED INTO t0(c0) VALUES(NULL);
REPLACE INTO t1(c0) VALUES(NULL);
REPLACE DELAYED INTO t2(c0) VALUES(960364164);
REPLACE LOW_PRIORITY INTO t1(c0) VALUES("");
INSERT IGNORE INTO t0(c0) VALUES(1318946640);
SET SESSION rbr_exec_mode = IDEMPOTENT;
DELETE QUICK IGNORE FROM t0;
REPLACE LOW_PRIORITY INTO t1(c0) VALUES(NULL);
SET SESSION myisam_stats_method = nulls_ignored;
INSERT IGNORE INTO t0(c0) VALUES(NULL);
INSERT LOW_PRIORITY IGNORE INTO t0(c0) VALUES(2066070913);

Trigger statement:
SELECT ALL t2.c0 AS ref0 FROM t1,  t2 RIGHT  OUTER JOIN t0 ON  EXISTS (SELECT 1 wHERE FALSE) WHERE (NOT ((t2.c0 IS FALSE) NOT IN (BIT_COUNT(t1.c0)))) ;
expected:[null,null,null,null]
actual:empty set

Suggested fix:
Perhaps this error is related to the right join operation. The optimization of the expression caused an exception in the result
[21 Mar 2023 13:48] MySQL Verification Team
Hi Mr. Tian,

Thank you for your bug report.

However, what you are reporting is based on MyISAM storage engine, which is deprecated.

Let us know if you can repeat this with InnoDB storage engine.

Unsupported.
[21 Mar 2023 16:37] Chenglin Tian
Hello, we have reorganized the problematic test cases

First of all, the database version we are using is 8.0.29, and the storage engine for this version is Innodb. We have confirmed that. The following are our simplified test cases. Some statements submitted previously have misled you, and we apologize for this.

CREATE TABLE IF NOT EXISTS t0(c0 LONGTEXT   STORAGE DISK COMMENT 'asdf' COLUMN_FORMAT FIXED) ;
CREATE TABLE IF NOT EXISTS t1 LIKE t0;
CREATE TABLE IF NOT EXISTS t2(c0 DECIMAL ZEROFILL  COMMENT 'asdf' COLUMN_FORMAT FIXED PRIMARY KEY UNIQUE STORAGE DISK) ;
REPLACE INTO t0(c0) VALUES('j_{(~PAZ)');
INSERT DELAYED INTO t0(c0) VALUES(NULL);
INSERT INTO t1(c0) VALUES('+纷O');
REPLACE INTO t0(c0) VALUES(NULL);
INSERT IGNORE INTO t0(c0) VALUES('');
REPLACE INTO t0(c0) VALUES(1373444939);
DELETE LOW_PRIORITY IGNORE FROM t0 WHERE t0.c0;
INSERT INTO t1(c0) VALUES(1257022825), (365137223), (NULL);
REPLACE LOW_PRIORITY INTO t1(c0) VALUES(-1836894879);
REPLACE DELAYED INTO t0(c0) VALUES(NULL);
REPLACE INTO t1(c0) VALUES(NULL);
REPLACE DELAYED INTO t2(c0) VALUES(960364164);
REPLACE LOW_PRIORITY INTO t1(c0) VALUES("");
INSERT IGNORE INTO t0(c0) VALUES(1318946640);
DELETE QUICK IGNORE FROM t0;
REPLACE LOW_PRIORITY INTO t1(c0) VALUES(NULL);
INSERT IGNORE INTO t0(c0) VALUES(NULL);
INSERT LOW_PRIORITY IGNORE INTO t0(c0) VALUES(2066070913);

--SELECT ALL t2.c0 AS ref0 FROM t1,  t2 RIGHT  OUTER JOIN t0 ON  EXISTS(SELECT 1 wHERE FALSE) WHERE (NOT ((t2.c0 IS FALSE) NOT IN (BIT_COUNT(t1.c0)))) ;
--[]

For this test case, the expected result is four pieces of data, and the content of the four pieces of data is null, but the actual result is an empty set.
Because the values of t2.c0 are all null, we replaced t2.c0 in the above query with null, and obtained the correct results.

--SELECT t1.c0,t2.c0 AS ref0 FROM t1,  t2 RIGHT  OUTER JOIN t0 ON  EXISTS (SELECT 1 wHERE FALSE) WHERE (NOT ((null IS FALSE) NOT IN (BIT_COUNT(t1.c0)))) ;
--[null,null,null,null]
[22 Mar 2023 13:32] MySQL Verification Team
Hi,

Actually, the correct result is an empty set.

That is because your query is not properly designed, nested queries are improperly designed, OUTER JOIN on nothing makes no sense, and so on and so forth. Next, many expressions make no sense, like the comparison of DECIMAL with BOOLEAN and so on and so forth .....

Unsupported.
[26 Mar 2023 9:33] Chenglin Tian
Hello, I have some doubts that I need you to answer,

First, regarding the issue of incorrect results. You believe that an empty set is the correct result, but we use the tlp method of sqlercer to verify it. Please take a look at the following two queries:

--SELECT ALL t2.c0 AS ref0 FROM t1, t2 RIGHT OUTER JOIN t0 ON EXISTS (SELECT 1 wHERE FALSE);

This query returns all 16 pieces of data. We add a condition to it and calculate whether it is true, false, or empty. Combining the three results should be the same as the results of the above query:

--SELECT ALL t2.c0 AS ref0 FROM t1, t2 RIGHT OUTER JOIN t0 ON EXISTS (SELECT 1 wHERE FALSE) WHERE ((t2.c0 IS FALSE) NOT IN (BIT_COUNT(t1.c0))) UNION ALL SELECT ALL t2.c0 AS ref0 FROM t1, t2 RIGHT OUTER JOIN t0 ON EXISTS(SELECT 1 wHERE FALSE) WHERE (NOT ((t2.c0 IS FALSE) NOT IN (BIT_COUNT(t1.c0)))) UNION ALL SELECT ALL t2.c0 AS ref0 FROM t1, t2 RIGHT OUTER JOIN t0 ON EXISTS(SELECT 1 wHERE FALSE) WHERE ((t2.c0 IS FALSE) NOT IN (BIT_COUNT(t1.c0))) is null;

The two results should be the same, but the following query only returned 12 results. Subsequently, we will disassemble the query and analyze it separately:

--SELECT ALL t2.c0 AS ref0 FROM t1, t2 RIGHT OUTER JOIN t0 ON EXISTS (SELECT 1 wHERE FALSE) WHERE ((t2.c0 IS FALSE) NOT IN (BIT_COUNT(t1.c0)));

The first query returned six pieces of data, and the results were analyzed to be correct.

--SELECT ALL t2.c0 AS ref0 FROM t1, t2 RIGHT OUTER JOIN t0 ON EXISTS(SELECT 1 wHERE FALSE) WHERE ((t2.c0 IS FALSE) NOT IN (BIT_COUNT(t1.c0))) is null;

The third query returned six pieces of data, and the result was also correct after analysis.

--SELECT ALL t2.c0 AS ref0 FROM t1, t2 RIGHT OUTER JOIN t0 ON EXISTS(SELECT 1 wHERE FALSE) WHERE (NOT ((t2.c0 IS FALSE) NOT IN (BIT_COUNT(t1.c0)))) ;

However, the second query returned an empty set. Based on the inference that the number should match before and after, we expect to obtain four pieces of data. If you said that this query expected to return an empty set, which of the three queries had a problem? If none of the three queries had a problem, why did the number of results from the basic query and subsequent queries differ?

Secondly, many operations on the query you mentioned are meaningless, such as join operations, comparisons between Boolean types and floating point numbers, and so on.
We discovered this problem during the execution of the business, and then simplified the query statements. Although some operations do seem meaningless, we did use these functions to query the results. Because this test case is syntactically correct and does not report errors, we expect its results to also meet expectations.
[27 Mar 2023 12:35] MySQL Verification Team
Hi Mr. Tian,

We see no reason why the results from those two queries should be the same ......
[27 Mar 2023 16:12] Chenglin Tian
Hi,The idea that the results obtained by these two queries should be the same stems from this paper: 
https://www.manuelrigger.at/preprints/TLP.pdf

The core idea of the method is that a predicate on a row or intermediate result must either evaluate to TRUE, FALSE, or NULL. Thus, an original query can be decomposed into three partitioning queries. One partitioning query fetches rows where a predicate p holds, one where it does not hold, and one for which it evaluates to NULL. That is, we construct one predicate p, one predicate NOT p, and one predicate p IS NULL.
The results of these two queries are consistent. The second query we constructed builds a condition based on the first query, and assigns three results, true, false, and null, to this condition. Combine the three results, and the final result should be the same as the original query result, because this condition will not produce results other than true, false, and null.
[28 Mar 2023 12:22] MySQL Verification Team
Hi,

Any expression involving NULL will return NULL as a result. Any query where filtering condition is : ... WHERE NULL, will return 0 (zero) rows.

Second query has IS NULL in the wrong place, which is why it does not return any rows.
[28 Mar 2023 13:27] Alex Wong
I think these two test cases should be semantically equal because of the paper above.

--SELECT ALL t2.c0 AS ref0 FROM t1, t2 RIGHT OUTER JOIN t0 ON EXISTS (SELECT 1 wHERE FALSE);

--SELECT ALL t2.c0 AS ref0 FROM t1, t2 RIGHT OUTER JOIN t0 ON EXISTS (SELECT 1 wHERE FALSE) WHERE ((t2.c0 IS FALSE) NOT IN (BIT_COUNT(t1.c0))) UNION ALL SELECT ALL t2.c0 AS ref0 FROM t1, t2 RIGHT OUTER JOIN t0 ON EXISTS(SELECT 1 wHERE FALSE) WHERE (NOT ((t2.c0 IS FALSE) NOT IN (BIT_COUNT(t1.c0)))) UNION ALL SELECT ALL t2.c0 AS ref0 FROM t1, t2 RIGHT OUTER JOIN t0 ON EXISTS(SELECT 1 wHERE FALSE) WHERE ((t2.c0 IS FALSE) NOT IN (BIT_COUNT(t1.c0))) is null;

these two test cases should return the same result, but actually they return different results.
[28 Mar 2023 13:40] MySQL Verification Team
Thank you , Mr. Tian,

However, we do not follow random papers, but strict SQL standards ....