Bug #105159 rlike expression evaluated when it should not be
Submitted: 7 Oct 2021 0:33 Modified: 8 Oct 2021 6:23
Reporter: Dave Pullin (Basic Quality Contributor) Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:5.7.31, 5.7.35 OS:CentOS (CentOS Linux release 7.6.1810 (Core))
Assigned to: CPU Architecture:Any

[7 Oct 2021 0:33] Dave Pullin
Description:
the right argument of RLIKE is evaluated (and may fail) when guarded by an IF  expression that should prevent evaluation.

select  if( false, 'x' rlike substring("z",2),true);
select  if( false, 'x' rlike "",true);

both fail with "Got error "empty (sub)expression' from regexp"

because (apparently) MySQL is converting the argument into a REGEX before it determines that the 'then' clause of the IF should not be evaluated.

The same is true of a case expression, and ifnull().

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.31    |
+-----------+
1 row in set (0.00 sec)

mysql> select  if( false, 'x' rlike substring("z",2),true);
ERROR 1139 (42000): Got error 'empty (sub)expression' from regexp
mysql> select  if( false, 'x' rlike "",true);
ERROR 1139 (42000): Got error 'empty (sub)expression' from regexp

How to repeat:
select  if( false, 'x' rlike substring("z",2),true);
select  if( false, 'x' rlike "",true);
select  ifnull("notnull", 'x' rlike "");

Suggested fix:
The conversion into a regex (or at least the throwing of an error) should not occur unless and until it is determined that the RLIKE should be evaluated.

My workaround: make sure the right argument of RLIKE evaluates to a valid regex in all circumstances:

select   'x' rlike if( length("z")<2,'.*',substring("z",2));
[7 Oct 2021 4:51] MySQL Verification Team
Hello Dave,

Thank you for the report and test case.
Verified as described.

regards,
Umesh
[8 Oct 2021 6:23] Roy Lyseng
Posted by developer:
 
What happens here is that we have constant expressions that fail when evaluated.
When the IF or IFNULL expression is analyzed, we look into each of the subexpressions
to see if they can be simplified, and constant expressions often can.
Thus, this is a natural way of processing these statements.
Workarounds:
- Make sure that the constant expression is valid.
- Make sure that the "impossible" subexpression is not constant (e.g. it refers a column).