Bug #74421 Missing "SIMILAR TO" support
Submitted: 16 Oct 2014 19:33 Modified: 17 Oct 2014 12:19
Reporter: Yordan Gigov
Status: Open
Category:MySQL Server Severity:S4 (Feature request)
Version: OS:Any
Assigned to: CPU Architecture:Any
Tags: feature, predicate, standard compliance

[16 Oct 2014 19:33] Yordan Gigov
Description:
I was writing an advanced generic query builder in PHP when I learned that the standard SIMILAR TO predicate is still unsupported in MySQL, even though MySQL already has POSIX regular expressions through REGEXP. I have written a workaround that converts a SIMILAR TO pattern into a REGEXP pattern. It is untested, because I still have more to add to other parts of the builder.

How to repeat:
Feature request. Not a bug. Why is this field required?

Suggested fix:
I use the following PHP code as a converter/wrapper. You can do the same internally and just pass the result on to the REGEXP handler. The variable names should be self-explanatory: $value is the SIMILAR TO pattern and $esc_char is the optional ESCAPE character.

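// normalise the escape character to a backslash first, so that the
// backslashes added further down are not doubled again
$target = $value;
if($esc_char !== NULL && $esc_char !== '\\'){
    // keep literal backslashes literal, then map the escape character
    $target = str_replace('\\', '\\\\', $target);
    $target = str_replace($esc_char, '\\', $target);
}
elseif ($esc_char === NULL){
    $target = str_replace('\\', '\\\\', $target);
}
// else escape character is backslash

// split the pattern into single characters
$strarr = str_split($target);
$lastchar = count($strarr)-1;
if(! $lastchar) {

}
if($strarr[$lastchar] === '$'){
    // Escape last character
    // Far faster than using function calls
    $strarr[$lastchar] = '\\';
    $strarr[] = '$';
}
if($strarr[0] === '^'){
    // Escape first character
    // array_splice() modifies the array in place; its return value
    // (the removed elements) must not be assigned back
    array_splice($strarr, 0, 0, '\\');
}
$target = implode('',$strarr);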
// escape dots
$target = str_replace ( '.' , "\\." , $target);
// replace _ with .
$target = str_replace ( '_' , '.' , $target);
// replace % with .*
// this one is more complex, because we have to know if it's escaped
$pos = 0;
// compare with false: strpos() returns 0 for a match at the very start
while(($pos = strpos($target,'%',$pos)) !== false){
    $escape_count = 0;
    // count the backslashes directly in front of the %
    while($pos-$escape_count-1 >= 0 && $target[$pos-$escape_count-1] === '\\'){
        $escape_count++;
    }
    if(!($escape_count & 1)){
        //if it's an even number, then our target must be replaced
        $target = substr_replace($target,'.*',$pos,1);
        //no need to update $pos. It won't find the character here again
    }
    else{
        //we have to remove the escape character before it
        $target = substr_replace($target,'%',$pos-1,2);
    }
}

// Finalize, because SIMILAR TO must match the whole string
$target = '^' . $target . '$';
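
The same conversion can also be sketched as a single pass over the pattern, which avoids tracking whether a % was escaped after the fact and treats an escaped _ the same way as an escaped %. This is only a minimal sketch under stated assumptions, not a tested implementation: the function names similar_to_regexp() and quote_if_special() are hypothetical, only % and _ are translated, other SIMILAR TO operators such as |, *, + and parentheses are passed through unchanged, bracket expressions like [^abc] are not handled, and the escape character is assumed to be a single byte.

```php
<?php
// Hypothetical helper: prefix $ch with a backslash if it occurs in $specials.
function quote_if_special(string $ch, string $specials): string
{
    return strpos($specials, $ch) !== false ? '\\' . $ch : $ch;
}

// Hypothetical converter: SIMILAR TO pattern -> anchored REGEXP pattern.
// Assumption: $esc_char is NULL (no escape character) or a single byte.
function similar_to_regexp(string $value, ?string $esc_char = null): string
{
    $out = '';
    $len = strlen($value);
    for ($i = 0; $i < $len; $i++) {
        $ch = $value[$i];
        if ($esc_char !== null && $ch === $esc_char && $i + 1 < $len) {
            // escaped character: emit the next character as a literal
            $out .= quote_if_special($value[++$i], '.^$*+?()[]{}|\\');
        } elseif ($ch === '%') {
            // % matches any sequence of characters
            $out .= '.*';
        } elseif ($ch === '_') {
            // _ matches exactly one character
            $out .= '.';
        } else {
            // characters that are special in REGEXP but not in SIMILAR TO
            $out .= quote_if_special($ch, '.^$\\');
        }
    }
    // SIMILAR TO must match the whole string
    return '^' . $out . '$';
}
```

The returned string is meant to be used as the right-hand side of REGEXP, ideally as a bound parameter rather than concatenated into the query. For example, similar_to_regexp('a%b_c') gives ^a.*b.c$ and similar_to_regexp('50#_', '#') gives ^50_$.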
[17 Oct 2014 12:19] Yordan Gigov
That block "if(! $lastchar)" is actually supposed to be more like

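if($value === '') {
    // empty string, do no replacing
    // (testing $lastchar < 0 is not enough: before PHP 8.2 str_split('')
    // returns [''], so $lastchar is 0 rather than -1)
    return $value;
}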