Bug #5452 fulltext boolean is disagree with boolean logic
Submitted: 7 Sep 2004 17:01 Modified: 7 Sep 2004 18:06
Reporter: ladimir Kolpakov Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: FULLTEXT search Severity:S4 (Feature request)
Version:4.0.20 OS:Linux (linux)
Assigned to: CPU Architecture:Any
Triage: Triaged: D5 (Feature request)

[7 Sep 2004 17:01] ladimir Kolpakov
Description:
Boolean fulltext result is logically incorrect when
only negative filters supplied,
and none of positive filters applied.
(see users comments on 
http://dev.mysql.com/doc/mysql/en/Fulltext_Boolean.html)

To correct that,

  MATCH (txt) AGAINST ('+want -wont' IN BOOLEAN MODE)

should be internally processed as

   (MATCH (txt) AGAINST ('+want' IN BOOLEAN MODE))
AND NOT  
  (MATCH (txt) AGAINST ('+wont' IN BOOLEAN MODE))

Otherwise,
there's no way for user to say "I know exactly what I won't" 
in search phrase (which is very common behavior in narrowing search).

Application developer have to work around such logical error
on "dynamic SQL preparation" step,
however it means creation of yet another boolean fulltext query parser,
with all negative consequences, like maintenance problem and so on...

How to repeat:
#!/bin/bash
#   $Id$
#   Bug test for
#   "Full-text boolean fails on "-word""
#
# See also comment of "Christopher Tarnas on April 16 2004 1:18pm"
# on http://dev.mysql.com/doc/mysql/en/Fulltext_Boolean.html
#
#--w 08/2004############################################################
 
mysql="mysql -hlocalhost -utest -ptest -tvvv test"
 
########################################################################
function Usage {
  cat <<EoD
#-----------------------------------------------------------
  Usage:
    $0 [options]
  Options:
    --help       - print this info
    --create     - re-create test table
#-----------------------------------------------------------
EoD
}
function GetOptions {
  while (test -n "$*") ; do
    case "$1" in
    --create)
        #--- "update versus show" mode
        opt_create=1
        ;;
    --help)
        #--- print usage
        opt_help=1
        ;;
    *)
        #--- print usage
        opt_help=1
        ERROR="$ERROR unknown option '$1',"
        ;;
    esac
 
    shift
  done
}
########################################################################
function CREATE_TABLE() {
    cat <<EoD
DROP   TABLE IF EXISTS t1;
 
CREATE TABLE t1 (
    txt VARCHAR(255)
   ,FULLTEXT INDEX ftidx (txt)
) TYPE = MyISAM;
 
INSERT INTO t1 (txt) VALUES
    ('looking for unix and windows both')
   ,('looking for unix only')
   ,('looking for windows only')
   ,('none of os-es here')
;
EoD
}
########################################################################
function RUN_TESTS() {
    cat <<EoD
SELECT *
  FROM t1
 WHERE (MATCH (txt) AGAINST ('-windows' IN BOOLEAN MODE))
/* wrong */ ;
SELECT *
  FROM t1
 WHERE NOT (MATCH (txt) AGAINST ('windows' IN BOOLEAN MODE))
/* correct */ ;
SELECT *
  FROM t1
 WHERE (MATCH (txt) AGAINST ('windows' IN BOOLEAN MODE))
/* correct */ ;
SELECT *
  FROM t1
 WHERE (MATCH (txt) AGAINST ('unix' IN BOOLEAN MODE))
/* correct */ ;
SELECT *
  FROM t1
 WHERE (MATCH (txt) AGAINST ('+unix -windows' IN BOOLEAN MODE))
/* correct */ ;
SELECT *
  FROM t1
 WHERE     (MATCH (txt) AGAINST ('unix' IN BOOLEAN MODE))
   AND NOT (MATCH (txt) AGAINST ('windows' IN BOOLEAN MODE))
/* correct */ ;
EoD
}
########################################################################
GetOptions $*
if [ -n "$opt_help" ] ; then
  Usage
  if [ -n "$ERROR" ] ; then
    echo ERROR: $ERROR bye.
    exit 1
  fi
  exit 0
fi
#-----------------------------------------------------------
 
[ -n "$opt_create" ] && CREATE_TABLE | $mysql
 
RUN_TESTS | $mysql
 
########################################################################
#   $Log$
 

Suggested fix:
MATCH() filters should be separated internally on 2 groups:
positive and negative ones, and then joined by AND.
[7 Sep 2004 18:06] Sergei Golubchik
I'm changing this into feature request.

Current behaviour is not a bug but a design issue - fulltext search works by buulding list of words that match and removing those that are excluded by NOT.

I do plan to make bollean search with only negative filters to work, but it's definitely doesn't qualify as a bugfix :)