Bug #7531 performance bug in full text search
Submitted: 24 Dec 2004 17:31 Modified: 25 Jan 2005 14:41
Reporter: Sergey Naletov Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:4.1.8 OS:FreeBSD (FreeBSD 5.2.1)
Assigned to: Assigned Account CPU Architecture:Any

[24 Dec 2004 17:31] Sergey Naletov
Description:
full text search works very slowly if I am using it in table joins.
Examples:
mysql> select t.id from ss_t t, ss_g g  where  t.gal_id=g.id and  match(g.keywords) against('+(doctor mysql world girl)' in boolean mode) order by  rand() limit 1;
+------+
| id   |
+------+
| 5300 |
+------+
1 row in set (0.50 sec)

Now the same query, but using regexp:

mysql> select t.id from ss_t t, ss_g g  where  t.gal_id=g.id and  g.keywords regexp 'doctor|mysql|world|girl' order by  rand() limit 1;
+------+
| id   |
+------+
| 6118 |
+------+
1 row in set (0.10 sec)

5 times faster! If I will add more tables to inner join full text search will lasts almost forever and mysqld will eat all CPU time. In this case I see in processlist this query in state 'copying into temp tables'.

And performance examples without table joins:
using regexp:
mysql> select id from ss_g where keywords regexp 'doctor|mysql|world|girl' order by rand() limit 1;
+-------+
| id    |
+-------+
| 43865 |
+-------+
1 row in set (1.65 sec)

and using full text search:
mysql> select id from ss_g where match(keywords) against ('+(doctor mysql world girl)' in boolean mode) order by rand() limit 1;
+--------+
| id     |
+--------+
| 131449 |
+--------+
1 row in set (0.25 sec)

so, here we see reverse picture - full text search faster then regexp.
So, the problem is in table joins.

How to repeat:
-

Suggested fix:
-
[24 Dec 2004 17:39] Sergey Naletov
forgot to add additional info about installed mysqld:

>C compiler:    gcc (GCC) 3.3.3 [FreeBSD] 20031106
>C++ compiler:  g++ (GCC) 3.3.3 [FreeBSD] 20031106
Some paths:  /usr/bin/perl /usr/bin/make /usr/local/bin/gmake /usr/bin/gcc /usr/bin/cc
GCC: Using built-in specs.
Configured with: FreeBSD/i386 system compiler
Thread model: posix
gcc version 3.3.3 [FreeBSD] 20031106
Compilation info: CC='gcc'  CFLAGS=''  CXX='g++'  CXXFLAGS=''  LDFLAGS=''  ASFLAGS=''
LIBC: 
-r--r--r--  1 root  wheel  857572 Oct 28 21:30 /lib/libc.so.5
-r--r--r--  1 root  wheel  1690422 Oct 28 21:30 /usr/lib/libc.a
lrwxr-xr-x  1 root  wheel  14 Oct 28 21:30 /usr/lib/libc.so -> /lib/libc.so.5
Configure command: ./configure '--with-tcp-port=3308' '--with-unix-socket-path=/tmp/mysql-4.1.7.sock' '--prefix=/usr/local/mysql-4.1.7' '--without-innodb' '--without-man' '--without-docs' '--enable-assembler' '--enable-thread-safe-client' '--with-charset=cp1251' '--with-extra-charsets=complex'
[25 Dec 2004 13:02] Hartmut Holzgraefe
can you please add EXPLAIN output for the statements mentioned in this report?
[25 Dec 2004 14:41] Sergey Naletov
Sure. Here is explains for all queries in order as they have been mentioned above:

mysql> explain select t.id from ss_t t, ss_g g  where  t.gal_id=g.id and  match(g.keywords) against('+(doctor mysql world girl)' in boolean mode) order by  rand() limit 1;                                                                
+----+-------------+-------+----------+------------------+----------+---------+----------+------+----------------------------------------------+
| id | select_type | table | type     | possible_keys    | key      | key_len | ref      | rows | Extra                                        |
+----+-------------+-------+----------+------------------+----------+---------+----------+------+----------------------------------------------+
|  1 | SIMPLE      | g     | fulltext | PRIMARY,keywords | keywords |       0 |          |    1 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | t     | ref      | gal_id           | gal_id   |       5 | sst.g.id |   16 | Using where                                  |
+----+-------------+-------+----------+------------------+----------+---------+----------+------+----------------------------------------------+

mysql> explain select t.id from ss_t t, ss_g g  where  t.gal_id=g.id and  g.keywords
regexp 'doctor|mysql|world|girl' order by  rand() limit 1;

+----+-------------+-------+--------+---------------+---------+---------+--------------+------+---------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref          | rows | Extra                           |
+----+-------------+-------+--------+---------------+---------+---------+--------------+------+---------------------------------+
|  1 | SIMPLE      | t     | ALL    | gal_id        | NULL    		|    NULL 	| NULL         | 1924		| Using temporary; Using filesort |
|  1 | SIMPLE      | g    | eq_ref| PRIMARY | PRIMARY 	|       4 		| sst.t.gal_id |    1 		| Using where                     |
+----+-------------+-------+--------+---------------+---------+---------+--------------+------+---------------------------------+

mysql> explain select id from ss_g where keywords regexp 'doctor|mysql|world|girl' order
by rand() limit 1;

+----+-------------+--------------+------+---------------+------+---------+------+-------+----------------------------------------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref  | rows  | Extra                                        |
+----+-------------+--------------+------+---------------+------+---------+------+-------+----------------------------------------------+
|  1 | SIMPLE      | ss_g 		| ALL  | NULL          | NULL |    NULL | NULL | 97900 | Using where; Using temporary; Using filesort |
+----+-------------+--------------+------+---------------+------+---------+------+-------+----------------------------------------------+

mysql> explain select id from ss_g where match(keywords) against ('+(doctor mysql world
girl)' in boolean mode) order by rand() limit 1;

+----+-------------+--------------+----------+---------------+----------+---------+------+------+----------------------------------------------+
| id | select_type | table        | type     | possible_keys | key      | key_len | ref  | rows | Extra                                        |
+----+-------------+--------------+----------+---------------+----------+---------+------+------+----------------------------------------------+
|  1 | SIMPLE      | ss_g		 | fulltext | keywords      | keywords |       0 |      |    1 | Using where; Using temporary; Using filesort |
+----+-------------+--------------+----------+---------------+----------+---------+------+------+----------------------------------------------+
[14 Feb 2005 22:54] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".