Bug #1527 Insert update performance hit and dead lock with FULLTEXT indexes
Submitted: 10 Oct 2003 16:17 Modified: 15 Nov 2003 10:05
Reporter: Dathan Pattishall Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S2 (Serious)
Version:3.23.57 - 4.0.15 OS:Linux (Linux ef248.friendfinderinc.com)
Assigned to: Assigned Account CPU Architecture:Any

[10 Oct 2003 16:17] Dathan Pattishall
Description:
Here is a email thread started on lists@mysql.com

---ORIGINAL MESSAGE---

 have to drop the FULLTEXT INDEX added to a set of dbs containing a TEXT field with an average of 6K of text added to each row, at a growth of 250K rows added a day to a set of 100 tables. The reason for the drop is due to table corruption and non-related query slowdown.
 
DELETES and Frequent UPDATES of a FULLTEXT INDEX column, IMO are the root cause of table corruption and query slow down time. 
 
The reason for this opinion is based on the situation where after adding the FULL TEXT INDEX to the body of each user's message, table corruption occurred everyday, and query time of non related queries to a full text column multiplied by 10 or more. The application made frequents updates to a table and a batch job or a user action would cause deletes.
 
Here are the corruption errors.
Can't open file: 'messages_06.MYI'. (errno: 145)

The fix
REPAIR TABLE messages_06;
 
I would see dead lock queries as well: Explanation below.
 
Prior to the FULL TEXT INDEX queries normally took no more then a second. These same queries took nearly 5 min before I killed them. The stuck state was either waiting for table (btw no alter/flush change was going on) or Locked, but not locked on any particular query. The second granularity that "show full processlist" displays did not indicate which query was the problem. 
 
 
If mysql.com is interested in this issue, I can send a copy the table format etc. In the meantime I have to drop the FULL TEXT INDEX.
 
 
 
 
- Dathan Vance Pattishall
  - Sr. Programmer and mySQL DBA for FriendFinder Inc.
  - http://friendfinder.com/go/p40688

-->-----Original Message-----

-->From: Sergei Golubchik [mailto:serg@mysql.com]

-->Sent: Thursday, October 09, 2003 3:05 AM

-->To: Dathan Vance Pattishall

-->Cc: mysql@lists.mysql.com

-->Subject: Re: 4.0.15 FULLTEXT INDEX

-->

-->Hi!

-->

-->On Oct 07, Dathan Vance Pattishall wrote:

-->> I have to drop the FULLTEXT INDEX added to a set of dbs containing a

-->> TEXT field with an average of 6K of text added to each row, at a growth

-->> of 250K rows added a day to a set of 100 tables. The reason for the

-->drop

-->> is due to table corruption and non-related query slowdown.

-->>

-->> If mysql.com is interested in this issue, I can send a copy the table

-->> format etc. In the meantime I have to drop the FULL TEXT INDEX.

 

 

-->> DELETES and Frequent UPDATES of a FULLTEXT INDEX column, IMO are the

-->> root cause of table corruption and query slow down time.

-->

-->Can you create a test case for this ?

 

I rolled back the change. I would have to re-implement the full text index, to produce then environment.

 

I can give you an indication to the traffic pattern, yet I cannot duplicate the environment right now, our users are adversely affected.

 

 

-->

-->> The reason for this opinion is based on the situation where after

-->adding

-->> the FULL TEXT INDEX to the body of each user's message, table

-->corruption

-->> occurred everyday, and query time of non related queries to a full text

-->

-->

-->What does REPAIR say ?

 

 

Status OK

 

-->

-->> I would see dead lock queries as well: Explanation below.

-->>

-->> Prior to the FULL TEXT INDEX queries normally took no more then a

-->> second. These same queries took nearly 5 min before I killed them.

-->

-->What queries ?

 

 

select id, from_handle, sent, is_read, is_replied, highlight, from_pwsid, compatibility, subject FROM messages_47 WHERE to_pwsid='20188347_88351' AND folder='inbox' ORDER by sent DESC LIMIT 0,50

 

 

The column known as body had the full text index and as you can tell in the above query there is nothing used that is related to body.

 

Here is a typical table structure.

 

 

+---------------+--------------+------+-----+---------+----------------+

| Field         | Type         | Null | Key | Default | Extra          |

+---------------+--------------+------+-----+---------+----------------+

| id            | int(11)      |      | PRI | NULL    | auto_increment |

| to_handle     | varchar(25)  | YES  | MUL | NULL    |                |

| to_pwsid      | varchar(25)  |      | PRI |         |                |

| from_handle   | varchar(25)  | YES  | MUL | NULL    |                |

| from_pwsid    | varchar(25)  | YES  | MUL | NULL    |                |

| folder        | varchar(25)  | YES  | MUL | NULL    |                |

| is_read       | tinyint(1)   | YES  |     | NULL    |                |

| is_replied    | tinyint(1)   | YES  |     | NULL    |                |

| size          | decimal(8,1) | YES  |     | NULL    |                |

| attach1       | int(11)      | YES  |     | NULL    |                |

| attach2       | int(11)      | YES  |     | NULL    |                |

| attach3       | int(11)      | YES  |     | NULL    |                |

| attach_name1  | varchar(128) | YES  |     | NULL    |                |

| attach_name2  | varchar(128) | YES  |     | NULL    |                |

| attach_name3  | varchar(128) | YES  |     | NULL    |                |

| sent          | datetime     | YES  | MUL | NULL    |                |

| migrated      | int(11)      |      | MUL | 0       |                |

| subject       | varchar(128) | YES  |     | NULL    |                |

| from_ip       | varchar(16)  | YES  | MUL | NULL    |                |

| body          | text         |      |     |         |                |

| highlight     | tinyint(10)  |      |     | 0       |                |

| compatibility | smallint(10) |      |     | 0       |                |

+---------------+--------------+------+-----+---------+----------------+

 

-->

-->> The stuck state was either waiting for table (btw no alter/flush

-->> change was going on) or Locked, but not locked on any particular

-->> query. The second granularity that "show full processlist" displays

-->> did not indicate which query was the problem.

-->

-->What does SHOW PROCESSLIST show ?

 

Selects or inserts taking greater then 5 min prior to death/completion. Death is caused by mysqladmin kill $id

How to repeat:
Here is a the table format

Make 100 of these tables in the same db 00-100

CREATE TABLE `messages_00` (
  `id` int(11) NOT NULL auto_increment,
  `to_handle` varchar(25) default NULL,
  `to_pwsid` varchar(25) NOT NULL default '',
  `from_handle` varchar(25) default NULL,
  `from_pwsid` varchar(25) default NULL,
  `folder` varchar(25) default NULL,
  `is_read` tinyint(1) default NULL,
  `is_replied` tinyint(1) default NULL,
  `size` decimal(8,1) default NULL,
  `attach1` int(11) default NULL,
  `attach2` int(11) default NULL,
  `attach3` int(11) default NULL,
  `attach_name1` varchar(128) default NULL,
  `attach_name2` varchar(128) default NULL,
  `attach_name3` varchar(128) default NULL,
  `sent` datetime default NULL,
  `migrated` int(11) NOT NULL default '0',
  `subject` varchar(128) default NULL,
  `from_ip` varchar(16) default NULL,
  `body` text NOT NULL,
  `highlight` tinyint(10) NOT NULL default '0',
  `compatibility` smallint(10) NOT NULL default '0',
  PRIMARY KEY  (`id`,`to_pwsid`),
  KEY `th` (`to_handle`),
  KEY `tpid` (`to_pwsid`),
  KEY `fh` (`from_handle`),
  KEY `fpid` (`from_pwsid`),
  KEY `sent` (`sent`),
  KEY `mig` (`migrated`),
  KEY `folder` (`folder`),
  KEY `from_ip` (`from_ip`),
  KEY `id_fol_read` (`to_pwsid`,`folder`,`is_read`),
  FULLTEXT 'body' (body) /* added this from the cut and paste */
) TYPE=MyISAM

Insert over 130K rows,

continuously insert rows, remove rows, select a range of rows and at the end of 72 hours the table should have nearly reseted itself. 

This is an idea on the number of delets / inserts that occur.

What you will hit are dead locks displaying status fields of 
waiting for table status 
Locked
sorting result on all queries

I can attach some rrd files for you to see a before and after of the system if need be.
[12 Oct 2003 12:18] MySQL Verification Team
Sergei I put here verified in fault of another suitable status option.
Please continue your analyze/discussion you began at general list.
[14 Oct 2003 6:22] Sergei Golubchik
After you got corrupted table, try CHECK TABLE - what does it say ?

About slowdown - technically, there could be a slowdown on unrelated queries, if you perform many FULLTEXT searches. As fulltext index uses key cache more intensively than normal indexes, it can be that your keycache is full and fulltext searches are constantly removing key pages of normal indexes from the chache. Thus queries that use normal indexes run, almost without key cache - and it's slow.

See SHOW STATUS for values of Key_blocks_used, Key_read_requests, Key_reads, Key_write_requests, Key_writes.
[24 Oct 2003 15:43] Dathan Pattishall
So DeadLock queries or queries that run for a long period of time are related to the key cache being full or unavailable? So, then there is a double gotcha or draw back: Frequent updates to a FULLTEXT INDEX will slow the entire system down? From what I have read about the key_cache updates are slowed down by the nature of this cache.

So optimizing the key cache could be a double win for the system as a hole and this new type of index. 

Or could there be a separate key cache definition for FULLTEXT INDEXes to use?
Imagine you have a 4GB system. Have a 1.5 GB key_cache for every other column EXCEPT FULL TEXT and 1.5 GB ftkey_cache for this column type.
[27 Oct 2003 3:10] Sergei Golubchik
About key cache - it was only a guess, you need to check the values of Key_blocks_used, Key_read_requests, Key_reads, Key_write_requests, Key_writes in the SHOW STATUS know if this guess is applicable in your case.

About multiple key caches - we have this feature as well as other improvements in the key cache code in 4.1.1. If keycache is really the bottleneck in your case, then 4.1.1 may help