MySQL Bugs: #3483: INSERT hangs indefinitely on FULLTEXT table

Bug #3483	INSERT hangs indefinitely on FULLTEXT table
Submitted:	16 Apr 2004 11:24	Modified:	16 Feb 2005 16:08
Reporter:	Don MacAskill
Status:	Closed
Category:	MySQL Server: MyISAM storage engine	Severity:	S1 (Critical)
Version:	4.0.18-max-AMD64	OS:	Linux (Red Hat Enterprise 3)
Assigned to:	Sergei Golubchik	CPU Architecture:	Any

Description:
On a relatively small table (6500 rows, 4 columns, 442K mysqldump) with a FULLTEXT index across 3 of the columns, our DB hangs indefinitely on INSERTs.  Not every INSERT hangs, and I'm still trying to track down exactly what causes it (current hypothesis: it might only occur after some records have been DELETEd).  Interestingly, we have another FULLTEXT table with more than 1,000,000 rows which has never hung on INSERT.  Finally, we have a FULLTEXT table with about 60,000 rows that hangs less frequently, but still does from time to time.

I have to actually shut down MySQL, check the table, and restart.  Killing the thread doesn't work.

The thread says it's in "Query | Update" state, until killed, then it sits at "Killed" forever. 
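
To spot this state without watching the process list by hand, something like the following could be used. It is only a sketch: it assumes the rows of SHOW PROCESSLIST have already been fetched into dictionaries keyed by the usual column names (Id, Command, Time, State, Info), and both the helper name `find_stuck_threads` and the 300-second threshold are made up for illustration.

```python
def find_stuck_threads(rows, min_seconds=300):
    """Return processlist rows that look hung.

    rows: iterable of dicts keyed like SHOW PROCESSLIST columns.
    min_seconds: assumed threshold; tune it for the workload.
    """
    stuck = []
    for row in rows:
        seconds = int(row.get("Time") or 0)
        command = row.get("Command") or ""
        # A thread still in "Query" or "Killed" after a long time is suspect;
        # idle connections ("Sleep") are ignored.
        if command in ("Query", "Killed") and seconds >= min_seconds:
            stuck.append(row)
    return stuck
```

A row with Command "Query" and State "update" that keeps climbing in Time, or one that stays at "Killed", would be returned by this check.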

I tried wrapping the INSERT between "LOCK TABLE" and "UNLOCK TABLES" but that didn't do the trick.  LOW_PRIORITY also didn't do anything.

Finally, I checked what was happening on our Slave.  It's passive, so no clients are connected to it, no SELECTS are being done.  Only the two slave threads and my connection were active, and that MySQL server hung as well.  So it's not some strange race condition between SELECTs and INSERTs, since there aren't any SELECTs going on.

UPDATEs and DELETEs do not cause the hang.

After posting on the list, another user reported the same problem.

Finally, adding "skip-concurrent-insert" to MySQL's startup completely fixed the problem.

How to repeat:
I can get this to happen very frequently on my installation, but I haven't yet discovered completely how to repeat it.  It's very rare that a single INSERT on a fresh MySQL start (after repairing the table) causes the problem.

But 99% of the time, a run of roughly 30-50 UPDATE/DELETE/INSERT statements triggers it.  It could be in my head, but it seems to occur much more frequently after the server has been running for a while.  Perhaps it needs to have a MATCH run against the data first or something.

Suggested fix:
"skip-concurrent-insert" fixes the problem, but of course, isn't ideal.  As for a real fix, I'm not sure without looking at the code.

Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

It's already fixed in 4.0.19 - try it when it is released.

Changelog comment:

   * Fixed a bug in `MATCH ... AGAINST()' searches when another thread
     was doing concurrent inserts into the `MyISAM' table in question.
     The first -- full-text search -- query could return incorrect
     results in this case (e.g. "phantom" rows or not all matching
     rows, even an empty result set).  The easiest way to check whether
     you are affected is to start `mysqld' with
     `--skip-concurrent-insert' switch and see if it helps.

I had seen that fix; however, it didn't appear to match up exactly.  I'm seeing the problem on a server on which there are zero SELECT statements, much less any MATCH ... AGAINST statements.

(It's a slave that no clients connect to, so it only gets the INSERT, UPDATE, and DELETE statements).

It's very possible that this fix does cover this problem, but it didn't sound like the symptoms perfectly align.

Re-reading my earlier comment, it wasn't as clear as it could have been.

The UPDATE and DELETE statements that are being passed to the slave do not have any MATCH ... AGAINST elements in the WHERE clause.  They're simply updating and deleting based on the primary key.  

So there aren't any SELECT, UPDATE, or DELETE statements with MATCH ... AGAINST being run on my slave, but I still see this problem.

I'm having the same problem on Red Hat Enterprise Linux WS release 3 (Taroon), using 4.0.20-Max on a dual processor AMD64 system.

The table is a few short varchars and one MEDIUMBLOB field.  The problem occurs only with inserts, and occurs much more frequently when selects are being run on the same table.

Any word on this?

I run mysql 4.1.5 on an almost idle dual amd64, and also see commands in 'Locked' state. The same happened with 4.1.4. I'll try '--skip-concurrent-insert' now and see if that eliminates the problem.

Is this a mysql-on-(dual)amd64 problem?

OS: Debian 64bit gcc-3.4
kernel: 2.6.8.1-mm4
mysql: 4.1.5 (source)

Please let me know if I can provide more info.

Contacting me might be difficult without an email address :-)
mysql@humilis.net

This one may have to be reopened, because it's still happening more or less exactly as described by Don.

I've tried 4.0.23, 4.1.9, 4.1.10 on two dual Opterons with 8 GB of RAM running Debian 3.1 pure64 gcc3.4. Tried kernels 2.6.11-rc1-mm1 and 2.6.11-rc2. The servers are/were running as a replication master/slave pair.

Getting the master (or standalone) server to hang took quite a bit of poking in a test environment, or just a few short hours of running as a production server. I haven't been able to narrow it down to one specific thing, but it appears to have something to do with indexes, because it ran fine after dropping all of them.

I hacked up a test script that's throwing a random selection of selects and inserts at it at random intervals between 0 and 2 seconds, from 20 concurrent threads.
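
For anyone wanting to reproduce this, a load generator along those lines might look roughly like the sketch below. It is not the script used here. The table name `ft_test`, its columns, the word list, and the `execute` callable are all placeholders: `execute(sql, params)` is assumed to wrap whatever MySQL client library is at hand, with one connection per thread.

```python
import random
import threading
import time

# Hypothetical statements; table and column names are placeholders.
SELECT_SQL = "SELECT id FROM ft_test WHERE MATCH(title, body) AGAINST (%s)"
INSERT_SQL = "INSERT INTO ft_test (title, body) VALUES (%s, %s)"
WORDS = ["alpha", "bravo", "charlie", "delta", "echo"]


def random_statement(rng):
    """Return a (sql, params) pair: a random SELECT or INSERT."""
    if rng.random() < 0.5:
        return SELECT_SQL, (rng.choice(WORDS),)
    body = " ".join(rng.choices(WORDS, k=5))
    return INSERT_SQL, (rng.choice(WORDS), body)


def worker(execute, iterations, max_delay, seed):
    """Send random statements, pausing 0..max_delay seconds between them."""
    rng = random.Random(seed)
    for _ in range(iterations):
        sql, params = random_statement(rng)
        execute(sql, params)
        time.sleep(rng.uniform(0, max_delay))


def run_load(execute, threads=20, iterations=100, max_delay=2.0):
    """Run `threads` concurrent workers and wait for all of them to finish."""
    pool = [
        threading.Thread(target=worker, args=(execute, iterations, max_delay, i))
        for i in range(threads)
    ]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
```

If an INSERT hangs the way it does here, `run_load` will simply never return, so it is worth running it somewhere it can be interrupted.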

I could trigger it by running a "repair table" on it while the test script was running. The repair thread would wait for its turn, then lock everything else out and do its thing, and when it was done the first insert after that would hang indefinitely. (Left it hanging over the weekend, nothing.)

Adding "skip-concurrent-insert" fixes the problem, but cripples the performance, so that's not really an option.

Then tried to run one of the amd64 boxes as a slave off of a Xeon box: Same problem, the replication thread hangs within seconds of starting the server, and again won't budge for anything except kill -9. This is on a server without _any_ other connections except for a processlist.

Could you submit it as a new bug please?
It looks like a REPAIR bug, not the one that was closed.

Ok, I've entered a new one, bug 8555.