Bug #13499 Sytem locks on systems with skip_external_locking=ON
Submitted: 26 Sep 2005 20:19 Modified: 28 Nov 2005 8:37
Reporter: Gregert Johnson Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:4.0.18-standard OS:Linux (SuSE 9.1PRO Lnx Kernel 2.6.4-52)
Assigned to: CPU Architecture:Any

[26 Sep 2005 20:19] Gregert Johnson
Description:
We have a production system in the field on which MySQL occasionally locks up on a very simple query on an otherwise quiescent system.  The "show processlist" command reports that the connection in question is in the "System lock" state.  The reference manual states that this means that "the thread is waiting to get an external system lock for the table."  However, this system is not using external locking, so that external locking is presumably disabled.  When this happens, the only way to recover seems to be to restart the mysqld server.

Only MyISAM tables are being used.

"show processlist" output:

Id	User	Host	db	Command	Time	State	Info
12910	hmapp	localhost:43056	hammer_monitor	Query	2399	System lock	SELECT endpt_key FROM endpoint WHERE endpt_id = 'aaln/1@stlmo134024907'
12911	hmapp	localhost:43057	hammer_monitor	Sleep	11		NULL
17430	hmapp	localhost:43181	hammer_monitor	Sleep	30		NULL
19008	root	localhost	NULL	Query	0	NULL	show processlist

Table definition:

desc endpoint;
+-----------+------------------+------+-----+---------+----------------+
| Field     | Type             | Null | Key | Default | Extra          |
+-----------+------------------+------+-----+---------+----------------+
| endpt_key | int(10) unsigned |      | PRI | NULL    | auto_increment |
| endpt_id  | varchar(80)      |      | UNI |         |                |
+-----------+------------------+------+-----+---------+----------------+

Locking status:  we do not have the skip-external-locking parameter set in my.cnf, but that is evidently the default on this system:

show variables like '%locking%;
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| skip_external_locking | ON    |
+-----------------------+-------+

How to repeat:
We have not been able to reproduce this condition in our lab.  It happens in the field after several days of uptime.

Suggested fix:
Unknown.
[28 Sep 2005 10:06] Valeriy Kravchuk
Thank you for a bug report.

I want to explore all possible code paths that can lead to the "System lock" state, but would like to do it on a latest code base.

Have you tried to use newer versions, in particular, 4.0.26? Please, try to use it and report if the described behaviour still present.

Any additional information on how to repeat this situation by a simple sequence of actions is also welcomed.
[28 Sep 2005 16:19] Gregert Johnson
I shall see about upgrading the version to 4.0.26, although we'll have to deal with the fact that the system in question is deployed at a customer site.

In the meantime, here is more information about the event:

The mysql connections were monitored every 60 seconds via the "show processlist" command over a period of 7 days.  The "System lock" state appeared after 6646 minutes, or about 5 days.  The query in question is executed very frequently on the same connection (possibly several times a second), along with a number of multi-row INSERTS into other tables.  Over the 5 day period the query was observed to be in-progress in 69 out the the 6646 monitoring points.  The "State" information reported for these 69 "show processlist" outputs was as follows:

    cleaning up: 1
    end4: 1
    init: 2
    NULL: 2
    Opening tables: 1
    query end: 1
    Sending data: 1
    statistics: 56
    Writing to net: 4

The size of the table involved in the query is not large, under 40,000 rows.  It is a simple (non-merge) MyISAM table with two indexes (primary + another unique index).  Operations are performed on several other tables on the same connection, some of which are large MERGE tables, with UNION component tables of a million rows each.  They do not seem to be involved in the "System lock" condition, however.
[7 Oct 2005 14:14] Valeriy Kravchuk
The only place in code where "System lock" state is set is in sql/lock.cc file (line 133 in 4.1, for example). Looks like it is tried each time server tries to lock a table (it is mysql_lock_tables function). So, looks like your skip-external-locking parameter setting does not matter and does not prevent this state, in theory.

There is also the following note in the manual (http://dev.mysql.com/doc/mysql/en/system.html):

"Note that the --skip-external-locking  option does not affect MySQL's functionality as long as you run only one server. Just remember to take down the server (or lock and flush the relevant tables) before you run myisamchk. On some systems this option is mandatory, because the external locking does not work in any case."

Do you have only one server working with these tables (endpoint)? 

Are there any foreign keys referencing to this MyISAM table? If yes, then it (locking) is understandable (you have many simultaneous inserts into them...).

Can you try to convert this table to InnoDB?

Did you have a chance to work on 4.0.26?
[8 Oct 2005 16:19] Sergei Golubchik
Perhaps the manual is not 100% precise here. --skip-external-locking does not prevent "System lock" state, as this state means more than just external system lock - it also includes internal storage engine locks.

Storage engine locks may rely on external system locks - that is flock/locks/etc - and the latter won't be called if --skip-external-locking is used. But usually storage engine needs to do much more than just simple flock(), so MySQL needs to call storage engine locking functions anyway, no matter whether --skip-external-locking was used or not.
[27 Oct 2005 16:47] Gregert Johnson
Mr. Kravchuk: we have only one MySQL server running on the machine in question, and, since all of our tables use the MyISAM engine, foreign key constraints are not involved (or supported).  We've not yet been able to upgrade the system to 4.0.26 as of yet, so I don't have any comparative behavior to report.

Mr. Golubchik: thank you very much for the clarification regarding the "System lock" state.  I was in fact very confused (and concerned) by the manual's implication that this had necessarily to do with external locking.

So, the question remains: under what conditions could a MySQL connection wind up in this state, executing a very simple query on a single MyISAM table, when absolutely nothing is being done on any other connections?
[28 Oct 2005 8:37] Valeriy Kravchuk
Please, provide the additional information (my.cnf, error log) Mark asked you about in the associated issue. I hope, he will be able to answer your question about the reason of 'System lock' state for such a long time.

In frames of this bug report we have to create a repeatable test case, if possible. So, please, describe also your hardware, send the results of ulimit -a command executed as a user running MySQL server. The results of df -k may be also useful.

Then I'll try to simulate your problem in the test environment.
[29 Nov 2005 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[26 Apr 2006 14:15] Brian O'Donnell
We're suffering what appears to be the same problem - MySQL 4.1.10 coming up with 'system locks' which appears to then slow everything down.  We have a single installation of MySQL with MyISAM tables, InnoDB disabled.

I've detailed our issues on the MyISAM forum 
http://forums.mysql.com/read.php?21,85701,85701#msg-85701

would appreciate any info!

Cheers
Brian