Bug #49780 innodb engine hang with long semaphore wait warning in mysqld.log
Submitted: 17 Dec 2009 19:30 Modified: 15 Feb 2011 18:48
Reporter: Max Vohlken Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:5.1.38 OS:Linux (x86 RedHat AS 4)
Assigned to: CPU Architecture:Any
Tags: hang, innodb, long semaphore wait

[17 Dec 2009 19:30] Max Vohlken
Description:
We have been experiencing MySQL hangs where transactions just stop being processed. In the mysqld log file there are entries like:

InnoDB: Warning: a long semaphore wait:
--Thread 3124128 has waited at ../../storage/innobase/include/log0log.ic line 311 for 0.00 seconds the semaphore:
Mutex at 0xb0d138b4 created file log/log0log.c line 738, lock var 0
waiters flag 0
wait has ended

We are using MySQL as the backend database for Rational Build Forge. We experience this problem once or twice a week. We have been able to get the engine to free up by killing one or more of the Build Forge processes. I have been unable to find any pattern in the queries that are running that would point to a problem with the application.

Please help us determine if this is a MySQL bug or a problem with the application. Are there any tools or settings that would help gather more information about what is causing the long semaphore wait.

Also we first experienced the issue with 5.0.67 and have since tried 5.0.84 and now 5.1.38. The issue has persisted.

How to repeat:
I don't have a simple use case to reproduce the problem.
[17 Dec 2009 19:31] Max Vohlken
Here is the output from "show global variable":

Variable_name	Value
auto_increment_increment	1
auto_increment_offset	1
autocommit	ON
automatic_sp_privileges	ON
back_log	50
basedir	/opt/rational/buildforge/console/mysql-5.1.38-linux-i686-icc-glibc23/
big_tables	OFF
binlog_cache_size	1048576
binlog_format	STATEMENT
bulk_insert_buffer_size	67108864
character_set_client	latin1
character_set_connection	latin1
character_set_database	latin1
character_set_filesystem	binary
character_set_results	latin1
character_set_server	latin1
character_set_system	utf8
character_sets_dir	/opt/rational/buildforge/console/mysql-5.1.38-linux-i686-icc-glibc23/share/charsets/
collation_connection	latin1_swedish_ci
collation_database	latin1_swedish_ci
collation_server	latin1_swedish_ci
completion_type	0
concurrent_insert	1
connect_timeout	10
datadir	/opt/rational/buildforge/console/mysql.data/
date_format	%Y-%m-%d
datetime_format	%Y-%m-%d %H:%i:%s
default_week_format	0
delay_key_write	ON
delayed_insert_limit	100
delayed_insert_timeout	300
delayed_queue_size	1000
div_precision_increment	4
engine_condition_pushdown	ON
error_count	0
event_scheduler	OFF
expire_logs_days	2
flush	OFF
flush_time	0
foreign_key_checks	ON
ft_boolean_syntax	+ -><()~*:""&|
ft_max_word_len	84
ft_min_word_len	4
ft_query_expansion_limit	20
ft_stopword_file	(built-in)
general_log	OFF
general_log_file	/var/run/mysqld/mysqld.log
group_concat_max_len	1024
have_community_features	YES
have_compress	YES
have_crypt	YES
have_csv	YES
have_dynamic_loading	YES
have_geometry	YES
have_innodb	YES
have_ndbcluster	NO
have_openssl	NO
have_partitioning	YES
have_query_cache	YES
have_rtree_keys	YES
have_ssl	NO
have_symlink	YES
hostname	sully
identity	0
ignore_builtin_innodb	OFF
init_connect	
init_file	
init_slave	
innodb_adaptive_hash_index	ON
innodb_additional_mem_pool_size	16777216
innodb_autoextend_increment	8
innodb_autoinc_lock_mode	1
innodb_buffer_pool_size	1073741824
innodb_checksums	ON
innodb_commit_concurrency	0
innodb_concurrency_tickets	500
innodb_data_file_path	ibdata1:10M:autoextend
innodb_data_home_dir	
innodb_doublewrite	ON
innodb_fast_shutdown	1
innodb_file_io_threads	4
innodb_file_per_table	ON
innodb_flush_log_at_trx_commit	1
innodb_flush_method	
innodb_force_recovery	0
innodb_lock_wait_timeout	120
innodb_locks_unsafe_for_binlog	OFF
innodb_log_buffer_size	8388608
innodb_log_file_size	268435456
innodb_log_files_in_group	3
innodb_log_group_home_dir	./
innodb_max_dirty_pages_pct	90
innodb_max_purge_lag	100000
innodb_mirrored_log_groups	1
innodb_open_files	300
innodb_rollback_on_timeout	OFF
innodb_stats_on_metadata	ON
innodb_support_xa	ON
innodb_sync_spin_loops	20
innodb_table_locks	ON
innodb_thread_concurrency	16
innodb_thread_sleep_delay	10000
innodb_use_legacy_cardinality_algorithm	ON
insert_id	0
interactive_timeout	28800
join_buffer_size	8388608
keep_files_on_create	OFF
key_buffer_size	33554432
key_cache_age_threshold	300
key_cache_block_size	1024
key_cache_division_limit	100
language	/opt/rational/buildforge/console/mysql-5.1.38-linux-i686-icc-glibc23/share/english/
large_files_support	ON
large_page_size	0
large_pages	OFF
last_insert_id	0
lc_time_names	en_US
license	GPL
local_infile	ON
locked_in_memory	OFF
log	OFF
log_bin	ON
log_bin_trust_function_creators	OFF
log_bin_trust_routine_creators	OFF
log_error	/opt/rational/buildforge/console/logs/mysqld.log
log_output	FILE
log_queries_not_using_indexes	OFF
log_slave_updates	OFF
log_slow_queries	OFF
log_warnings	4
long_query_time	3.000000
low_priority_updates	OFF
lower_case_file_system	OFF
lower_case_table_names	0
max_allowed_packet	16777216
max_binlog_cache_size	4294963200
max_binlog_size	1073741824
max_connect_errors	10
max_connections	120
max_delayed_threads	20
max_error_count	64
max_heap_table_size	67108864
max_insert_delayed_threads	20
max_join_size	18446744073709551615
max_length_for_sort_data	1024
max_prepared_stmt_count	16382
max_relay_log_size	0
max_seeks_for_key	4294967295
max_sort_length	1024
max_sp_recursion_depth	0
max_tmp_tables	32
max_user_connections	0
max_write_lock_count	4294967295
min_examined_row_limit	0
multi_range_count	256
myisam_data_pointer_size	6
myisam_max_sort_file_size	10737418240
myisam_recover_options	DEFAULT
myisam_repair_threads	1
myisam_sort_buffer_size	134217728
myisam_stats_method	nulls_unequal
myisam_use_mmap	OFF
net_buffer_length	16384
net_read_timeout	30
net_retry_count	10
net_write_timeout	60
new	OFF
old	OFF
old_alter_table	OFF
old_passwords	OFF
open_files_limit	8192
optimizer_prune_level	1
optimizer_search_depth	62
optimizer_switch	index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on
pid_file	/var/run/mysqld/mysqld.pid
plugin_dir	/opt/rational/buildforge/console/mysql-5.1.38-linux-i686-icc-glibc23/lib/plugin
port	3306
preload_buffer_size	32768
profiling	OFF
profiling_history_size	15
protocol_version	10
pseudo_thread_id	0
query_alloc_block_size	8192
query_cache_limit	2097152
query_cache_min_res_unit	4096
query_cache_size	67108864
query_cache_type	ON
query_cache_wlock_invalidate	OFF
query_prealloc_size	8192
rand_seed1	
rand_seed2	
range_alloc_block_size	4096
read_buffer_size	2097152
read_only	OFF
read_rnd_buffer_size	16777216
relay_log	
relay_log_index	
relay_log_info_file	relay-log.info
relay_log_purge	ON
relay_log_space_limit	0
report_host	
report_password	
report_port	3306
report_user	
rpl_recovery_rank	0
secure_auth	OFF
secure_file_priv	
server_id	1
skip_external_locking	ON
skip_networking	OFF
skip_show_database	OFF
slave_compressed_protocol	OFF
slave_exec_mode	STRICT
slave_load_tmpdir	/tmp
slave_net_timeout	3600
slave_skip_errors	OFF
slave_transaction_retries	10
slow_launch_time	2
slow_query_log	OFF
slow_query_log_file	/opt/rational/buildforge/console/logs/mysqld.slow.log
socket	/tmp/mysql.sock
sort_buffer_size	8388608
sql_auto_is_null	ON
sql_big_selects	ON
sql_big_tables	OFF
sql_buffer_result	OFF
sql_log_bin	ON
sql_log_off	OFF
sql_log_update	ON
sql_low_priority_updates	OFF
sql_max_join_size	18446744073709551615
sql_mode	
sql_notes	ON
sql_quote_show_create	ON
sql_safe_updates	OFF
sql_select_limit	18446744073709551615
sql_slave_skip_counter	
sql_warnings	OFF
ssl_ca	
ssl_capath	
ssl_cert	
ssl_cipher	
ssl_key	
storage_engine	InnoDB
sync_binlog	0
sync_frm	ON
system_time_zone	EST
table_definition_cache	256
table_lock_wait_timeout	50
table_open_cache	2048
table_type	InnoDB
thread_cache_size	8
thread_handling	one-thread-per-connection
thread_stack	196608
time_format	%H:%i:%s
time_zone	SYSTEM
timed_mutexes	OFF
timestamp	1261077826
tmp_table_size	67108864
tmpdir	/tmp
transaction_alloc_block_size	8192
transaction_prealloc_size	4096
tx_isolation	REPEATABLE-READ
unique_checks	ON
updatable_views_with_limit	YES
version	5.1.38-log
version_comment	MySQL Community Server (GPL)
version_compile_machine	i686
version_compile_os	pc-linux-gnu
wait_timeout	28800
warning_count	0

I just changed the innodb_max_purge_lag from 0 to 100000 based on the suggestions in another bug report. It is too soon to tell if that has helped.
[18 Dec 2009 10:27] Valeriy Kravchuk
This wait happens in log_reserve_and_write_fast() fucntion. I see this:

InnoDB: Warning: a long semaphore wait:
--Thread 4778912 has waited at ../../storage/innobase/include/log0log.ic line 311 for 3421.00 seconds the semaphore:
Mutex at 0xb0d138b4 created file log/log0log.c line 738, lock var 1
waiters flag 0
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 1, pwrites 0

and later:

--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
I/O thread 3 state: waiting for i/o request (write thread)
Pending normal aio reads: 0, aio writes: 0,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
2154356 OS file reads, 173835257 OS file writes, 172037680 OS fsyncs
0.94 reads/s, 16384 avg bytes/read, 778.01 writes/s, 777.51 fsyncs/s

...

---
LOG
---
Log sequence number 758 4098433219
Log flushed up to   758 4098433219
Last checkpoint at  758 3529617112
0 pending log writes, 0 pending chkp writes
171302136 log i/o's done, 776.76 log i/o's/second

So, looks like you have too many fsync() operations for the hardware. How many CPUs/cores do you have? What disk subsystem do you have?

Please, try to get the results of

vmstat 1 30

next time you'll note long waits like this (if you'll catch it before the crash).

The results of SHOW GLOBAL STATUS will be also useful.
[18 Dec 2009 17:52] Max Vohlken
Info about our system:

$ uname -a
Linux sully 2.6.9-67.0.7.ELsmp #1 SMP Wed Feb 27 04:48:20 EST 2008 i686 i686 i386 GNU/Linux

8 CPUs - Intel(R) Xeon(TM) CPU 3.00GHz
8 GB of RAM

The database files are stored on a mounted NFS filesystem from a NetApp filer.
[18 Dec 2009 17:54] Max Vohlken
So what controls the number of fsync() operations and is this something that is configurable?
[18 Dec 2009 19:01] Valeriy Kravchuk
Please, send the results of

free

Linux command. As you have 32-bit OS, mysqld process can not use more than 3G of memory in total. I'd like to check if we can allocate some more memory to innodb_buffer_pool. This may help to reduce file I/O. 

As for log I/O, do you have any batter-backed write cache in your NetApp storage?
[18 Dec 2009 21:47] Max Vohlken
The output from free:

             total       used       free     shared    buffers     cached
Mem:       8308576    5054096    3254480          0       2764     898656
-/+ buffers/cache:    4152676    4155900
Swap:      4194296          0    4194296

Also top says the virtual memory value for mysqld is 1364m.
[2 Jan 2010 12:26] Valeriy Kravchuk
So, you have:

innodb_buffer_pool_size	1073741824

and there are 3+G of RAM that is still free. Please, try to set

innodb_buffer_pool_size = 1400M

and check if it will help to prevent long semaphore waits.
[3 Feb 2010 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[3 Feb 2010 4:55] Max Vohlken
I increased the innodb_buffer_pool_size to 2G and the long semaphore wait still occurred. I decided to move the database files to local storage instead of NFS mounted storage from a NetApp. So far we have not gotten any more long semaphore waits. Unfortunately storing the database files on the netapp was key to our distaster recovery plan. So do you have any ideas if this is a problem with the NFS client in Linux or a problem with the NFS server supplied by the NetApp? Is storing of the database files on an NFS filesystem considered a supported configuration?
[5 Feb 2010 6:47] Sveta Smirnova
Thank you for the feedback.

> So do you have any ideas if this is a problem with the NFS client in Linux or a problem with the NFS server supplied by the NetApp? Is storing of the database files on an NFS filesystem
considered a supported configuration?

Yes, using NFS is supported. Even more: we use NFS in our daily work.
[5 Feb 2010 7:26] Sveta Smirnova
I searched our internal knowledge base for this problem.

Typical problem with MySQL on NFS is locking. I am afraid we can do nothing to fix this at our side if this is the case. If you are sure nobody else uses MySQL data files, please try mount NFS volumes there you store them with locking turned off.
[6 Mar 2010 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[7 Feb 2011 8:14] Marko Mäkelä
If we had the stack traces from all threads during the hang (or crash), we could tell if this a duplicate of Bug #59733.
[13 Feb 2011 19:10] Max Vohlken
We upgraded to Red Hat Enterprise Linux AS release 4 (Nahant Update 8) and MySQL version 5.1.46 and that seems to have fixed the problem.
[15 Feb 2011 18:48] Sveta Smirnova
Thank you for the feedback.

Closed as "Can't repeat" because last comment.