MySQL Bugs: #29068: InnoDB long semaphore wait brings down server

Bug #29068	InnoDB long semaphore wait brings down server
Submitted:	13 Jun 2007 10:38	Modified:	27 Jul 2007 15:32
Reporter:	Mika Raento
Status:	No Feedback
Category:	MySQL Server: InnoDB storage engine	Severity:	S2 (Serious)
Version:	5.0.32	OS:	Linux
Assigned to:	Assigned Account	CPU Architecture:	Any
Tags:	innodb, semaphore wait

Description:
Under heavy load the database hangs and finally shuts down after logging several warnings like this:

InnoDB: Warning: a long semaphore wait:
--Thread 1557134256 has waited at btr0sea.c line 489 for 786.00 seconds the semaphore:
X-lock on RW-latch at 0xb68dcb68 created in file btr0sea.c line 139
a writer (thread id 1557134256) has reserved it in mode  wait exclusive

I know this is similar to #27971, #26834 and #25645, but since it isn't on 64-bit or FreeBSD, isn't due to out-of-memory, and has different line numbers for the read and write lock, I'm hoping this provides useful additional information.

Error log attached (from syslog, as this is the Debian build).
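
When comparing this log with the related reports, it can help to pull the waiting thread, source location, and wait time out of each warning. Below is a minimal Python sketch, assuming the log lines follow the format quoted above. The function names `parse_semaphore_waits` and `summarize_by_location` are hypothetical and not part of any MySQL tool.

```python
import re

# Pattern modeled on the "--Thread ... has waited at FILE line N for S seconds"
# line quoted in this report; other builds may format the line differently.
WAIT_RE = re.compile(
    r"--Thread (?P<thread>\d+) has waited at (?P<file>\S+) line (?P<line>\d+) "
    r"for (?P<seconds>\d+(?:\.\d+)?) seconds"
)


def parse_semaphore_waits(log_text):
    """Return one dict per long-semaphore-wait line found in log_text."""
    waits = []
    for m in WAIT_RE.finditer(log_text):
        waits.append({
            "thread": int(m.group("thread")),
            "file": m.group("file"),
            "line": int(m.group("line")),
            "seconds": float(m.group("seconds")),
        })
    return waits


def summarize_by_location(waits):
    """Map (file, line) to the longest wait, in seconds, seen at that location."""
    longest = {}
    for w in waits:
        key = (w["file"], w["line"])
        longest[key] = max(longest.get(key, 0.0), w["seconds"])
    return longest
```

Because the pattern is searched for anywhere in the text, syslog prefixes in front of each line should not get in the way. Grouping by source location makes it easier to see whether two reports are waiting at the same place in the code.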

How to repeat:
Don't know.

Error log

Attachment: mysql-crash-error-log (application/octet-stream, text), 11.44 KiB.

Same as above, but with .txt so that it can be viewed more easily

Attachment: mysql-crash-error-log.txt (text/plain), 11.44 KiB.

The backtrace doesn't seem useful as it is

0x81c0649 handle_segfault + 681
0x82ee280 srv_error_monitor_thread + 464
0xb7ef10bd _end + -1350531603
0xb7d2b93e _end + -1352389010

which is just the SEGV handler called because InnoDB triggered a SEGV.

Thank you for the problem report. Please try to repeat with a newer version, 5.0.41/5.0.42. If the problem persists, please send your my.cnf and the results of SHOW INNODB STATUS under usual load.

We have only seen this once so far, so I can't repeat it in any way that I know of.

Are there known changes in 5.0.42 for this issue? Of the bugs I found, only #25645 had a patch, and that was in 'Patch approved' state.

Upgrading to 5.0.42 just for this isn't an option; I was hoping that the error log would help you figure out what the bug is.

No way to repeat the problem... OK. Please send your my.cnf or SHOW VARIABLES results then. How many CPUs do you have on that machine?

my.cnf

Attachment: my.cnf (application/octet-stream, text), 1.09 KiB.

1 dual-core Xeon

I think this setting:

thread_cache_size	= 128

may be a reason for your problem. But we need a repeatable test case to verify that. You may just check the usual SHOW INNODB STATUS results for waits with thread_cache_size=0 vs. the current value.

The 'non-repeatability' just took a turn for the worse - db hung again.

I set thread_cache_size to 0 and am currently running myisamchk, since some tables still seem to be MyISAM.

If and when the hang occurs, can I somehow dump relevant data from mysqld? (like forcing a core dump - can't connect to mysql when that happens)

Otherwise, how would I go about logging enough queries and data to try to recreate it?

Ah, on second thought, the inability to log in was most likely due to the db already shutting itself down, rather than the hang.

Mika,

do you have the entire .err log, including the output to stdout?

Some threads are waiting for the InnoDB 'kernel mutex', some are waiting for the InnoDB adaptive hash index latch. I wonder what code path could lead a thread holding the kernel mutex to access the adaptive hash latch.

This might be a bug in InnoDB's /sync, or memory corruption, or an OS bug or a hardware problem.

Having the entire .err log would help.

sync0sync.h in 5.0.32:

#define SYNC_KERNEL             300
#define SYNC_REC_LOCK           299
#define SYNC_TRX_LOCK_HEAP      298
#define SYNC_TRX_SYS_HEADER     290
#define SYNC_LOG                170
#define SYNC_RECV               168
#define SYNC_SEARCH_SYS         160     /* NOTE that if we have a memory
                                        heap that can be extended to the
                                        buffer pool, its logical level is
                                        SYNC_SEARCH_SYS, as memory allocation
                                        can call routines there! Otherwise
                                        the level is SYNC_MEM_HASH. */
#define SYNC_BUF_POOL           150
#define SYNC_BUF_BLOCK          149
#define SYNC_DOUBLEWRITE        140
#define SYNC_ANY_LATCH          135
#define SYNC_THR_LOCAL          133
#define SYNC_MEM_HASH           131
#define SYNC_MEM_POOL           130

Regards,

Heikki
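
To make the level listing above easier to reason about, here is a small Python sketch of a latching-order check. It assumes the simplified rule that a thread may only acquire a latch whose level is strictly below every level it already holds. The real InnoDB checks have per-level special cases and are not reproduced here, and the function name `may_acquire` is hypothetical.

```python
# Latch levels copied from the sync0sync.h excerpt above (5.0.32).
SYNC_KERNEL = 300
SYNC_SEARCH_SYS = 160
SYNC_BUF_POOL = 150


def may_acquire(held_levels, new_level):
    """Simplified ordering rule (an assumption, not InnoDB's exact logic):
    allow a new latch only if its level is strictly below every level
    the thread already holds. With nothing held, any level is allowed."""
    return all(new_level < held for held in held_levels)


# Kernel mutex first, then the adaptive hash (search system) latch.
kernel_then_search = may_acquire([SYNC_KERNEL], SYNC_SEARCH_SYS)

# The reverse order: search system latch first, then the kernel mutex.
search_then_kernel = may_acquire([SYNC_SEARCH_SYS], SYNC_KERNEL)
```

Under this assumed rule, taking the kernel mutex and then the adaptive hash latch is in order, while the reverse is not. The sketch only models ordering, so it says nothing about why a wait would last hundreds of seconds.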

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

If innodb_file_per_table=1 was not used, this could be a duplicate of Bug #59733.