MySQL Bugs: #74003: The server was crashed because of long semaphore wait

Bug #74003	The server was crashed because of long semaphore wait
Submitted:	22 Sep 2014 3:08	Modified:	25 Sep 2018 12:27
Reporter:	zhai weixiang (OCA)	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server	Severity:	S3 (Non-critical)
Version:	5.6.16	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
A mysql instance was crashed because of long semaphore wait . There's no pending write/read..And the workload is very low..

Quoted from error file:
--Thread 47097283892992 has waited at srv0srv.cc line 2651 for 0.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x2aaf86c28e78 created in file trx0purge.cc line 128
a writer (thread id 47097283892992) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file trx0rec.cc line 1457
Last time write locked in file storage/innobase/trx/trx0purge.cc line 1199
OS WAIT ARRAY INFO: reservation count 30099189
--Thread 47094253504256 has waited at buf0buf.cc line 4340 for 0.00 seconds the semaphore:
Mutex at 0x2aaf86fd8068 created file buf0buf.cc line 1381, lock var 1
waiters flag 1
OS WAIT ARRAY INFO: reservation count 30099189
OS WAIT ARRAY INFO: reservation count 30099189
--Thread 47097281791744 has waited at buf0lru.cc line 1251 for 0.00 seconds the semaphore:
Mutex at 0x2aaf86fd8068 created file buf0buf.cc line 1381, lock var 1
waiters flag 1
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: reservation count 30099188
--Thread 47097311270656 has waited at btr0cur.cc line 277 for 951.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x2aaf9005a2c0 created in file buf0buf.cc line 1132
a writer (thread id 47097311270656) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.cc line 277
Last time write locked in file storage/innobase/btr/btr0cur.cc line 277
OS WAIT ARRAY INFO: reservation count 30099188
--Thread 47097308628736 has waited at btr0cur.cc line 277 for 951.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaf9005a2c0 created in file buf0buf.cc line 1132
a writer (thread id 47097311270656) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.cc line 277
Last time write locked in file storage/innobase/btr/btr0cur.cc line 277
OS WAIT ARRAY INFO: reservation count 30099188
--Thread 47097292297984 has waited at buf0flu.cc line 1064 for 951.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaf9005a2c0 created in file buf0buf.cc line 1132
a writer (thread id 47097311270656) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.cc line 277
Last time write locked in file storage/innobase/btr/btr0cur.cc line 277
OS WAIT ARRAY INFO: reservation count 30099188
--Thread 47097323423488 has waited at btr0cur.cc line 277 for 951.00 seconds the semaphore:
X-lock on RW-latch at 0x2aaf9005a2c0 created in file buf0buf.cc line 1132
a writer (thread id 47097311270656) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.cc line 277
Last time write locked in file storage/innobase/btr/btr0cur.cc line 277
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: signal count 314870931
Mutex spin waits 9988650358, rounds 40320022702, OS waits 392040678
RW-shared spins 101883260, rounds 1756875879, OS waits 40149483
RW-excl spins 59078932, rounds 1082283734, OS waits 19814745
Spin rounds per wait: 28.83 mutex, 17.24 RW-shared, 18.32 RW-excl

Active transaction:
MySQL thread id 11, OS thread handle 0x2ad5b2c29700, query id 3877031727 127.0.0.1 root cleaning up
---TRANSACTION 36432877774, ACTIVE 956 sec
2 lock struct(s), heap size 360, 1 row lock(s), undo log entries 1
MySQL thread id 665364, OS thread handle 0x2ad5b3fd0700, query id 3877030539 Waiting for an event from Coordinator
---TRANSACTION 36432877772, ACTIVE 956 sec starting index read
mysql tables in use 1, locked 1
5 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 2
MySQL thread id 665366, OS thread handle 0x2ad5b259c700, query id 3877030541 System lock
---TRANSACTION 36432877768, ACTIVE 956 sec updating or deleting
mysql tables in use 1, locked 1
6 lock struct(s), heap size 1184, 3 row lock(s), undo log entries 2
MySQL thread id 665371, OS thread handle 0x2ad5b2821700, query id 3877030531 System lock
---TRANSACTION 36432877767, ACTIVE 956 sec updating or deleting
mysql tables in use 1, locked 1
8 lock struct(s), heap size 1184, 3 row lock(s), undo log entries 4
MySQL thread id 665372, OS thread handle 0x2ad5b33b8700, query id 3877030544 System lock
---TRANSACTION 36355211130, ACTIVE 43348 sec fetching rows, thread declared inside InnoDB 4986
mysql tables in use 1, locked 0
MySQL thread id 665379, OS thread handle 0x2ad5b45dc700, query id 3877030495 127.0.0.1 root Sending data
SELECT /*!40001 SQL_NO_CACHE */ * FROM `slave_worker_info`
Trx read view will not see trx with id >= 36355211131, sees < 36355211131

How to repeat:
I don't know

Suggested fix:
I don't know

I think it will be a great help if the mysqld can print backtraces of all threads while crashed because of long semaphore wait... :(

We found the exact reason toady:

1.  Run mysqldump on slave. mysqldump  opened a readview and start to `SELECT` all tables. But this instance is very large. It need a long time to backup all datasets. The readview became very old.

2.The worker threads frequently update the slave_worker_info table. So the undo list became very long and can't be purged because there the data dump is not finished.

3.Finally mysqldump began to dump SLAVE_WORKER_INFO table, and tried to built old version of the record, HOLDING "S" lock ON the block.

4. The worker threads tried to update  SLAVE_WORKER_INFO , but can not acquire X LOCK of the related block.

5.  A long semaphore waiting happened and the server was crashed...

Some advice:
1. For mysqldump, dump tables in mysql database first and then others..
2. For innodb, release S or X lock on the page while taking too long time to build old version of the record and then retry..

Why not add backtrace dumping before aborting due to long semaphore waiting? Just like:

diff --git a/storage/innobase/srv/srv0srv.cc b/storage/innobase/srv/srv0srv.cc
index cf8b288..26d97ff 100644
--- a/storage/innobase/srv/srv0srv.cc
+++ b/storage/innobase/srv/srv0srv.cc
@@ -1661,6 +1661,25 @@ exit_func:
 }
 
 /*********************************************************************//**
+Dump the backtrace for self for debugging.
+*/
+UNIV_INTERN
+void dump_backtrace()
+{
+  pid_t pid;
+  char cmd[512];
+  pid = getpid();
+  snprintf(cmd, 512,
+      "gdb -ex 'set pagination 0' -ex 'thread apply all bt' -batch -p %d",
+      pid);
+  fprintf(stderr,
+      "Start dumping backtrace for self with pid=%d\n", pid);
+  system(cmd);
+  fprintf(stderr,
+      "Finish dumping backtrace\n");
+}
+
+/*********************************************************************//**
 A thread which prints warnings about semaphore waits which have lasted
 too long. These can be used to track bugs which cause hangs.
 @return	a dummy parameter */
@@ -1739,7 +1758,6 @@ loop:
 	    && sema == old_sema && os_thread_eq(waiter, old_waiter)) {
 		fatal_cnt++;
 		if (fatal_cnt > 10) {
-
 			fprintf(stderr,
 				"InnoDB: Error: semaphore wait has lasted"
 				" > %lu seconds\n"
@@ -1747,6 +1765,8 @@ loop:
 				" because it appears to be hung.\n",
 				(ulong) srv_fatal_semaphore_wait_threshold);
 
+                        dump_backtrace();
+
 			ut_error;
 		}
 	} else {

At least I think mysqldump can be modified to always dump mysql database first to avoid this problem.

I think dumping mysql database first would help prevent this specific issue.

But what if some user is doing millions of updates to his zzz database tables, then i suppose you'd meet similar symptoms? (also note http://bugs.mysql.com/bug.php?id=54455 )

Both have the same root cause.  Modifying mysqldump is simple.  Fixing the root cause is complicate (possibly)   :) 

This is a real life problem. We backup data using mysqldump on slave and the dataset per instance is usually very very large (so may need several days to finish the backup) .

Hi,

Thank you for your bug report.

First of all, long semaphore waits are quite common. In some cases it is due to a bug, but in some other cases it is a problem with misconfiguration.

What is most important is that you have reported this with one very, very old version / release. Can you please tell us whether you observe the same behaviour with latest 5.7 or 8.0 ???

Thanks in advance.

I assume any table that is small and frequently updated has a chance of reproducing this problem. Dumping the mysql tables first helps but won't avoid the problems with user tables.

Hi,

Beside other info that I requested, we would be grateful to have a repeatable test case. It could be a descriptive one, but if we fail to repeat it, then we would need a full-fledged test case.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".