Bug #74003 The server was crashed because of long semaphore wait
Submitted: 22 Sep 2014 3:08 Modified: 25 Sep 2018 12:27
Reporter: zhai weixiang (OCA) Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:5.6.16 OS:Any
Assigned to: CPU Architecture:Any

[22 Sep 2014 3:08] zhai weixiang
Description:
A mysql instance was crashed because of long semaphore wait . There's no pending write/read..And the workload is very low..

Quoted from error file:
--Thread 47097283892992 has waited at srv0srv.cc line 2651 for 0.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x2aaf86c28e78 created in file trx0purge.cc line 128
a writer (thread id 47097283892992) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file trx0rec.cc line 1457
Last time write locked in file storage/innobase/trx/trx0purge.cc line 1199
OS WAIT ARRAY INFO: reservation count 30099189
--Thread 47094253504256 has waited at buf0buf.cc line 4340 for 0.00 seconds the semaphore:
Mutex at 0x2aaf86fd8068 created file buf0buf.cc line 1381, lock var 1
waiters flag 1
OS WAIT ARRAY INFO: reservation count 30099189
OS WAIT ARRAY INFO: reservation count 30099189
--Thread 47097281791744 has waited at buf0lru.cc line 1251 for 0.00 seconds the semaphore:
Mutex at 0x2aaf86fd8068 created file buf0buf.cc line 1381, lock var 1
waiters flag 1
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: reservation count 30099188
--Thread 47097311270656 has waited at btr0cur.cc line 277 for 951.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x2aaf9005a2c0 created in file buf0buf.cc line 1132
a writer (thread id 47097311270656) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.cc line 277
Last time write locked in file storage/innobase/btr/btr0cur.cc line 277
OS WAIT ARRAY INFO: reservation count 30099188
--Thread 47097308628736 has waited at btr0cur.cc line 277 for 951.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaf9005a2c0 created in file buf0buf.cc line 1132
a writer (thread id 47097311270656) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.cc line 277
Last time write locked in file storage/innobase/btr/btr0cur.cc line 277
OS WAIT ARRAY INFO: reservation count 30099188
--Thread 47097292297984 has waited at buf0flu.cc line 1064 for 951.00 seconds the semaphore:
S-lock on RW-latch at 0x2aaf9005a2c0 created in file buf0buf.cc line 1132
a writer (thread id 47097311270656) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.cc line 277
Last time write locked in file storage/innobase/btr/btr0cur.cc line 277
OS WAIT ARRAY INFO: reservation count 30099188
--Thread 47097323423488 has waited at btr0cur.cc line 277 for 951.00 seconds the semaphore:
X-lock on RW-latch at 0x2aaf9005a2c0 created in file buf0buf.cc line 1132
a writer (thread id 47097311270656) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.cc line 277
Last time write locked in file storage/innobase/btr/btr0cur.cc line 277
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: reservation count 30099188
OS WAIT ARRAY INFO: signal count 314870931
Mutex spin waits 9988650358, rounds 40320022702, OS waits 392040678
RW-shared spins 101883260, rounds 1756875879, OS waits 40149483
RW-excl spins 59078932, rounds 1082283734, OS waits 19814745
Spin rounds per wait: 28.83 mutex, 17.24 RW-shared, 18.32 RW-excl

Active transaction:
MySQL thread id 11, OS thread handle 0x2ad5b2c29700, query id 3877031727 127.0.0.1 root cleaning up
---TRANSACTION 36432877774, ACTIVE 956 sec
2 lock struct(s), heap size 360, 1 row lock(s), undo log entries 1
MySQL thread id 665364, OS thread handle 0x2ad5b3fd0700, query id 3877030539 Waiting for an event from Coordinator
---TRANSACTION 36432877772, ACTIVE 956 sec starting index read
mysql tables in use 1, locked 1
5 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 2
MySQL thread id 665366, OS thread handle 0x2ad5b259c700, query id 3877030541 System lock
---TRANSACTION 36432877768, ACTIVE 956 sec updating or deleting
mysql tables in use 1, locked 1
6 lock struct(s), heap size 1184, 3 row lock(s), undo log entries 2
MySQL thread id 665371, OS thread handle 0x2ad5b2821700, query id 3877030531 System lock
---TRANSACTION 36432877767, ACTIVE 956 sec updating or deleting
mysql tables in use 1, locked 1
8 lock struct(s), heap size 1184, 3 row lock(s), undo log entries 4
MySQL thread id 665372, OS thread handle 0x2ad5b33b8700, query id 3877030544 System lock
---TRANSACTION 36355211130, ACTIVE 43348 sec fetching rows, thread declared inside InnoDB 4986
mysql tables in use 1, locked 0
MySQL thread id 665379, OS thread handle 0x2ad5b45dc700, query id 3877030495 127.0.0.1 root Sending data
SELECT /*!40001 SQL_NO_CACHE */ * FROM `slave_worker_info`
Trx read view will not see trx with id >= 36355211131, sees < 36355211131

How to repeat:
I don't know

Suggested fix:
I don't know
[22 Sep 2014 3:26] zhai weixiang
I think it will be a great help if the mysqld can print backtraces of all threads while crashed because of long semaphore wait... :(
[24 Sep 2014 7:43] zhai weixiang
We found the exact reason toady:

1.  Run mysqldump on slave. mysqldump  opened a readview and start to `SELECT` all tables. But this instance is very large. It need a long time to backup all datasets. The readview became very old.

2.The worker threads frequently update the slave_worker_info table. So the undo list became very long and can't be purged because there the data dump is not finished.

3.Finally mysqldump began to dump SLAVE_WORKER_INFO table, and tried to built old version of the record, HOLDING "S" lock ON the block.

4. The worker threads tried to update  SLAVE_WORKER_INFO , but can not acquire X LOCK of the related block.

5.  A long semaphore waiting happened and the server was crashed...

Some advice:
1. For mysqldump, dump tables in mysql database first and then others..
2. For innodb, release S or X lock on the page while taking too long time to build old version of the record and then retry..
[25 Sep 2014 3:20] liu hickey
Why not add backtrace dumping before aborting due to long semaphore waiting? Just like:

diff --git a/storage/innobase/srv/srv0srv.cc b/storage/innobase/srv/srv0srv.cc
index cf8b288..26d97ff 100644
--- a/storage/innobase/srv/srv0srv.cc
+++ b/storage/innobase/srv/srv0srv.cc
@@ -1661,6 +1661,25 @@ exit_func:
 }
 
 /*********************************************************************//**
+Dump the backtrace for self for debugging.
+*/
+UNIV_INTERN
+void dump_backtrace()
+{
+  pid_t pid;
+  char cmd[512];
+  pid = getpid();
+  snprintf(cmd, 512,
+      "gdb -ex 'set pagination 0' -ex 'thread apply all bt' -batch -p %d",
+      pid);
+  fprintf(stderr,
+      "Start dumping backtrace for self with pid=%d\n", pid);
+  system(cmd);
+  fprintf(stderr,
+      "Finish dumping backtrace\n");
+}
+
+/*********************************************************************//**
 A thread which prints warnings about semaphore waits which have lasted
 too long. These can be used to track bugs which cause hangs.
 @return	a dummy parameter */
@@ -1739,7 +1758,6 @@ loop:
 	    && sema == old_sema && os_thread_eq(waiter, old_waiter)) {
 		fatal_cnt++;
 		if (fatal_cnt > 10) {
-
 			fprintf(stderr,
 				"InnoDB: Error: semaphore wait has lasted"
 				" > %lu seconds\n"
@@ -1747,6 +1765,8 @@ loop:
 				" because it appears to be hung.\n",
 				(ulong) srv_fatal_semaphore_wait_threshold);
 
+                        dump_backtrace();
+
 			ut_error;
 		}
 	} else {
[10 Oct 2014 5:23] zhai weixiang
At least I think mysqldump can be modified to always dump mysql database first to avoid this problem.
[10 Oct 2014 6:19] MySQL Verification Team
I think dumping mysql database first would help prevent this specific issue.

But what if some user is doing millions of updates to his zzz database tables, then i suppose you'd meet similar symptoms? (also note http://bugs.mysql.com/bug.php?id=54455 )
[10 Oct 2014 6:49] zhai weixiang
Both have the same root cause.  Modifying mysqldump is simple.  Fixing the root cause is complicate (possibly)   :) 

This is a real life problem. We backup data using mysqldump on slave and the dataset per instance is usually very very large (so may need several days to finish the backup) .
[21 Aug 2018 15:31] MySQL Verification Team
Hi,

Thank you for your bug report.

First of all, long semaphore waits are quite common. In some cases it is due to a bug, but in some other cases it is a problem with misconfiguration.

What is most important is that you have reported this with one very, very old version / release. Can you please tell us whether you observe the same behaviour with latest 5.7 or 8.0 ???

Thanks in advance.
[21 Aug 2018 18:38] Mark Callaghan
I assume any table that is small and frequently updated has a chance of reproducing this problem. Dumping the mysql tables first helps but won't avoid the problems with user tables.
[22 Aug 2018 13:20] MySQL Verification Team
Hi,

Beside other info that I requested, we would be grateful to have a repeatable test case. It could be a descriptive one, but if we fail to repeat it, then we would need a full-fledged test case.
[22 Sep 2018 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".