Bug #3596 | crash in innodb with mysql 4.1.2 | ||
---|---|---|---|
Submitted: | 29 Apr 2004 1:55 | Modified: | 15 May 2004 12:36 |
Reporter: | jocelyn fournier (Silver Quality Contributor) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
Version: | 4.0.19, 4.1.2 latest bk tree | OS: | Linux (Linux) |
Assigned to: | Heikki Tuuri | CPU Architecture: | Any |
[29 Apr 2004 1:55]
jocelyn fournier
[3 May 2004 9:13]
Marko Mäkelä
Sorry, I couldn't repeat the problem with the current 4.1 tree. Could you please try it with the latest sources and include a test case for crashing the server?
[3 May 2004 10:26]
jocelyn fournier
Hi, I'm currenlty running mysql with the latest bk version available. If I encounter once again the crash I will try to investage more thoroughly the problem, also it doesn't to be linked to any specific query. BTW I noticed the apparition in my mysql directory of several innodb.status.a_number files I've never seen before, and which seems to have appear since the same time I noticed the crash. What are those files, and could this be linked to my crash ? Thanks, Jocelyn
[3 May 2004 10:51]
Marko Mäkelä
I recently made all InnoDB diagnostic output to use a temporary file rather than string buffers. When MySQL starts, InnoDB creates a file innodb.status.<pid>. When it is shut down normally, InnoDB removes the file. The file should contain the same information as SHOW INNODB STATUS, and it should be updated every 15 seconds. You can safely remove the files, or maybe have a look at them first, in case there is some hint on what is causing the crashes.
[3 May 2004 12:45]
jocelyn fournier
Hi, I've just taken a look in the innodb.status files, and the data loggued are quite strange. It's seems at each update of the log file, the beginning of the log is more and more truncated : Here is some snips of the log file : <beginning of the file> ===================================== 040429 1:24:17 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 16 seconds ---------- SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count 5805, signal count 5068 Mutex spin waits 47160, rounds 199865, OS waits 3026 RW-shared spins 3892, OS waits 1876; RW-excl spins 688, OS waits 526 ------------ TRANSACTIONS ------------ Trx id counter 0 578243671 Purge done for trx's n:o < 0 578243587 undo n:o < 0 0 Total number of lock structs in row lock hash table 0 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 0 578243670, not started, process no 31431, OS thread id 36901 MySQL thread id 336051, query id 12118543 localhost mysql ACTION 0 578241032, not started, process no 31466, OS thread id 72776 MySQL thread id 334765, query id 12110969 localhost mysql -------- FILE I/O -------- I/O thread 0 state: waiting for i/o request (insert buffer thread) I/O thread 1 state: waiting for i/o request (log thread) I/O thread 2 state: waiting for i/o request (read thread) I/O thread 3 state: waiting for i/o request (write thread) Pending normal aio reads: 0, aio writes: 0, ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0 Pending flushes (fsync) log: 0; buffer pool: 0 8704 OS file reads, 116195 OS file writes, 28758 OS fsyncs 0.06 reads/s, 16384 avg bytes/read, 2.12 writes/s, 1.06 fsyncs/s ------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf for space 0: size 1, free list len 0, seg size 2, is empty Ibuf for space 0: size 1, free list len 0, seg size 2, 105 inserts, 105 merged recs, 105 merges Hash table size 1106407, used cells 238474, node heap has 265 buffer(s) 151.12 hash searches/s, 10.94 non-hash searches/s --- LOG --- Log sequence number 71 311847382 Log flushed up to 71 311847382 Last checkpoint at 71 311811106 0 pending log writes, 0 pending chkp writes 21582 log i/o's done, 0.81 log i/o's/second ---------------------- BUFFER POOL AND MEMORY ---------------------- Total memory allocated 302986592; in additional pool allocated 992640 Buffer pool size 16384 Free buffers 6514 Database pages 9605 Modified db pages 19 Pending reads 0 Pending writes: LRU 0, flush list 0, single page 0 Pages read 9562, created 43, written 97144 0.06 reads/s, 0.00 creates/s, 1.25 writes/s Buffer pool hit rate 1000 / 1000 -------------- ROW OPERATIONS -------------- 0 queries inside InnoDB, 0 queries in queue Main thread process no. 31402, id 7176, state: sleeping Number of rows inserted 320, updated 63223, deleted 6, read 22847430 0.00 inserts/s, 1.25 updates/s, 0.00 deletes/s, 158.93 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================ writes 21569 log i/o's done, 1.06 log i/o's/second ---------------------- BUFFER POOL AND MEMORY ---------------------- Total memory allocated 302986592; in additional pool allocated 994176 Buffer pool size 16384 Free buffers 6515 Database pages 9604 Modified db pages 0 Pending reads 0 Pending writes: LRU 0, flush list 0, single page 0 Pages read 9561, created 43, written 97124 0.06 reads/s, 0.00 creates/s, 4.00 writes/s Buffer pool hit rate 1000 / 1000 -------------- ROW OPERATIONS -------------- 0 queries inside InnoDB, 0 queries in queue Main thread process no. 31402, id 7176, state: sleeping Number of rows inserted 320, updated 63203, deleted 6, read 22844887 0.06 inserts/s, 1.69 updates/s, 0.00 deletes/s, 188.74 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================ hit rate 1000 / 1000 -------------- ROW OPERATIONS -------------- 0 queries inside InnoDB, 0 queries in queue Main thread process no. 31402, id 7176, state: sleeping Number of rows inserted 318, updated 63155, deleted 6, read 22838534 0.00 inserts/s, 1.37 updates/s, 0.00 deletes/s, 136.80 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================ ----- Total memory allocated 302986592; in additional pool allocated 997248 Buffer pool size 16384 Free buffers 6526 Database pages 9593 Modified db pages 10 Pending reads 0 Pending writes: LRU 0, flush list 0, single page 0 Pages read 9550, created 43, written 96690 0.00 reads/s, 0.00 creates/s, 2.62 writes/s Buffer pool hit rate 1000 / 1000 -------------- ROW OPERATIONS -------------- 0 queries inside InnoDB, 0 queries in queue Main thread process no. 31402, id 7176, state: sleeping Number of rows inserted 317, updated 62993, deleted 6, read 22815382 0.00 inserts/s, 1.25 updates/s, 0.00 deletes/s, 125.62 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================ -hash searches/s --- LOG --- Log sequence number 71 307459125 Log flushed up to 71 307450235 Last checkpoint at 71 307441693 0 pending log writes, 0 pending chkp writes 20696 log i/o's done, 1.06 log i/o's/second ---------------------- BUFFER POOL AND MEMORY ---------------------- Total memory allocated 302986592; in additional pool allocated 1021824 Buffer pool size 16384 Free buffers 6602 Database pages 9517 Modified db pages 10 Pending reads 0 Pending writes: LRU 0, flush list 0, single page 0 Pages read 9474, created 43, written 94728 0.00 reads/s, 0.00 creates/s, 3.50 writes/s Buffer pool hit rate 1000 / 1000 -------------- ROW OPERATIONS -------------- 0 queries inside InnoDB, 0 queries in queue Main thread process no. 31402, id 7176, state: sleeping Number of rows inserted 312, updated 62001, deleted 6, read 22269366 0.00 inserts/s, 1.62 updates/s, 0.00 deletes/s, 7175.74 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================ te thread) Pending normal aio reads: 0, aio writes: 0, ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0 Pending flushes (fsync) log: 0; buffer pool: 0 3618 OS file reads, 5532 OS file writes, 1391 OS fsyncs 0.94 reads/s, 16384 avg bytes/read, 4.31 writes/s, 1.44 fsyncs/s ------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf for space 0: size 1, free list len 0, seg size 2, is empty Ibuf for space 0: size 1, free list len 0, seg size 2, 10 inserts, 10 merged recs, 10 merges Hash table size 1106407, used cells 178654, node heap has 192 buffer(s) 266.73 hash searches/s, 41.18 non-hash searches/s --- LOG --- Log sequence number 71 92506837 Log flushed up to 71 92502290 Last checkpoint at 71 92383088 0 pending log writes, 0 pending chkp writes 1057 log i/o's done, 1.19 log i/o's/second ---------------------- BUFFER POOL AND MEMORY ---------------------- Total memory allocated 302986592; in additional pool allocated 649216 Buffer pool size 16384 Free buffers 12090 Database pages 4102 Modified db pages 49 Pending reads 0 Pending writes: LRU 0, flush list 0, single page 0 Pages read 4101, created 1, written 4537 0.94 reads/s, 0.00 creates/s, 3.06 writes/s Buffer pool hit rate 999 / 1000 -------------- ROW OPERATIONS -------------- 0 queries inside InnoDB, 0 queries in queue Main thread process no. 31402, id 7176, state: sleeping Number of rows inserted 18, updated 3047, deleted 0, read 834837 0.06 inserts/s, 3.31 updates/s, 0.00 deletes/s, 293.54 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================ ---------------------------- END OF INNODB MONITOR OUTPUT ============================ <crash> So it seems something is definitively wrong during the write of the innodb.status.<pid> file. Regards, Jocelyn
[3 May 2004 12:56]
jocelyn fournier
Hi, In fact, I notice the file is not properly removed/cleaned up before innodb trys to write on it. (I initialy was thinking that innodb was appending the data to the file each 15 seconds, which not seems to be the case). Regards, Jocelyn
[3 May 2004 15:18]
Marko Mäkelä
I should have mentioned that the file will contain only one copy of the status output, and it won't be truncated to its actual length when the write completes. If there exist many transactions at some point, the status output will become larger. When the transactions complete, the status output will shrink, and only the initial part of the old status output will be overwritten. You should ignore anything after the first "END OF INNODB MONITOR OUTPUT".
[5 May 2004 10:58]
Marko Mäkelä
ChangeSet@1.1804 in the 4.0 tree will truncate the innodb.status.<pid> file to its actual size. Eventually it will be merged to the 4.1 tree. However, it should not affect the crash you have experienced. I still need a repeatable test case for that.
[5 May 2004 20:07]
jocelyn fournier
Hi, I just want to let you the crash has just happened once again, with a fresh bk tree just built. Unfortunately I still not have any testcase to give to you since I've no clue on why this occurs, and the server could run without any crash for a long amount of time :( Regards, Jocelyn
[11 May 2004 23:00]
jocelyn fournier
Hi, According to what I can see in the log, MySQL seems to fail to write either thd->proc_info or thd->query, and thus putc('\n', f) crashes mysql since it tries to write at EOF. Indeed, in several logs, just after the crash, I can see : ---TRANSACTION 0 603701464, not started, process no 19490, OS thread id 41926730 MySQL thread id 1424189, query id 53191340 localhost mysql ---TRANSACTION 0 0, not started, process no 29145, OS thread id 29032651 ---TRANSACTION 0 618036723, not started, process no 12202, OS thread id 56836129 MySQL thread id 888011, query id 33244972 localhost mysql , not started, process no 12421, OS thread id 59262126 ---TRANSACTION 0 619551739, not started, process no 18957, OS thread id 688171 MySQL thread id 60017, query id 2419874 localhost mysql SACTION 0 619550161, not started, process no 18960, OS thread id 737326 you can see thd->user is written properly, as well as the next putc(' ',f), but the next fputs() seems to not has been written properly and thus, latest putc definitively crashes mysql because of fputs write failure. Note I'm using delayed thread, I don't know if it could have any impact. I can also modify ha_innodb.cc to be able to provide you more infos about this crash, just tell what you need. Thanks and regards, Jocelyn
[12 May 2004 15:29]
Marko Mäkelä
Thank you for your bug report! The crashes are probably caused by race conditions: (1) some other thread may be setting thd->proc_info or thd->query to NULL between the test and call to fputs() (2) the memory area pointed to by thd->query can be freed between the test and the completion of fputs() There are three possibilities why this bug hasn't surfaced before. First, writing to a file is slower than copying memory areas and may introduce context switches. Thus, the "window of opportunity" where the error can occur is bigger. Second, the function used to print at most 150 first characters of the query. Now, everything until the terminating NUL char will be printed, and a memory protection violation is more likely to occur. Third, the SHOW INNODB STATUS will be invoked every 15 seconds by a monitoring thread inside InnoDB. Earlier, it used to be invoked by external clients only. The solution is to assign thd->proc_info or thd->query to a local char* variable and test it against NULL. We will need to protect thd->query with a mutex, and maybe also introduce an upper limit of number of characters to print. The bug should manifest itself in 4.0.19 as well.
[13 May 2004 23:19]
Min Xu
I am interested in helping reproduce and confirm the race condition. I have a automatic tool can be used with MySQL to locate race conditions that manifest during the execution. Could the bug submitter tell me a little bit about the setup? What queries should I run in order to triger this crash? What is the configure of the database? Thanks a lot!
[14 May 2004 8:59]
Marko Mäkelä
You should be able to trigger the race condition by running quick queries in one connection and invoking SHOW INNODB STATUS (or SHOW PROCESSLIST) in another connection.
[14 May 2004 13:50]
Heikki Tuuri
Hi! The bug is mostly fixed in the latest 4.0.20 source tree. I am still studying how to prevent less common races. Regards, Heikki
[14 May 2004 18:47]
Min Xu
Thanks! I will bk and latest tree.
[15 May 2004 12:36]
Heikki Tuuri
Hi! We have now pushed to the 4.0 source tree some additional fixes that should remove also races associated with an obsolete value of thd->query_length. These races could have potentially crashed both SHOW PROCESSLIST and SHOW INNODB STATUS. It will take some time until these patches are merged to 4.1. Regards, Heikki
[6 Nov 2008 11:53]
Marko Mäkelä
This has resurfaced as Bug #38883.
[6 Nov 2008 12:12]
Kristofer Pettersson
Is it possible to track the changes made to the code in this forum? Can you paste a diff to the bug report maybe?
[6 Nov 2008 13:40]
Marko Mäkelä
As far as I remember, this bug was fixed so that InnoDB acquires LOCK_thread_count before printing thd->query of some other thd than current_thd. There is a comment in sql/sql_class.h for thd->query: Note that (A) if we set query = NULL, we must at the same time set query_length = 0, and protect the whole operation with the LOCK_thread_count mutex. And (B) we are ONLY allowed to set query to a non-NULL value if its previous value is NULL. We do not need to protect operation (B) with any mutex. To avoid crashes in races, if we do not know that thd->query cannot change at the moment, one should print thd->query like this: (1) reserve the LOCK_thread_count mutex; (2) check if thd->query is NULL; (3) if not NULL, then print at most thd->query_length characters from it. We will see the query_length field as either 0, or the right value for it. Assuming that the write and read of an n-bit memory field in an n-bit computer is atomic, we can avoid races in the above way. This printing is needed at least in SHOW PROCESSLIST and SHOW INNODB STATUS. The most likely explanation of Bug #38883 is that some MySQL code is setting thd->query=NULL without holding the LOCK_thread_count mutex.