Bug #3596 crash in innodb with mysql 4.1.2
Submitted: 29 Apr 2004 1:55 Modified: 15 May 2004 12:36
Reporter: jocelyn fournier (Silver Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:4.0.19, 4.1.2 latest bk tree OS:Linux (Linux)
Assigned to: Heikki Tuuri CPU Architecture:Any

[29 Apr 2004 1:55] jocelyn fournier
Description:
Hi,

I've just noticed with the latest bk tree than MySQL regularly crashes in InnoDB code with the following stack trace :

0x80c2307 handle_segfault + 647
0x82f57b8 pthread_sighandler + 176
0x83129a9 fputs + 25
0x8146a07 innobase_mysql_print_thd + 199
0x8261bab lock_print_info + 1307
0x81846a4 srv_printf_innodb_monitor + 308
0x8184d05 srv_lock_timeout_and_monitor_thread + 597
0x82f31da pthread_start_thread + 218
0x832351a thread_start + 4

Regards,
  Jocelyn

How to repeat:
I've still no clues on why this crash occurs.
[3 May 2004 9:13] Marko Mäkelä
Sorry, I couldn't repeat the problem with the current 4.1 tree. Could you please try it with the latest sources and include a test case for crashing the server?
[3 May 2004 10:26] jocelyn fournier
Hi,

I'm currenlty running mysql with the latest bk version available.
If I encounter once again the crash I will try to investage more thoroughly the problem, also it doesn't to be linked to any specific query.
BTW I noticed the apparition in my mysql directory of several innodb.status.a_number files I've never seen before, and which seems to have appear since the same time I noticed the crash.
What are those files, and could this be linked to my crash ?

Thanks,
  Jocelyn
[3 May 2004 10:51] Marko Mäkelä
I recently made all InnoDB diagnostic output to use a temporary file rather than string buffers.  When MySQL starts, InnoDB creates a file innodb.status.<pid>. When it is shut down normally, InnoDB removes the file. The file should contain the same information as SHOW INNODB STATUS, and it should be updated every 15 seconds. You can safely remove the files, or maybe have a look at them first, in case there is some hint on what is causing the crashes.
[3 May 2004 12:45] jocelyn fournier
Hi,

I've just taken a look in the innodb.status files, and the data loggued are quite strange.

It's seems at each update of the log file, the beginning of the log is more and more truncated :

Here is some snips of the log file :

<beginning of the file>
=====================================
040429  1:24:17 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 16 seconds
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 5805, signal count 5068
Mutex spin waits 47160, rounds 199865, OS waits 3026
RW-shared spins 3892, OS waits 1876; RW-excl spins 688, OS waits 526
------------
TRANSACTIONS
------------
Trx id counter 0 578243671
Purge done for trx's n:o < 0 578243587 undo n:o < 0 0
Total number of lock structs in row lock hash table 0
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0 578243670, not started, process no 31431, OS thread id 36901
MySQL thread id 336051, query id 12118543 localhost mysql ACTION 0 578241032, not started, process no 31466, OS thread id 72776
MySQL thread id 334765, query id 12110969 localhost mysql
--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
I/O thread 3 state: waiting for i/o request (write thread)
Pending normal aio reads: 0, aio writes: 0,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
8704 OS file reads, 116195 OS file writes, 28758 OS fsyncs
0.06 reads/s, 16384 avg bytes/read, 2.12 writes/s, 1.06 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf for space 0: size 1, free list len 0, seg size 2, is empty
Ibuf for space 0: size 1, free list len 0, seg size 2,
105 inserts, 105 merged recs, 105 merges
Hash table size 1106407, used cells 238474, node heap has 265 buffer(s)
151.12 hash searches/s, 10.94 non-hash searches/s
---
LOG
---
Log sequence number 71 311847382
Log flushed up to   71 311847382
Last checkpoint at  71 311811106
0 pending log writes, 0 pending chkp writes
21582 log i/o's done, 0.81 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 302986592; in additional pool allocated 992640
Buffer pool size   16384
Free buffers       6514
Database pages     9605
Modified db pages  19
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 9562, created 43, written 97144
0.06 reads/s, 0.00 creates/s, 1.25 writes/s
Buffer pool hit rate 1000 / 1000
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
Main thread process no. 31402, id 7176, state: sleeping
Number of rows inserted 320, updated 63223, deleted 6, read 22847430
0.00 inserts/s, 1.25 updates/s, 0.00 deletes/s, 158.93 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
writes
21569 log i/o's done, 1.06 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 302986592; in additional pool allocated 994176
Buffer pool size   16384
Free buffers       6515
Database pages     9604
Modified db pages  0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 9561, created 43, written 97124
0.06 reads/s, 0.00 creates/s, 4.00 writes/s
Buffer pool hit rate 1000 / 1000
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
Main thread process no. 31402, id 7176, state: sleeping
Number of rows inserted 320, updated 63203, deleted 6, read 22844887
0.06 inserts/s, 1.69 updates/s, 0.00 deletes/s, 188.74 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
hit rate 1000 / 1000
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
Main thread process no. 31402, id 7176, state: sleeping
Number of rows inserted 318, updated 63155, deleted 6, read 22838534
0.00 inserts/s, 1.37 updates/s, 0.00 deletes/s, 136.80 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
-----
Total memory allocated 302986592; in additional pool allocated 997248
Buffer pool size   16384
Free buffers       6526
Database pages     9593
Modified db pages  10
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 9550, created 43, written 96690
0.00 reads/s, 0.00 creates/s, 2.62 writes/s
Buffer pool hit rate 1000 / 1000
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
Main thread process no. 31402, id 7176, state: sleeping
Number of rows inserted 317, updated 62993, deleted 6, read 22815382
0.00 inserts/s, 1.25 updates/s, 0.00 deletes/s, 125.62 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
-hash searches/s
---
LOG
---
Log sequence number 71 307459125
Log flushed up to   71 307450235
Last checkpoint at  71 307441693
0 pending log writes, 0 pending chkp writes
20696 log i/o's done, 1.06 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 302986592; in additional pool allocated 1021824
Buffer pool size   16384
Free buffers       6602
Database pages     9517
Modified db pages  10
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 9474, created 43, written 94728
0.00 reads/s, 0.00 creates/s, 3.50 writes/s
Buffer pool hit rate 1000 / 1000
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
Main thread process no. 31402, id 7176, state: sleeping
Number of rows inserted 312, updated 62001, deleted 6, read 22269366
0.00 inserts/s, 1.62 updates/s, 0.00 deletes/s, 7175.74 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
te thread)
Pending normal aio reads: 0, aio writes: 0,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
3618 OS file reads, 5532 OS file writes, 1391 OS fsyncs
0.94 reads/s, 16384 avg bytes/read, 4.31 writes/s, 1.44 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf for space 0: size 1, free list len 0, seg size 2, is empty
Ibuf for space 0: size 1, free list len 0, seg size 2,
10 inserts, 10 merged recs, 10 merges
Hash table size 1106407, used cells 178654, node heap has 192 buffer(s)
266.73 hash searches/s, 41.18 non-hash searches/s
---
LOG
---
Log sequence number 71 92506837
Log flushed up to   71 92502290
Last checkpoint at  71 92383088
0 pending log writes, 0 pending chkp writes
1057 log i/o's done, 1.19 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 302986592; in additional pool allocated 649216
Buffer pool size   16384
Free buffers       12090
Database pages     4102
Modified db pages  49
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 4101, created 1, written 4537
0.94 reads/s, 0.00 creates/s, 3.06 writes/s
Buffer pool hit rate 999 / 1000
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
Main thread process no. 31402, id 7176, state: sleeping
Number of rows inserted 18, updated 3047, deleted 0, read 834837
0.06 inserts/s, 3.31 updates/s, 0.00 deletes/s, 293.54 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
----------------------------
END OF INNODB MONITOR OUTPUT
============================

<crash>

So it seems something is definitively wrong during the write of the innodb.status.<pid> file.

Regards,
  Jocelyn
[3 May 2004 12:56] jocelyn fournier
Hi,

In fact, I notice the file is not properly removed/cleaned up before innodb trys to write on it. (I initialy was thinking that innodb was appending the data to the file each 15 seconds, which not seems to be the case).

Regards,
  Jocelyn
[3 May 2004 15:18] Marko Mäkelä
I should have mentioned that the file will contain only one copy of the status output, and it won't be truncated to its actual length when the write completes.  If there exist many transactions at some point, the status output will become larger.  When the transactions complete, the status output will shrink, and only the initial part of the old status output will be overwritten.  You should ignore anything after the first "END OF INNODB MONITOR OUTPUT".
[5 May 2004 10:58] Marko Mäkelä
ChangeSet@1.1804 in the 4.0 tree will truncate the innodb.status.<pid> file to its actual size. Eventually it will be merged to the 4.1 tree. However, it should not affect the crash you have experienced. I still need a repeatable test case for that.
[5 May 2004 20:07] jocelyn fournier
Hi,

I just want to let you the crash has just happened once again, with a fresh bk tree just built.
Unfortunately I still not have any testcase to give to you since I've no clue on why this occurs, and the server could run without any crash for a long amount of time :(

Regards,
  Jocelyn
[11 May 2004 23:00] jocelyn fournier
Hi,

According to what I can see in the log, MySQL seems to fail to write either thd->proc_info or thd->query, and thus putc('\n', f) crashes mysql since it tries to write at EOF.

Indeed, in several logs, just after the crash, I can see :

---TRANSACTION 0 603701464, not started, process no 19490, OS thread id 41926730
MySQL thread id 1424189, query id 53191340 localhost mysql ---TRANSACTION 0 0, not started, process no 29145, OS thread id 29032651

---TRANSACTION 0 618036723, not started, process no 12202, OS thread id 56836129
MySQL thread id 888011, query id 33244972 localhost mysql , not started, process no 12421, OS thread id 59262126

---TRANSACTION 0 619551739, not started, process no 18957, OS thread id 688171
MySQL thread id 60017, query id 2419874 localhost mysql SACTION 0 619550161, not started, process no 18960, OS thread id 737326

you can see thd->user is written properly, as well as the next putc(' ',f), but the next fputs() seems to not has been written properly and thus, latest putc definitively crashes mysql because of fputs write failure.

Note I'm using delayed thread, I don't know if it could have any impact.
I can also modify ha_innodb.cc to be able to provide you more infos about this crash, just tell what you need.

Thanks and regards,
  Jocelyn
[12 May 2004 15:29] Marko Mäkelä
Thank you for your bug report!

The crashes are probably caused by race conditions:
(1) some other thread may be setting thd->proc_info or thd->query to NULL between the test and call to fputs()
(2) the memory area pointed to by thd->query can be freed between the test and the completion of fputs()

There are three possibilities why this bug hasn't surfaced before.  First, writing to a file is slower than copying memory areas and may introduce context switches.  Thus, the "window of opportunity" where the error can occur is bigger.  Second, the function used to print at most 150 first characters of the query.  Now, everything until the terminating NUL char will be printed, and a memory protection violation is more likely to occur.  Third, the SHOW INNODB STATUS will be invoked every 15 seconds by a monitoring thread inside InnoDB.  Earlier, it used to be invoked by external clients only.

The solution is to assign thd->proc_info or thd->query to a local char* variable and test it against NULL.  We will need to protect thd->query with a mutex, and maybe also introduce an upper limit of number of characters to print.

The bug should manifest itself in 4.0.19 as well.
[13 May 2004 23:19] Min Xu
I am interested in helping reproduce and confirm the race condition. I have
a automatic tool can be used with MySQL to locate race conditions that manifest
during the execution.

Could the bug submitter tell me a little bit about the setup? What queries should
I run in order to triger this crash? What is the configure of the database?

Thanks a lot!
[14 May 2004 8:59] Marko Mäkelä
You should be able to trigger the race condition by running quick queries in one connection and invoking SHOW INNODB STATUS (or SHOW PROCESSLIST) in another connection.
[14 May 2004 13:50] Heikki Tuuri
Hi!

The bug is mostly fixed in the latest 4.0.20 source tree. I am still studying how to prevent less common races.

Regards,

Heikki
[14 May 2004 18:47] Min Xu
Thanks! I will bk and latest tree.
[15 May 2004 12:36] Heikki Tuuri
Hi!

We have now pushed to the 4.0 source tree some additional fixes that should remove also races associated with an obsolete value of thd->query_length. These races could have potentially crashed both SHOW PROCESSLIST and SHOW INNODB STATUS.

It will take some time until these patches are merged to 4.1.

Regards,

Heikki
[6 Nov 2008 11:53] Marko Mäkelä
This has resurfaced as Bug #38883.
[6 Nov 2008 12:12] Kristofer Pettersson
Is it possible to track the changes made to the code in this forum? Can you paste a diff to the bug report maybe?
[6 Nov 2008 13:40] Marko Mäkelä
As far as I remember, this bug was fixed so that InnoDB acquires LOCK_thread_count before printing thd->query of some other thd than current_thd. There is a comment in sql/sql_class.h for thd->query:

    Note that (A) if we set query = NULL, we must at the same time set
    query_length = 0, and protect the whole operation with the
    LOCK_thread_count mutex. And (B) we are ONLY allowed to set query to a
    non-NULL value if its previous value is NULL. We do not need to protect
    operation (B) with any mutex. To avoid crashes in races, if we do not
    know that thd->query cannot change at the moment, one should print
    thd->query like this:
      (1) reserve the LOCK_thread_count mutex;
      (2) check if thd->query is NULL;
      (3) if not NULL, then print at most thd->query_length characters from
      it. We will see the query_length field as either 0, or the right value
      for it.
    Assuming that the write and read of an n-bit memory field in an n-bit
    computer is atomic, we can avoid races in the above way. 
    This printing is needed at least in SHOW PROCESSLIST and SHOW INNODB
    STATUS.

The most likely explanation of Bug #38883 is that some MySQL code is setting thd->query=NULL without holding the LOCK_thread_count mutex.