Bug #40155 | [Falcon] Error: assertion (ret == 0) failed at line 80 in file Mutex.cpp | ||
---|---|---|---|
Submitted: | 19 Oct 2008 19:28 | Modified: | 15 May 2009 17:04 |
Reporter: | Sven Sandberg | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S2 (Serious) |
Version: | 6.0-rpl | OS: | Any |
Assigned to: | Olav Sandstå | CPU Architecture: | Any |
Tags: | assertion, disk full, F_ERROR HANDLING, Memory, pushbuild, sporadic, test failure, widespread |
[19 Oct 2008 19:28]
Sven Sandberg
[19 Oct 2008 19:28]
Sven Sandberg
See also BUG#40133, BUG#40114, BUG#40109, BUG#38970, and BUG#37779.
[19 Oct 2008 19:37]
Sven Sandberg
The crashes happened in only two different pushes. Both those pushes it seems that the host's disk was nearly full. So it's probably related to that.
[20 Oct 2008 15:26]
Kevin Lewis
Olav, Assigning this to you for invetigation since you are currently working on what happens when a disk full error occurs.
[27 Oct 2008 20:51]
Olav Sandstå
It also seems that the server running this test was "out of memory" at the time this failure happened (taken from /var/adm/messages on sol10-amd64-a): Oct 4 02:47:31 sol10-amd64-a tmpfs: [ID 518458 kern.warning] WARNING: /tmp: Fil e system full, swap space limit exceeded Oct 4 02:52:49 sol10-amd64-a last message repeated 5556 times (if I have managed to correlate time stamps correctly). The assert that is hit is in Mutex::lock(): int ret = pthread_mutex_lock (&mutex); ASSERT(ret == 0); Given that we have no core file and no call stack I expect it may be hard to reproduce this exact crash. Possible things that can be done: 1. Instrument the code so that it writes something to stderr when this happens again (but no guarantee that this will ever be written if the system is out of memory) 2. Start testing Falcon on a system where we "simulates" out-of-memory by filling up the swap space - and see what happens.
[12 Feb 2009 12:30]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/66031 3014 Olav Sandstaa 2009-02-12 Bug #40155 [Falcon] Error: assertion (ret == 0) failed at line 80 in file Mutex.cpp Adding code for writing the error status from the call to pthread_mutex_lock to the mysql log file when this error occurs. This does not fix the problem but will help understanding why this assert happens.
[12 Feb 2009 16:51]
Kevin Lewis
We cannot do anything about machines with not enough memory. I think we should fix this bug with this patch.
[16 Feb 2009 9:50]
Olav Sandstå
With the extra information provided by the latest patch the output when this crash occurs is: [Falcon] Error: Mutex::lock: pthread_mutex_lock returned errno 45 [Falcon] Error: assertion (ret == 0) failed at line 92 in file Mutex.cpp On Solaris error number 45 is: #define EDEADLK 45 /* Deadlock condition. */ and the man page for pthread_mutex_lock says: "The pthread_mutex_lock() function may fail if: EDEADLK The current thread already owns the mutex." The question then is: Does Falcon actually try to lock the mutex twice or is this a consequence of an "out-of-error situation"?
[23 Feb 2009 22:43]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/67267 3033 Olav Sandstaa 2009-02-23 Bug #40155 Error: assertion (ret == 0) failed at line 80 in file Mutex.cpp Add a location name to every Mutex object. The purpose is to be able to know which mutex an error occurs on.
[24 Feb 2009 14:04]
Philip Stoev
Here is a backtrace for this assertion obtained with the RQG. #7 0x00000000009774c7 in Error::assertionFailed (text=0xe4ec52 "ret == 0", fileName=0xe4ec48 "Mutex.cpp", line=92) at Error.cpp:78 #8 0x000000000097bb8f in Mutex::lock (this=0x2aaabc454d70) at Mutex.cpp:92 #9 0x000000000099102a in SyncObject::grantLocks (this=0x2aaabc454d60) at SyncObject.cpp:839 #10 0x0000000000991499 in SyncObject::unlock (this=0x2aaabc454d60, sync=0x41a6b2f0, type=Exclusive) at SyncObject.cpp:536 #11 0x00000000009907c5 in ~Sync (this=0x41a6b2f0) at Sync.cpp:51 #12 0x00000000009aaaf2 in Transaction::commitNoUpdates (this=0x2aaabc454bf8) at Transaction.cpp:353 #13 0x00000000009aac5f in Transaction::commit (this=0x2aaabc454bf8) at Transaction.cpp:212 #14 0x00000000009c4446 in Connection::commit (this=0x2aaaaae37378) at Connection.cpp:272 #15 0x000000000097ec85 in StorageConnection::commit (this=0x2aaaaae372e0) at StorageConnection.cpp:150 #16 0x000000000096f191 in StorageInterface::commit (hton=0x15d5c620, thd=0x16aac9b0, all=true) at ha_falcon.cpp:1258 #17 0x000000000081a59b in ha_commit_one_phase (thd=0x16aac9b0, all=true) at handler.cc:1223 #18 0x000000000081aaf2 in ha_commit_trans (thd=0x16aac9b0, all=true) at handler.cc:1191 #19 0x00000000008ebe34 in trans_commit (thd=0x16aac9b0) at transaction.cc:132 #20 0x00000000006d48a8 in mysql_execute_command (thd=0x16aac9b0) at sql_parse.cc:4035 #21 0x00000000006d6af9 in mysql_parse (thd=0x16aac9b0, inBuf=0x16ac7e38 "COMMIT", length=6, found_semicolon=0x41a6cf20) at sql_parse.cc:5752 #22 0x00000000006d7c8b in dispatch_command (command=COM_QUERY, thd=0x16aac9b0, packet=0x16ab7cb1 " COMMIT", packet_length=7) at sql_parse.cc:1009 #23 0x00000000006d911e in do_command (thd=0x16aac9b0) at sql_parse.cc:691 #24 0x00000000006c65da in handle_one_connection (arg=0x16aac9b0)
[24 Feb 2009 14:17]
Philip Stoev
The previous callstack was accompanied by the following message in the error log: [Falcon] Error: Mutex::lock: pthread_mutex_lock returned errno 22
[2 Mar 2009 14:12]
Bugs System
Pushed into 6.0.11-alpha (revid:alik@sun.com-20090302140208-lfdejjbcyezlhhjt) (version source revid:olav@sun.com-20090223224236-skt6frq2ad6ub61g) (merge vers: 6.0.11-alpha) (pib:6)
[2 Mar 2009 14:53]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/68000 3043 Olav Sandstaa 2009-03-02 Fix to Bug #40155 Error: assertion (ret == 0) failed at line 80 in file Mutex.cpp This crash was due to the Memory manager trying to lock the same mutex twice. It happened when the server run out of memory. The memory manager handles this by throwing an exception. Unfortunatly, when creating this exception there there was call both the Log and the JString class that did allocate memory leading to the second call to the memory manager's mutex. This patch fixes this by: -SQLError exception do not longer call the Log if the error code is out-of-memory -SQLError use an new method on JString that does not allocate memory for the text but just stores the pointer to the existing string In addition the memory manager now writes a message directly to stderr when it runs out of memory.
[2 Mar 2009 17:44]
Kevin Lewis
Patch looks OK. But I agree with Vlad that the bool staticString should be isStaticString or isStatic or isReadonly or isConst.
[5 Mar 2009 13:11]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/68376 3051 Olav Sandstaa 2009-03-05 Fix to Bug #40155 Error: assertion (ret == 0) failed at line 80 in file Mutex.cpp This crash was due to the Memory manager trying to lock the same mutex twice. It happened when the server run out of memory. The memory manager handles this by throwing an exception. Unfortunatly, when creating this exception there there was call both the JString and the Log class that did allocate memory leading to the second call to the memory manager's mutex. This patch fixes this by: -SQLError use an new method on JString that does not allocate memory for the text but just stores the pointer to the existing string -Simplified SQLError's use of the Log if the error is out-of-memory @ storage/falcon/JString.cpp Support for using an external string instead of copying the string to an internal buffer. This is done to avoid having to allocate space in the JString object in situations where Falcon have run out of memory. @ storage/falcon/JString.h Add a method for setting the JSTring's string to use an externally allocated string @ storage/falcon/SQLError.cpp Special handle the OUT_OF_MEMORY_ERROR code so that this is handled without allocating memory. @ storage/falcon/SQLError.h Add the sql error code as a paramter to the error method.
[2 Apr 2009 17:38]
Bugs System
Pushed into 6.0.11-alpha (revid:hky@sun.com-20090402144811-yc5kp8g0rjnhz7vy) (version source revid:hky@sun.com-20090305143825-td8ysuv8g78v06qs) (merge vers: 6.0.11-alpha) (pib:6)
[15 May 2009 17:04]
MC Brown
A note has been added to the 6.0.11 changelog: Falcon could cause an assertion when the system has run out of memory and tries to report the memory allocation failure.