Bug #32730 Concurrent TRUNCATEs crash Falcon engine
Submitted: 26 Nov 2007 18:34
Modified: 5 May 2008 16:41
Reporter: Christopher Powers
Status: Closed
Category: MySQL Server: Falcon storage engine
Severity: S2 (Serious)
Version:
OS: Any
Assigned to: Christopher Powers
CPU Architecture: Any

[26 Nov 2007 18:34] Christopher Powers
Description:
Running concurrent TRUNCATE operations crashes Falcon. 

How to repeat:
The quickest way to reproduce the failure is with falcon_bug_22173a.test.

This testcase establishes two connections, then runs a stored procedure from each client that repeatedly inserts 50 rows into a table, then truncates the same table.

One of several critical failures will occur within seconds:
- Record::~Record()   - ASSERT(!active)
- RecordLeaf::store() - COMPARE_EXCHANGE_POINTER() fails
- Cache::fetchPage()  - ASSERT (pageNumber >= 0)
[28 Nov 2007 19:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/38735

ChangeSet@1.2706, 2007-11-28 13:08:21-06:00, chris@xeno.mysql.com +6 -0
  Bug#32730 - Concurrent TRUNCATEs crash Falcon engine
  - Added TRUNCATE sync object to StorageTable
  - Fixed root page bug in Index::rebuildIndexes
[28 Nov 2007 19:46] Christopher Powers
I tightened the synchronization and fixed a bug in Index::rebuildIndex(), but the issue still remains. More to follow...
[29 Nov 2007 5:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/38782

ChangeSet@1.2714, 2007-11-28 23:00:53-06:00, chris@xeno.mysql.com +1 -0
  Bug#32730, Concurrent TRUNCATEs crash Falcon engine
  - Explicitly unlock the StorageTable truncate sync object in the destructor, and when deleting a table.
[29 Nov 2007 5:14] Christopher Powers
Clear truncate lock in StorageTable destructor and when deleting a table.
[29 Nov 2007 19:25] Kevin Lewis
The previous change-sets for this bug helped this concurrent truncate test somewhat, but a small window still remains in which problems occur.  

However, those change-sets also introduced many regressions on pushbuild running debug engines.  Here is the explanation of this problem:

Jim Starkey wrote:

Great care must be taken when including Falcon headers directly or 
indirectly in ha_falcon.cpp.  The fundamental problem is a massive symbol 
clash between Falcon headers and MySQL headers, which isn't surprising 
because each is a SQL database engine. 

To get around the problem, there are three types of Falcon modules:

   1. Those that include mysql_priv.h (ha_falcon.cpp, InfoTable.cpp, and
      ScaledBinary.cpp)
   2. Those whose header files are included in modules in #1
      (StorageXXX.cpp)
   3. The rest of Falcon (all include Engine.h)

The danger is when one of the DMZ (#2 above) header files includes a 
Falcon header file.  The Falcon header file may have conditionals and 
types that compile to different lengths when compiled in different 
environments, resulting in very obscure failure modes.  For example, a 
couple of days ago, Chris Powers innocently put a SyncObject into 
StorageTable (at my direction, I might add) to manage concurrency issues 
between ordinary table access and "truncate table".  This required 
putting an include of SyncObject.h into StorageTable.h, which was, in 
turn, included by ha_falcon.cpp.  It compiled OK, but a SyncObject in 
one context turned out to be a different length than the other.  So the 
code worked fine in Visual Studio and gcc compiled for production, but 
died a horrible death compiled with gcc for debugging.
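
The failure mode described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that one build sees two extra members in the sync object; the struct and member names below are hypothetical and are not Falcon's real SyncObject layout. Two translation units that disagree on the size of an embedded member also disagree on where every later member lives.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layouts: what a single class definition might expand to when
// its header is compiled under two different sets of preprocessor settings.
// The member names are illustrative, not Falcon's real SyncObject fields.
struct SyncObjectContextA {
    long  lockState;
    void* waiters;
};

struct SyncObjectContextB {
    long        lockState;
    void*       waiters;
    const char* lastLocker;   // extra bookkeeping present in only one build
    long        traceCount;
};

// The same table class, embedding the sync object by value, as each
// translation unit would lay it out.
struct TableSeenFromA {
    SyncObjectContextA syncTruncate;
    int                tableId;
};

struct TableSeenFromB {
    SyncObjectContextB syncTruncate;
    int                tableId;
};

// Offset at which each side believes tableId lives.
inline std::size_t tableIdOffsetFromA() { return offsetof(TableSeenFromA, tableId); }
inline std::size_t tableIdOffsetFromB() { return offsetof(TableSeenFromB, tableId); }
```

If code compiled with view A reads tableId from an object constructed by code compiled with view B, it reads from the wrong offset, and nothing at compile or link time reports the disagreement.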

The solution in this case was to change the SyncObject required for 
truncate from an embedded object to a pointer to a SyncObject allocated by 
the StorageTable constructor and deleted by the destructor.  This kept the 
potentially ambiguous class SyncObject out of ha_falcon.cpp.
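
A minimal sketch of that pointer-based approach, assuming a simplified StorageTable: the handler-facing declaration only forward-declares SyncObject, so the size of StorageTable no longer depends on how SyncObject is compiled. The members, the accessors, and the HYPOTHETICAL_DEBUG_FLAG macro are invented for illustration and are not taken from the Falcon sources.

```cpp
#include <cassert>

// ---- What the handler-facing header would expose (sketch) ----
class SyncObject;   // forward declaration only; its size is never needed here

class StorageTable {
public:
    StorageTable();
    ~StorageTable();
    StorageTable(const StorageTable&) = delete;
    StorageTable& operator=(const StorageTable&) = delete;

    bool hasTruncateSync() const { return syncTruncate != nullptr; }
    int  id() const { return tableId; }

private:
    SyncObject* syncTruncate;   // a pointer has the same size in every build
    int         tableId;
};

// ---- What only the engine-side source file would see (sketch) ----
class SyncObject {
public:
    long lockState = 0;
#ifdef HYPOTHETICAL_DEBUG_FLAG          // assumed flag, for illustration only
    const char* lastLocker = nullptr;   // may change sizeof(SyncObject)
#endif
};

// Allocation and deletion happen where the full definition is visible.
StorageTable::StorageTable() : syncTruncate(new SyncObject), tableId(0) {}
StorageTable::~StorageTable() { delete syncTruncate; }
```

Because only the engine-side code ever needs the full definition of SyncObject, a layout difference between builds can no longer shift the members of StorageTable.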
[30 Nov 2007 20:43] Bugs System
Pushed into 6.0.4-alpha
[1 Dec 2007 21:11] Jim Starkey
Maintain truncate lock in table share to support "truncate table"
coexistence with transactions.
[2 Dec 2007 18:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/39073

ChangeSet@1.2719, 2007-12-02 12:40:58-06:00, chris@xeno.mysql.com +1 -0
  Bug#32730, Concurrent TRUNCATEs crash Falcon engine
  - Testcase 22173a triggered assert in SyncObject::unlock() because the lockstate was 0. Truncate
  locks are always Shared, so StorageTableShare::clearTruncateLock() now explicitly indicates the lock
  type to avoid the assert.
[2 Dec 2007 18:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/39074

ChangeSet@1.2720, 2007-12-02 12:47:46-06:00, chris@xeno.mysql.com +1 -0
  Bug#32730, Concurrent TRUNCATEs crash Falcon engine
  - Enabled falcon_bug_22173a
[2 Dec 2007 21:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/39078

ChangeSet@1.2720, 2007-12-02 15:11:59-06:00, chris@xeno.mysql.com +3 -0
  Bug#32730, Concurrent TRUNCATEs crash Falcon engine
  - Enabled falcon_bug_22173a
  ---
  Bug#32730, Concurrent TRUNCATEs crash Falcon engine
  - Added interlocked increment/decrement to keep StorageTableShare::truncateLockCount in sync
  with the state of syncTruncateLock. The lock count became out of sync during testcase falcon_bug_22173a
  run on a compile-amd64-max build.
  - Enabled testcase falcon_bug_22173a
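
As a rough sketch of the counting idea in the changeset above, the class below keeps a holder count with atomic operations. It uses std::atomic as a portable stand-in for platform interlocked primitives, and the guard that refuses to release when the count is already zero is an assumption made for illustration; neither the class name nor its methods' behavior are taken from the actual StorageTableShare code.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for share-level truncate lock bookkeeping.
class TruncateLockShare {
public:
    // Record one more holder of the shared truncate lock.
    void setTruncateLock() {
        truncateLockCount.fetch_add(1);
    }

    // Release one holder if there is one. Returns false when the count is
    // already zero, so the caller never releases a lock that is not held.
    bool clearTruncateLock() {
        int current = truncateLockCount.load();
        while (current > 0) {
            // On failure, compare_exchange_weak reloads 'current', and the
            // loop condition re-checks that a holder still exists.
            if (truncateLockCount.compare_exchange_weak(current, current - 1))
                return true;
        }
        return false;
    }

    int holders() const { return truncateLockCount.load(); }

private:
    std::atomic<int> truncateLockCount{0};
};
```

The compare-and-swap loop makes the check and the decrement a single atomic step, so two threads clearing at the same time cannot drive the count below zero.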
[3 Dec 2007 10:02] Bugs System
Pushed into 6.0.4-alpha
[21 Feb 2008 16:12] Kevin Lewis
Patch is in mysql-6.0-release version 6.0.4
[5 May 2008 16:41] Paul DuBois
Noted in 6.0.4 changelog.

Concurrent TRUNCATE TABLE operations for Falcon tables caused Falcon
to crash.