Bug #35982 Falcon crashes on concurrent load data infile
Submitted: 11 Apr 2008 0:31
Modified: 30 Sep 2008 19:19
Reporter: Vladislav Vaintroub
Status: Closed
Category: MySQL Server: Falcon storage engine
Severity: S1 (Critical)
Version: mysql-6.0-falcon-team
OS: Any
Assigned to: Christopher Powers
Triage: D1 (Critical) / R3 (Medium) / E2 (Low)

[11 Apr 2008 0:31] Vladislav Vaintroub
Description:
mysqld crashed while I was running a concurrent "load data infile" test
(3 mysql clients, each loading a 380MB file).

I use the newest mysql-6.0-falcon-team.

The call stack from the error log:
000000014042EC40    mysqld.exe!Table::getSyncPrior()[table.cpp:3754]
0000000140433535    mysqld.exe!Table::garbageCollect()[table.cpp:2052]
0000000140434A58    mysqld.exe!`Table::insert'::`1'::catch$0()[table.cpp:3019]
00000001405C50D0    mysqld.exe!_CallSettingFrame()[handlers.asm:36]
00000001404EC2AE    mysqld.exe!__CxxCallCatchBlock()[frame.cpp:1342]
00000000778454A1    ntdll.dll!RtlRestoreContext()
000000014043491E    mysqld.exe!Table::insert()[table.cpp:2985]
00000001404189AB    mysqld.exe!StorageTable::insert()[storagetable.cpp:122]
0000000140417F67    mysqld.exe!StorageInterface::write_row()[ha_falcon.cpp:1059]

The function that crashes is Table::getSyncPrior(), and the line is
-->int lockNumber = record->recordNumber % SYNC_VERSIONS_SIZE;

The crash is caused by dereferencing a null pointer (record).

Some analysis:
Prior to the crash a C++ exception was thrown (possibly due to memory shortage)
in Table::insert(), from either getFormat() or allocRecordVersion(), and execution continued in the catch block at Table.cpp, line 3009.

At that point the variable record is still NULL, but it is passed to garbageCollect()
[Table.cpp, line 3017] and subsequently to getSyncPrior() [Table.cpp, line 2052].
The latter function does not anticipate NULL as a parameter and crashes.

How to repeat:
Run 3 concurrent "load data infile" statements.

I use 3 different tables with the same structure:
create table t1 (guid char (38)) engine=falcon;
create table t2 (guid char (38)) engine=falcon;
create table t3 (guid char (38)) engine=falcon;

and load data from 3 different clients in parallel with 

client 1:
load data local infile 'C:/tmp/uuids.txt' into table t1;
client 2:
load data local infile 'C:/tmp/uuids.txt' into table t2;
client 3:
load data local infile 'C:/tmp/uuids.txt' into table t3;

C:/tmp/uuids.txt is created with 
uuidgen -n10000000 -oC:\tmp\uuids.txt

I set falcon_pagecache_size to 2GB (possible on 64 bit only)

Suggested fix:
A guess: in the catch block starting at Table.cpp, line 3009, do not call garbageCollect() if record is NULL.
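
A minimal sketch of that guard, for illustration only. This is not the Falcon source: TableSketch, storeRecord(), the failAlloc/failStore flags, the collected counter and the value of SYNC_VERSIONS_SIZE are all hypothetical stand-ins. Only the names Table::insert(), allocRecordVersion(), garbageCollect() and the crashing expression come from the report.

```cpp
#include <new>
#include <stdexcept>

// Hypothetical stand-in for the Falcon record type.
struct Record { int recordNumber; };

const int SYNC_VERSIONS_SIZE = 16;  // assumed value, for illustration only

struct TableSketch {
    int  collected = 0;      // number of garbageCollect() calls (test aid)
    bool failAlloc = false;  // simulate a failure before record is assigned
    bool failStore = false;  // simulate a failure after record is assigned

    Record* allocRecordVersion(int recordNumber) {
        if (failAlloc)
            throw std::bad_alloc();
        return new Record{recordNumber};
    }

    // Hypothetical later step of the insert that may also throw.
    void storeRecord(Record*) {
        if (failStore)
            throw std::runtime_error("store failed");
    }

    // Like the crashing code path, this dereferences its argument
    // unconditionally, so the caller must not pass a null pointer.
    void garbageCollect(Record* leaving) {
        int lockNumber = leaving->recordNumber % SYNC_VERSIONS_SIZE;
        (void) lockNumber;
        ++collected;
    }

    bool insert(int recordNumber) {
        Record* record = nullptr;
        try {
            record = allocRecordVersion(recordNumber);
            storeRecord(record);
            delete record;  // the sketch does not keep the record
            return true;
        } catch (const std::exception&) {
            // The suggested guard: record is still null if the exception
            // was thrown before the assignment above completed.
            if (record != nullptr) {
                garbageCollect(record);
                delete record;
            }
            return false;
        }
    }
};
```

With failAlloc set, insert() returns false without touching garbageCollect(). With failStore set, the record exists and is collected as before.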
[11 Apr 2008 14:13] Kevin Lewis
Chris, this seems like it could be a widespread bug. There is an unwritten assumption that the 'leaving' record could be NULL in garbageCollect. So we should check that before locking syncPrior.
[11 Apr 2008 17:41] Christopher Powers
Table::garbageCollect() now checks for null record pointers before initializing the syncPrior object.

  http://lists.mysql.com/commits/45293

ChangeSet@1.2638, 2008-04-11 12:34:18-05:00, cpowers@xeno.mysql.com +1 -0
  Bug#35322, "Falcon duplicate primary keys on updateable views"
 
  Table::checkUniqueRecordVersion() locks record version chain before scanning.

  
  Bug#35982, "Falcon crashes on concurrent load data infile"
  
  Table::garbageCollect() checks for null record pointers before passing to Sync constructor.
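
For readers without access to the commit, the shape of such a check can be sketched as follows. This is an assumption-laden illustration, not the committed change: TableSketch, the std::mutex array standing in for Falcon's sync objects, the bool return value and the value of SYNC_VERSIONS_SIZE are all hypothetical. It also assumes record numbers are non-negative.

```cpp
#include <array>
#include <mutex>

// Hypothetical stand-in for the Falcon record type.
struct Record { int recordNumber; };

constexpr int SYNC_VERSIONS_SIZE = 16;  // assumed value, for illustration only

class TableSketch {
public:
    int collected = 0;  // number of records actually collected (test aid)

    // Selects the lock guarding a record's version chain. Taking a
    // reference makes the non-null requirement explicit.
    // Assumes recordNumber >= 0.
    std::mutex& getSyncPrior(const Record& record) {
        return syncPriorVersions[record.recordNumber % SYNC_VERSIONS_SIZE];
    }

    // Returns true if a record was collected, false if there was none.
    bool garbageCollect(Record* leaving) {
        // Check for a null record before the lock object is constructed,
        // because choosing the lock already dereferences the record.
        if (leaving == nullptr)
            return false;

        std::lock_guard<std::mutex> lock(getSyncPrior(*leaving));
        ++collected;
        return true;
    }

private:
    std::array<std::mutex, SYNC_VERSIONS_SIZE> syncPriorVersions;
};
```

Placing the check in the callee protects every caller of garbageCollect(), which matters if other call sites can also pass a null 'leaving' record.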
[14 Apr 2008 18:42] Bugs System
Pushed into 6.0.5-alpha
[30 Sep 2008 19:19] Jon Stephens
Documented in the 6.0.5 changelog as follows:

        Concurrent LOAD DATA INFILE statements inserting data
        into Falcon tables could crash the server.