Bug #4479 all nodes in cluster crash from 'load data infile' with large insert statement
Submitted: 9 Jul 2004 0:38 Modified: 30 Jul 2004 16:14
Reporter: Devananda v
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S2 (Serious)
Version: 4.1.3 OS: Linux (Mandrake 9.2)
Assigned to: Magnus Blåudd CPU Architecture: Any

[9 Jul 2004 0:38] Devananda v
Description:
I'm running the cluster on 5 identical computers (single Pentium Xeon 2.8 GHz, 1 GB RAM, 80 GB HD): 1 MGM node, 2 DB nodes, and 2 API nodes. I have a dump file from our live servers containing a table with 180 million rows. I've isolated two insert statements from this file; one is 1001462 bytes and the other 999896 bytes, in plain text. The format of these files is:

  SET AUTOCOMMIT=1;
  INSERT INTO `user_hiscores` VALUES (x,x,x),(x,x,x),.......,(x,x,x);

Each file is just one single INSERT statement on one line, with 47654 value tuples to insert in the first file and 46102 in the second. The table structure is:

  username varchar(20) NOT NULL default '',
  game_id int(5) NOT NULL default '0',
  score int(10) NOT NULL default '0',

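For reference, a complete table definition matching those columns might look roughly like the sketch below; the primary key and engine clause are assumptions, since they are not shown in the report.

  CREATE TABLE user_hiscores (
    username varchar(20) NOT NULL default '',
    game_id int(5) NOT NULL default '0',
    score int(10) NOT NULL default '0',
    PRIMARY KEY (username, game_id)  -- assumed key; not stated in the report
  ) ENGINE=NDBCLUSTER;               -- NDB engine so the table lives in the cluster
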
The error message reported is:

Date/Time: Wednesday 7 July 2004 - 17:31:37
Type of error: error
Message: No more free UNDO log
Fault ID: 2312
Problem data: There are more than 1Mbyte undolog writes outstanding
Object of reference: DBACC (Line: 8600) 0x00000002
ProgramName: NDB Kernel
ProcessID: 2352
TraceFile: NDB_TraceFile_3.trace
***EOM*** 

The trace file is 1.9MB so I am refraining from pasting it here.

How to repeat:
I have tried this many times now, with 3 results.

First step:
Start the cluster, verify that the API nodes are connected, drop and create the table, and run the insert via 
  mysql neopets < a.sql
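
Roughly, as a shell sketch (the database and dump file names are the ones above; the create script name is hypothetical):

  ndb_mgm                                   # then issue SHOW to verify all nodes are connected
  mysql -e "DROP TABLE IF EXISTS user_hiscores;" neopets
  mysql neopets < create_user_hiscores.sql  # hypothetical file containing the CREATE TABLE
  mysql neopets < a.sql                     # the ~1 MB single-statement insert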

The most common result is that both DB nodes crash, displaying
  Error handler shutting down system
  Error handler shutdown completed - exiting
and writing the information I posted above to their error.log files.

On a few occasions, the API node responds with
  ERROR 1297 at line 2: Got temporary error 677 'Index UNDO buffers overloaded' from ndbcluster
When I try to insert again, both DB nodes crash as above.

On rare occasions the insert succeeds the first time. I then verify that all data points were inserted by executing
  select count(username) from user_hiscores;
  select username from user_hiscores limit 5;
on the second API node. However, even in this case, the cluster still crashes when I attempt the second insert. I have had at least one success with each file as the first insert, and in both cases the cluster crashed when I then inserted the other file.

Suggested fix:
On the email list, Mikael Ronström suggested that I increase the size of ZUNDOPAGESIZE at line 197 in Dbacc.hpp. I did this, recompiled, and copied the binaries to all my servers, but did not notice any difference.
[9 Jul 2004 0:43] Devananda v
NDB_Trace

Attachment: NDB_TraceFile_3_cropped.trace (application/octet-stream, text), 142.42 KiB.

[16 Jul 2004 11:25] Magnus Blåudd
Hi,

That's a really large insert. :)

As a workaround I would suggest dividing the INSERT into smaller pieces. The recommended number of VALUES per statement is approximately 1024.

If that does not work for you, could you please upload a test script to our upload area (ftp://ftp.mysql.com/pub/mysql/upload/) so that we can reproduce the problem? Please include the table creation script as well as the data you are trying to insert. There is no need to upload the trace file, since we know from your bug report where the crash occurred.

Thanks!
[16 Jul 2004 17:26] Devananda v
Hi! :)

I have written a little script to extract the values from our dumpfile into a 'one value per line' file, and another script to create arbitrary-length insert statements from that file. So far, this has worked fine with 1,000 values per insert statement. I have also had success splitting the dumpfile into as many pieces as I have API nodes and running the script simultaneously on all servers. The limit at this point is simply how much RAM I have (and the DataMemory and IndexMemory settings)! This is a rather long process to have to go through, and I'd hate to have had to learn all this while needing to restore lost data in a timely way!

I'd be happy to post those scripts if they would be helpful to anyone else.
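
As an illustration only, a minimal sketch of what that rebatching step might look like; the batch size, file names, and the assumed input format of one '(a,b,c)' tuple per line are illustrative assumptions, not the actual scripts:

  # read one value tuple per line from values.txt and emit INSERT statements
  # of at most 1000 tuples each (hypothetical file names and batch size)
  awk -v batch=1000 '
    NR % batch == 1 { if (NR > 1) print ";"; printf "INSERT INTO user_hiscores VALUES %s", $0; next }
                    { printf ",%s", $0 }
    END             { if (NR > 0) print ";" }
  ' values.txt > batched_inserts.sql

At roughly 1,000 tuples per statement, each generated INSERT stays far below the ~1 MB single statements that triggered the crash here.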

If I may make a suggestion, perhaps one solution is to add an option to mysqldump to specify the number of values per insert statement generated ;)

Devananda
[17 Jul 2004 19:50] Mikael Ronström
I have now committed a test case that reproduces this bug (mikael:1.1981)
[30 Jul 2004 16:14] Mikael Ronström
Bug fixed and patch pushed to 4.1 together with a test program. Later an improvement will be made to the fix which decreases the amount of log information.