Bug #7718 NDB Fails On Loading Complex Data Warehouse
Submitted: 6 Jan 2005 21:59 Modified: 10 Jan 2005 17:45
Reporter: Dan Gruhn Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:4.1.9 OS:Linux (Fedora Core 1)
Assigned to: Assigned Account CPU Architecture:Any

[6 Jan 2005 21:59] Dan Gruhn
Description:
Mikael/Dean, after your help on the previous problem we had with getting a data warehouse schema loaded, I hope you can help us with this one. Same data warehouse but it is now failing on loading.

We're getting the following error message:

ERROR 1105 at line 1466: Unknown error

when trying to load a complex data warehouse. Below is the cluster config file.

It doesn't seem like we are running into any parameter limits, and we've turned on logging and are not running out of memory, at least any that the logging shows.

# file "config.ini" - showing minimal setup consisting of 1 DB node,
# 1 management server, and 4 MySQL servers.
#
[NDBD DEFAULT]
NoOfReplicas = 1

[MYSQLD DEFAULT]
[NDB_MGMD DEFAULT]
[TCP DEFAULT]

[NDB_MGMD]
HostName = analyst04-lx.group-w-inc.com

[NDBD]
HostName = analyst04-lx.group-w-inc.com
DataDir = /var/lib/mysql-cluster
MaxNoOfTables = 1600
DataMemory = 1280MB
IndexMemory = 1024MB
MaxNoOfAttributes = 8192
MaxNoOfOrderedIndexes = 8192
MaxNoOfUniqueHashIndexes = 8192
MaxNoOfTriggers = 8192
Diskless = Y
StopOnError = N

[MYSQLD]
HostName = class01-lx.group-w-inc.com

[MYSQLD]
HostName = class02-lx.group-w-inc.com

[MYSQLD]
HostName = class03-lx.group-w-inc.com

[MYSQLD]
HostName = dgruhn-lx.group-w-inc.com

[MYSQLD]
HostName = analyst04-lx.group-w-inc.com

How to repeat:
I'm also going to attach a tarball with a directory that contains the necessary files plus a bash script called "testload" which should recreate the error.

In the set of files, the file "dw_data" is the mysql command input which, after everything is set up, tries to load in the data. There is also a "dw_data.short" file which will take you just up to the failure point.

Suggested fix:
If there is a parameter to change, please let us know.  I hope this be fixed (if it is a bug) in the 4.1.9 release.
[6 Jan 2005 22:04] Dan Gruhn
My test data file is about 15M gzipped.  Is there any other way I get submit it?
[7 Jan 2005 13:02] Jonas Oreland
Hi,

Could you make the testdata available to ftp?

/Jonas
[7 Jan 2005 13:12] Dan Gruhn
Jonas,

I have uploaded this file to ftp://ftp.mysql.com/pub/mysql/upload as dgruhn.testdata.tar.gz.  I realized that it  is also part of the ongoing Issue 4304 that Dean Ellis and Mikael Ronström had been helping me with and added this information there.  I hope I'm not getting a lot of people running around duplicating efforts.

Dan
[9 Jan 2005 17:25] Jonas Oreland
I narrowed down the bug to the following:

create table t1(a int primary key, b int not null, unique(b)) type=ndb;
create table t2(a int primary key, b int not null, unique(b)) type=ndb;
insert into t2 values (1,1), (2,2), (3,1), (4,4), (5,1);
insert ignore into t1 select * from t2; // Boom crash internal error
[9 Jan 2005 19:23] Jonas Oreland
That should be:

create table t1(a int primary key, b int not null, unique(b)) type=ndb;
create table t2(a int primary key, b int not null) type=ndb;
insert into t2 values (1,1), (2,2), (3,1);
insert ignore into t1 select * from t2; // Boom crash internal error
[10 Jan 2005 11:09] Jonas Oreland
The bug is unhandled duplicate key in unique constraint when using insert ignore.
Workaround: load into myisam and alter into ndbcluster.

--- However, this revealed a newly introduced bug in alter table.
Which is solved bug#7761.
This also has a workaround for 4.1.8.

--- Complete workaround
echo "set ndb_use_transaction=off;" > alter.sql
grep NDBCLUSTER dw_schema | awk '{ print "alter table " $3 " engine=ndbcluster;" ; }' >> alter.sql
echo "set ndb_use_transaction=on;" >> alter.sql

sed 's/NDBCLUSTER/MYISAM/g' dw_schema > dw_schema.$$
mv dw_schema.$$ dw_schema

echo "mysql dgruhn_dw < alter.sql" >> testload
[10 Jan 2005 14:07] Dan Gruhn
Jonas,

I am trying your work around.  I get the following error message:

Unknown system variable 'ndb_use_transaction'

I tried removing that line, but then I get the following:

ERROR 1297 at line 97: Got temporary error 233 'Out of operation records in transaction coordinator (increase MaxNoOfConcurrentOperations)' from ndbcluster

Perhaps you can help me understand what is going on.

Also, will here be a real fix for this in the 4.1.9 baseline?  If so, when and where will it be available for download?

Dan
[10 Jan 2005 14:23] Jonas Oreland
Sorry, that should be ndb_use_transactions=on/off
Note the 's' in the end.

---
Yes in 4.1.9 there will be a fix so that you don't need the set command before alter.
But I don't know when that will be released
[10 Jan 2005 14:43] Dan Gruhn
Jonas,

That change does allow the data to load, thank you very much.  You should understand that since this is a data warehouse, loading one set of tables does fix things for us because we want to have many sets of data and hope to use multiple servers loading simultaneously.

Will there be a complete fix for this problem and when and where should I be able to download it?

Thanks again,

Dan
[10 Jan 2005 14:44] Dan Gruhn
Jonas,

Sorry, I didn't read your full reply before I sent my response.  My question is, will we be able to do this without using this alter work around?

Dan
[10 Jan 2005 16:31] Dan Gruhn
When will the alter table fix (bug 7761) be available in the product snapshots?
[10 Jan 2005 17:47] Jonas Oreland
bug#7761 is pushed in 4.1.9, that usually means that's it's available for download
  in nightly snapshots max 24 hours later.

I don't know when a 4.1.9 binary release will be available.
[8 Aug 2007 10:24] Hartmut Holzgraefe
retested with 5.0.44, now the mysql client hangs after
  
  mysql> insert ignore into t1 select * from t2; 

a SHOW PROCESSLIST in another client shows the connection
as in SLEEP state though ...
[13 Mar 2014 13:33] Omer Barnir
This bug is not scheduled to be fixed at this time.