Bug #11151 LOAD DATA INFILE commits transaction in 5.0
Submitted: 8 Jun 2005 2:08 Modified: 7 Oct 2006 8:24
Reporter: Kolbe Kegel
Status: Closed
Category:Server Severity:S2 (Serious)
Version:5.0.25-bk/5.1-bk OS:Any (Any)
Assigned to: Guilhem Bichot Target Version:

[8 Jun 2005 2:08] Kolbe Kegel
Description:
In 5.0.6 a LOAD DATA INFILE operation commits the current transaction. No error is given
to indicate that a non-transaction-safe operation has been executed.

This behavior did not exist in 4.1.12. In 4.1.12 a LOAD DATA INFILE operation could be
rolled back.

How to repeat:
cat > b.in
1
2
3
4
5
^D

CREATE TABLE b (i int) ENGINE=InnoDB;
LOAD DATA LOCAL INFILE 'b.in' INTO TABLE b;
ROLLBACK;
SELECT i FROM b;

All values remain in table.
[10 Nov 2005 8:42] Kolbe Kegel
Better test case...

cat > /tmp/t11151.in
1
2
3
4
5
^D

DROP TABLE IF EXISTS t11151;
CREATE TABLE t11151 (i int) ENGINE=InnoDB;

START TRANSACTION;
INSERT INTO 11151 VALUES (0);
ROLLBACK;
SELECT i FROM t11151;

-- b is empty, of course

START TRANSACTION;
INSERT INTO t11151 VALUES (0);
LOAD DATA LOCAL INFILE '/tmp/t11151.in' INTO TABLE t11151;
INSERT INTO t11151 VALUES (6);
ROLLBACK;
SELECT i FROM t11151;

-- not only is b not empty, but it contains the row that was inserted before the LOAD DATA
INFILE, *and* the row that was inserted afterwards

-- REMINDER... this is not broken in 4.1 or 4.0 (i.e. ROLLBACK works as expected)
-- at the very least, the documentation needs to be updated
[15 Nov 2005 13:39] Sergei Golubchik
It's an NDB "optimization". As NDB doesn't handle large inserts all too well, MySQL now
commits during load data every N'th insert. (N=10000 ? don't remember)

I personally think it's a bug. This optimization is ok to do for ALTER TABLE, when nobody
cares about what's inserted into temporary table, in the case of error it's just dropped.

But LOAD DATA is justa normal DML, like INSERT (or INSERT ... SELECT), and it should be
possible to rollback it.
[28 Mar 2006 20:43] Chad MILLER
I'm looking at this so we can get the release out soon.
[29 Mar 2006 0:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/4251
[29 Mar 2006 0:43] Chad MILLER
Applied to 5.0.20.
[29 Mar 2006 17:36] Chad MILLER
Reverting this patch.  It breaks replication, which relies on commit points to sync data
to slaves.

We'll continue to work on it.
[29 Mar 2006 18:11] Chad MILLER
See also bug#11401 and the "rpl_innodb" test.
[6 Apr 2006 20:50] Chad MILLER
This is behavior that we can not fix.  Extremely large transactions cause several storage
engines to behave very poorly, and we consider that more important than making the rare
"load data infile" transaction-safe.

This must be documented, since it is surprising behavior.
[6 Apr 2006 21:37] Paul DuBois
I have added LOAD DATA INFILE to the list of statements
in the manual that cause implicit commit.
[7 Apr 2006 17:43] Jon Stephens
Reassigning to myself, changing category From Server to Server:Docs and status from
WontFix to Open so I'll remember to double-check Cluster Limitations to be sure the items
Tomas mentioned are in there...
[8 Apr 2006 7:00] Jon Stephens
Thank you for your bug report. This issue has been addressed in the
documentation. The updated documentation will appear on our website
shortly, and will be included in the next release of the relevant
product(s).

Additional info:

Added info regarding handling of transactional statements from Tomas to Cluster
Limitations section of Manual. Closed.
[19 Apr 2006 15:31] Harrison Fisk
This is yet another one of those changes that makes no sense to me (such as SHOW STATUS
defaulting to LOCAL).  It will break people's applications when upgrading for no real
apparent reason.  I know of many applications that do LOAD DATA, check the warnings, and
then rollback if neccesary, and this breaks them silently.

The reasoning behind it is pretty silly too:

"Extremely large transactions cause
several storage engines to behave very poorly, and we consider that more
important than making the rare "load data infile" transaction-safe."  

Okay, so document that, rather than forcing all of the storage engines to do this.  For
example, InnoDB, the most common transactional engine, has no issue with big transactions
generally.  Even with the ones that can't handle it, it is better to fail gracefully and
rollback the entire thing, rather than ending up with some unknown amount of rows
committed.  LOAD DATA breaks the C promise of ACID  now without this fix.

Perhaps it would be better, in a future version, to add an option to LOAD DATA to allow to
to commit every X number of rows rather than making that the default and breaking many
applications?
[8 May 2006 16:55] Douglas Campbell
I don't understand the reasoning here. If I can wrap a transaction around a series of SQL
Inserts , why can't I do it around a series of 'load data infiles' that has exactly the
same effect? This should be fixed, not documented, because it is a defect, not a feature.
[23 May 2006 16:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/6775
[11 Jul 2006 22:01] Wei Li
Is this bug really fixed?  I am still seeing the same behavior in 5.1.
[12 Jul 2006 20:15] James Day
Wei Li,

No, the behavior hasn't been changed. All that the patch did was change the code so that
LOAD DATA INFILE was marked as unsafe for transactions in stored procedures.

If the commit only happened after the threshold was passed, would that meet your needs?
Would it if a user-adjustable threshold was used?
[12 Jul 2006 20:52] Wei Li
Hi James:

Load data infile should be protected by transactions can be rollbacked.  I am looking at
the replication right now in 5.1.  Almost all DML failure situations has been solved by
two phase commit.  The problem for LOAD is that it is not protected by two phase commit. 
So, if a LOAD runs for a long time and crashes, which is likely because it is a DML, then
the slave side can lose all changes involved.

Thanks,
Wei
[16 Aug 2006 1:37] Lachlan Mulcahy
This bug is being reopened to revisit the decision to lose ACID compliance in LOAD DATA
INFILE by allowing it to perform it's own implicit commits.

This also breaks old applications moving forward to 5.0.
[29 Aug 2006 11:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/10988

ChangeSet@1.2286, 2006-08-29 11:20:26+02:00, guilhem@gbichot3.local +8 -0
  Fix for BUG#11151 "LOAD DATA INFILE commits transaction in 5.0".
  In 5.0 we made LOAD DATA INFILE autocommit in all engines, while
  only NDB wanted that. Users and trainers complained that it affected
  InnoDB and was a change compared to 4.1 where only NDB autocommitted.
  To revert to the behaviour of 4.1, we move the autocommit logic out of mysql_load() into
  ha_ndbcluster::external_lock().
  As there is no "commit the full transaction at end" anymore, LOAD
  DATA INFILE can be allowed in stored functions.
  This patch is for discussion; I don't even know if it should go into
  5.0 (as it is a behaviour change in a GA) or in 5.1.
  Note: ha_ndbcluster::has_transactions() does not give reliable results
  because it says "yes" even if transactions are disabled in this engine...
[30 Aug 2006 2:56] Arjen Lentz
Regarding performance problems with big transactions in some storage engines using this
command, a suggestion could be added to the manual to import into a temporary MyISAM
table, and do an INSERT-SELECT when succesful. This also evades the problem that
originally caused this bug report.
[5 Sep 2006 15:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/11399

ChangeSet@1.2286, 2006-09-05 15:11:36+02:00, guilhem@gbichot3.local +10 -0
    Fix for BUG#11151 "LOAD DATA INFILE commits transaction in 5.0".
    In 5.0 we made LOAD DATA INFILE autocommit in all engines, while
    only NDB wanted that. Users and trainers complained that it affected
    InnoDB and was a change compared to 4.1 where only NDB autocommitted.
    To revert to the behaviour of 4.1, we move the autocommit logic out of mysql_load()
into
    ha_ndbcluster::external_lock().
    Note: even though there is no "commit the full transaction at end"
    anymore, LOAD DATA INFILE stays disabled in routines (re-entrency
    problems per a comment of Pem).
    Note: ha_ndbcluster::has_transactions() does not give reliable results
    because it says "yes" even if transactions are disabled in this engine...
  This is a cset against 5.1 which I'll now instead commit in 5.0
  with the needed approvals.
[5 Sep 2006 16:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/11407

ChangeSet@1.2259, 2006-09-05 16:52:05+02:00, guilhem@gbichot3.local +10 -0
  Fix for BUG#11151 "LOAD DATA INFILE commits transaction in 5.0".
  In 5.0 we made LOAD DATA INFILE autocommit in all engines, while
  only NDB wanted that. Users and trainers complained that it affected
  InnoDB and was a change compared to 4.1 where only NDB autocommitted.
  To revert to the behaviour of 4.1, we move the autocommit logic out of mysql_load() into
  ha_ndbcluster::external_lock().
  The result is that LOAD DATA INFILE commits all uncommitted changes
  of NDB if this is an NDB table, its own changes if this is an NDB
  table, but does not affect other engines.
  Note: even though there is no "commit the full transaction at end"
  anymore, LOAD DATA INFILE stays disabled in routines (re-entrency
  problems per a comment of Pem).
  Note: ha_ndbcluster::has_transactions() does not give reliable results
  because it says "yes" even if transactions are disabled in this engine...
[18 Sep 2006 23:35] Timothy Smith
Pushed to 5.0.26 and 5.1.12
[7 Oct 2006 8:24] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of
that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version,
including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

I have updated the following sections of the Manual:

5.0/5.1 Cluster Limitations
5.0.x/5.0.26 & 5.1.x/5.1.12 ChangeLogs
5.0/5.1 InnoDB Statements Causing Implicit Commits   
5.0/5.1 [General] Statements Causing Implicit Commits

where I have stated the following (with modifications as appropriate to each section):

'In 5.0 prior to 5.0.26, and in 5.1 prior to 5.1.12, LOAD DATA INFILE caused an implicit
commit regardless of the storage engine used (including InnoDB). In 5.0 beginning with
5.0.26, and in 5.1 beginning with 5.1.12, this statement now causes an implicit commit
only for tables using the NDB storage engine.'

As this didn't affect 4.1 or earlier versions, no changes have been made to the 4.1
Manual.

The Docs changeset is available here: http://lists.mysql.com/commits/13289

Please advise if any further changes relating to this issue are needed in the docs.

Thanks!
[7 Oct 2006 10:56] Guilhem Bichot
Jon,
what you write looks ok to me.