Bug #11151 LOAD DATA INFILE commits transaction in 5.0
Submitted: 8 Jun 2005 0:08 Modified: 7 Oct 2006 6:24
Reporter: Kolbe Kegel Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.25-bk/5.1-bk OS:Any (Any)
Assigned to: Guilhem Bichot CPU Architecture:Any

[8 Jun 2005 0:08] Kolbe Kegel
Description:
In 5.0.6 a LOAD DATA INFILE operation commits the current transaction. No error is given to indicate that a non-transaction-safe operation has been executed.

This behavior did not exist in 4.1.12. In 4.1.12 a LOAD DATA INFILE operation could be rolled back.

How to repeat:
cat > b.in
1
2
3
4
5
^D

CREATE TABLE b (i int) ENGINE=InnoDB;
LOAD DATA LOCAL INFILE 'b.in' INTO TABLE b;
ROLLBACK;
SELECT i FROM b;

All values remain in table.
[10 Nov 2005 7:42] Kolbe Kegel
Better test case...

cat > /tmp/t11151.in
1
2
3
4
5
^D

DROP TABLE IF EXISTS t11151;
CREATE TABLE t11151 (i int) ENGINE=InnoDB;

START TRANSACTION;
INSERT INTO 11151 VALUES (0);
ROLLBACK;
SELECT i FROM t11151;

-- b is empty, of course

START TRANSACTION;
INSERT INTO t11151 VALUES (0);
LOAD DATA LOCAL INFILE '/tmp/t11151.in' INTO TABLE t11151;
INSERT INTO t11151 VALUES (6);
ROLLBACK;
SELECT i FROM t11151;

-- not only is b not empty, but it contains the row that was inserted before the LOAD DATA INFILE, *and* the row that was inserted afterwards

-- REMINDER... this is not broken in 4.1 or 4.0 (i.e. ROLLBACK works as expected)
-- at the very least, the documentation needs to be updated
[15 Nov 2005 12:39] Sergei Golubchik
It's an NDB "optimization". As NDB doesn't handle large inserts all too well, MySQL now commits during load data every N'th insert. (N=10000 ? don't remember)

I personally think it's a bug. This optimization is ok to do for ALTER TABLE, when nobody cares about what's inserted into temporary table, in the case of error it's just dropped.

But LOAD DATA is justa normal DML, like INSERT (or INSERT ... SELECT), and it should be possible to rollback it.
[28 Mar 2006 18:43] Chad MILLER
I'm looking at this so we can get the release out soon.
[28 Mar 2006 22:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/4251
[28 Mar 2006 22:43] Chad MILLER
Applied to 5.0.20.
[29 Mar 2006 15:36] Chad MILLER
Reverting this patch.  It breaks replication, which relies on commit points to sync data to slaves.

We'll continue to work on it.
[29 Mar 2006 16:11] Chad MILLER
See also bug#11401 and the "rpl_innodb" test.
[6 Apr 2006 18:50] Chad MILLER
This is behavior that we can not fix.  Extremely large transactions cause several storage engines to behave very poorly, and we consider that more important than making the rare "load data infile" transaction-safe.

This must be documented, since it is surprising behavior.
[6 Apr 2006 19:37] Paul DuBois
I have added LOAD DATA INFILE to the list of statements
in the manual that cause implicit commit.
[7 Apr 2006 15:43] Jon Stephens
Reassigning to myself, changing category From Server to Server:Docs and status from WontFix to Open so I'll remember to double-check Cluster Limitations to be sure the items Tomas mentioned are in there...
[8 Apr 2006 5:00] Jon Stephens
Thank you for your bug report. This issue has been addressed in the
documentation. The updated documentation will appear on our website
shortly, and will be included in the next release of the relevant
product(s).

Additional info:

Added info regarding handling of transactional statements from Tomas to Cluster Limitations section of Manual. Closed.
[19 Apr 2006 13:31] Harrison Fisk
This is yet another one of those changes that makes no sense to me (such as SHOW STATUS defaulting to LOCAL).  It will break people's applications when upgrading for no real apparent reason.  I know of many applications that do LOAD DATA, check the warnings, and then rollback if neccesary, and this breaks them silently.

The reasoning behind it is pretty silly too:

"Extremely large transactions cause
several storage engines to behave very poorly, and we consider that more
important than making the rare "load data infile" transaction-safe."  

Okay, so document that, rather than forcing all of the storage engines to do this.  For example, InnoDB, the most common transactional engine, has no issue with big transactions generally.  Even with the ones that can't handle it, it is better to fail gracefully and rollback the entire thing, rather than ending up with some unknown amount of rows committed.  LOAD DATA breaks the C promise of ACID  now without this fix.

Perhaps it would be better, in a future version, to add an option to LOAD DATA to allow to to commit every X number of rows rather than making that the default and breaking many applications?
[8 May 2006 14:55] Douglas Campbell
I don't understand the reasoning here. If I can wrap a transaction around a series of SQL Inserts , why can't I do it around a series of 'load data infiles' that has exactly the same effect? This should be fixed, not documented, because it is a defect, not a feature.
[23 May 2006 14:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/6775
[11 Jul 2006 20:01] Wei Li
Is this bug really fixed?  I am still seeing the same behavior in 5.1.
[12 Jul 2006 18:15] James Day
Wei Li,

No, the behavior hasn't been changed. All that the patch did was change the code so that LOAD DATA INFILE was marked as unsafe for transactions in stored procedures.

If the commit only happened after the threshold was passed, would that meet your needs? Would it if a user-adjustable threshold was used?
[12 Jul 2006 18:52] Wei Li
Hi James:

Load data infile should be protected by transactions can be rollbacked.  I am looking at the replication right now in 5.1.  Almost all DML failure situations has been solved by two phase commit.  The problem for LOAD is that it is not protected by two phase commit.  So, if a LOAD runs for a long time and crashes, which is likely because it is a DML, then the slave side can lose all changes involved.

Thanks,
Wei
[15 Aug 2006 23:37] Lachlan Mulcahy
This bug is being reopened to revisit the decision to lose ACID compliance in LOAD DATA INFILE by allowing it to perform it's own implicit commits.

This also breaks old applications moving forward to 5.0.
[29 Aug 2006 9:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/10988

ChangeSet@1.2286, 2006-08-29 11:20:26+02:00, guilhem@gbichot3.local +8 -0
  Fix for BUG#11151 "LOAD DATA INFILE commits transaction in 5.0".
  In 5.0 we made LOAD DATA INFILE autocommit in all engines, while
  only NDB wanted that. Users and trainers complained that it affected
  InnoDB and was a change compared to 4.1 where only NDB autocommitted.
  To revert to the behaviour of 4.1, we move the autocommit logic out of mysql_load() into
  ha_ndbcluster::external_lock().
  As there is no "commit the full transaction at end" anymore, LOAD
  DATA INFILE can be allowed in stored functions.
  This patch is for discussion; I don't even know if it should go into
  5.0 (as it is a behaviour change in a GA) or in 5.1.
  Note: ha_ndbcluster::has_transactions() does not give reliable results
  because it says "yes" even if transactions are disabled in this engine...
[30 Aug 2006 0:56] Arjen Lentz
Regarding performance problems with big transactions in some storage engines using this command, a suggestion could be added to the manual to import into a temporary MyISAM table, and do an INSERT-SELECT when succesful. This also evades the problem that originally caused this bug report.
[5 Sep 2006 13:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/11399

ChangeSet@1.2286, 2006-09-05 15:11:36+02:00, guilhem@gbichot3.local +10 -0
    Fix for BUG#11151 "LOAD DATA INFILE commits transaction in 5.0".
    In 5.0 we made LOAD DATA INFILE autocommit in all engines, while
    only NDB wanted that. Users and trainers complained that it affected
    InnoDB and was a change compared to 4.1 where only NDB autocommitted.
    To revert to the behaviour of 4.1, we move the autocommit logic out of mysql_load() into
    ha_ndbcluster::external_lock().
    Note: even though there is no "commit the full transaction at end"
    anymore, LOAD DATA INFILE stays disabled in routines (re-entrency
    problems per a comment of Pem).
    Note: ha_ndbcluster::has_transactions() does not give reliable results
    because it says "yes" even if transactions are disabled in this engine...
  This is a cset against 5.1 which I'll now instead commit in 5.0
  with the needed approvals.
[5 Sep 2006 14:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/11407

ChangeSet@1.2259, 2006-09-05 16:52:05+02:00, guilhem@gbichot3.local +10 -0
  Fix for BUG#11151 "LOAD DATA INFILE commits transaction in 5.0".
  In 5.0 we made LOAD DATA INFILE autocommit in all engines, while
  only NDB wanted that. Users and trainers complained that it affected
  InnoDB and was a change compared to 4.1 where only NDB autocommitted.
  To revert to the behaviour of 4.1, we move the autocommit logic out of mysql_load() into
  ha_ndbcluster::external_lock().
  The result is that LOAD DATA INFILE commits all uncommitted changes
  of NDB if this is an NDB table, its own changes if this is an NDB
  table, but does not affect other engines.
  Note: even though there is no "commit the full transaction at end"
  anymore, LOAD DATA INFILE stays disabled in routines (re-entrency
  problems per a comment of Pem).
  Note: ha_ndbcluster::has_transactions() does not give reliable results
  because it says "yes" even if transactions are disabled in this engine...
[18 Sep 2006 21:35] Timothy Smith
Pushed to 5.0.26 and 5.1.12
[7 Oct 2006 6:24] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

I have updated the following sections of the Manual:

5.0/5.1 Cluster Limitations
5.0.x/5.0.26 & 5.1.x/5.1.12 ChangeLogs
5.0/5.1 InnoDB Statements Causing Implicit Commits   
5.0/5.1 [General] Statements Causing Implicit Commits

where I have stated the following (with modifications as appropriate to each section):

'In 5.0 prior to 5.0.26, and in 5.1 prior to 5.1.12, LOAD DATA INFILE caused an implicit commit regardless of the storage engine used (including InnoDB). In 5.0 beginning with 5.0.26, and in 5.1 beginning with 5.1.12, this statement now causes an implicit commit only for tables using the NDB storage engine.'

As this didn't affect 4.1 or earlier versions, no changes have been made to the 4.1 Manual.

The Docs changeset is available here: http://lists.mysql.com/commits/13289

Please advise if any further changes relating to this issue are needed in the docs.

Thanks!
[7 Oct 2006 8:56] Guilhem Bichot
Jon,
what you write looks ok to me.