Bug #70106 mysqlimport fails when the replace option is used and a column list
Submitted: 21 Aug 2013 13:53 Modified: 22 Aug 2013 18:12
Reporter: Philip Orleans Email Updates:
Status: Verified Impact on me:
Category:MySQL Server: Command-line Clients Severity:S3 (Non-critical)
Version:5.7.1-m11, 5.7.2-m12 OS:Any
Assigned to: CPU Architecture:Any
Tags: load data infile mysqlimport

[21 Aug 2013 13:53] Philip Orleans
This command fails, but it should not:

mysqlimport --local --compress --ignore-lines=0 --replace  --fields-terminated-by=',' --lines-terminated-by='\n'  lrn /usr/src/lrn.txt --columns=did,lrn,ocn,grtype
mysqlimport: Error: 1262, Row 1 was truncated; it contained more data than there were input columns, when using table: lrn

The problem is that the table has an extra column. if I add that column to the column-list, it works. But that is wrong, since my file needs to contain the extra information, and it does, but the meaning is not applicable, so it should not be inserted, it should be ignored. The extra columns not mentioned should fetch their defaults or be ignored.
The "replace" should use only the column-list supplied, and ignore anything else.

How to repeat:
  `did` varchar(10) NOT NULL DEFAULT '',
  `lrn` varchar(10) DEFAULT NULL,
  `ocn` varchar(5) DEFAULT NULL,
  `grtype` int(11) NOT NULL,
  `dr` smallint(6) NOT NULL DEFAULT '3',
  PRIMARY KEY (`did`)

cat lrn.txt

cd /usr/src
mysqlimport --local --compress --ignore-lines=0 --replace  --fields-terminated-by=',' --lines-terminated-by='\n'  lrn /usr/src/lrn.txt --columns=did,lrn,ocn,grtype

Suggested fix:
please construct the "update" corresponding to "--replace" using only the column list supplied and ignore the extra columns in the file.

This is a very important bug for any model that entails using hundreds of millions of rows in a large distributed environment, since the speed of "load data infile" is unmatched. By the way, before you ask, the bug is not in mysqlimport, but in the core mysql.
[22 Aug 2013 18:12] Umesh Shastry
Hello Philip,

Thank you for the report.
Verified as described.

[22 Aug 2013 18:18] Umesh Shastry

Other than adding actual column names etc..another way is to add @variable i.e --columns=did,lrn,ocn,grtype.. and add @variableN in order to skip those columns you do not need and fetch their defaults...

[ushastry@cluster-repo mysql-5.7.2-m12]$
[ushastry@cluster-repo mysql-5.7.2-m12]$ bin/mysqlimport -uroot -p --local --compress --ignore-lines=0 --replace  --fields-terminated-by=',' --lines-terminated-by='\n' test /tmp/lrn.txt --columns=did,lrn,ocn,grtype,@variable1
Enter password:
test.lrn: Records: 2  Deleted: 0  Skipped: 0  Warnings: 0
[ushastry@cluster-repo mysql-5.7.2-m12]$


mysql> select * from lrn;
| did        | lrn        | ocn  | grtype | dr |
| 2012000002 | 2018446199 | 8824 |      1 |  3 |
| 2012000004 | 7327919933 | 7058 |      0 |  3 |
2 rows in set (0.00 sec)

Imho - mysqlimport client provides a command-line interface to the LOAD DATA INFILE SQL statement and hence it should also have some kind of "SET" option which can help avoid this..