Bug #38091 | LOAD DATA LOCAL INFILE using source column references loads sequential columns | ||
---|---|---|---|
Submitted: | 14 Jul 2008 5:14 | Modified: | 7 Aug 2008 4:19 |
Reporter: | Rusty Weise | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: DML | Severity: | S3 (Non-critical) |
Version: | 5.1.23 | OS: | Windows (Vista Enterprise) |
Assigned to: | CPU Architecture: | Any |
[14 Jul 2008 5:14]
Rusty Weise
[14 Jul 2008 9:02]
Susanne Ebrecht
Many thanks for reporting a bug. Did you set sql_mode=ANSI_QUOTES before executing the query? If not, please give it a try and let us know if this will solve your problem.
[17 Jul 2008 12:58]
Sveta Smirnova
Thank you for the feedback. Please provide output of SHOW CREATE TABLE mytable and first 5 lines from your text file (would be better if you attach them to the report as a separate file).
[19 Jul 2008 20:35]
Rusty Weise
source data
Attachment: mydatabase.csv (application/vnd.ms-excel, text), 4.47 KiB.
[19 Jul 2008 20:36]
Rusty Weise
steps performed to create db, create table, set sql_mode, load data, and resultant set
Attachment: bug38091.txt (text/plain), 1.50 KiB.
[19 Jul 2008 20:39]
Rusty Weise
Note: in the source data, the number of the column in which the data is stored is embedded in the data: col#1,column #2,stat col #3,TableColumn4,column5,column6,col/field/number7,col8,col9,col10,col 11,col12,col 13 1,stat col2,300,41234,this is a var char 550,"varchar,610","another, varchar, 750",boring8,boring9,boring10,1100,more chars with spaces 12,char13 2,"stat,col2",333,41234,this is a var char 551,"varchar,611","another, varchar, 751",boring8,boring9,boring10,1101,more chars with spaces 12,char13
[21 Jul 2008 21:06]
Sveta Smirnova
Thank you for the feedback. If I understood you correctly you want 1 row from the attached csv to be loades as: mysql> select * from mytable limit 1\G *************************** 1. row *************************** col#1: 1 column #2: stat col2 status column #3: 300 TableColumn4: 0 column5: NULL column6: NULL col/field/number7: boring9 col8: NULL col9: NULL col10: boring10 col 11: 1100 col12: more chars with spaces 12 col 13: char13 1 row in set (0.00 sec) But you get results like: mysql> select * from mytable limit 1\G*************************** 1. row *************************** col#1: 1 column #2: stat col2 status column #3: 300 TableColumn4: 41234 column5: NULL column6: NULL col/field/number7: this is a var char 550 col8: NULL col9: NULL col10: varchar,610 col 11: 0 col12: boring8 col 13: boring9 1 row in set (0.01 sec) If this is your concern this is not a bug: see http://dev.mysql.com/doc/refman/5.1/en/load-data.html for details how mysqld uses list of columns. If I understood you in a wrong way, please provide expected and real results of the query.
[22 Jul 2008 5:27]
Rusty Weise
Oh, bummer, I blew it on the table create by including all the columns of the source data *doah*. You can see that the source data has 13 columns. The table has only 9, and they are referenced by column name in the LOAD DATA LOCAL INFILE query. Of _course_ it will stuff NULLs in the missing data, and it appears that the column-referenced LOAD worked as designed in my previous example. Here, I want to do an ETL from 13 fields to 9. I will attach a file with the _correct_ process of creating the toy db, table, performing the query, and resultant set. You will notice that columns 1 - 9 are used in sequential fashion, as opposed to the columns that I specified in the query, 1-4,7,and 10-13.
[22 Jul 2008 5:28]
Rusty Weise
Create db, table, populate wtih LOAD DATA query, and query for resultant set.
Attachment: bug38091_correct_table.txt (text/plain), 1.28 KiB.
[31 Jul 2008 19:39]
Sveta Smirnova
Thank you for the feedback. You have lines like following: 1, 'stat col2', 300, 41234, 'this is a var char 550', 'varchar,610', 0, 'boring8', 'boring9' Notice fields are really terminated by ", " (comma and space) and not "," (comma), so mysqld ignores quotes inside fields.
[7 Aug 2008 4:19]
Rusty Weise
I continue to be unclear, and considering this is now marked as "Not a Bug", I don't expect to achieve resolution to the original problem statement. The data, to which you are referring, is simply a copy of row data from the MySQL Query Browser resultset when I did a, "select * from mytable limit 5"; the source data is well formatted with comma delimited values with optional double-quotes. In as clear of a description as I am able to muster, the problem statement is: My source data is a comma separated value (csv) file with optional double-quote field data modifiers and is comprised of 50 unique columns. The column names are defined with identifiers that contain spaces, reserved characters, and reserved words. My destination table is comprised of a subset of those 50 columns; some of which contains special characters, spaces, and reserved words; but match the source column identifiers verbatim. The order of appearance of the columns in the table does not match the order of appearance in the source csv. Because the order of the source column data is not predictable, I wish to use the "LOAD DATA LOCAL INFILE" command and specify the column names in the source table, which match those in the destination table _identically_ in order to update the destination table by column name reference. The "LOAD DATA LOCAL INFILE" command appears to only load the data in the cells in order of presentation in the source csv. That is to say, if there were 5 columns in the source data and I specified the name of the odd columns to load, I would instead only get the first through third column data regardless of the name of the columns. I don't want to be redundant any more than I already have here, so please refer to comment #0 where I explain the source data table design, and the desired goals. Please, explain how I can be any clearer as to what the root of the problem is and how I might work around this problem using native MySQL commands.