Bug #29442 SELECT INTO OUTFILE FIELDS ENCLOSED BY '0' corrupts numeric fields
Submitted: 29 Jun 2007 10:37 Modified: 9 Jul 2007 1:10
Reporter: Gleb Shchepa Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:4.1, 5.0, 5.1 OS:Any
Assigned to: Gleb Shchepa CPU Architecture:Any

[29 Jun 2007 10:37] Gleb Shchepa
Description:
The SELECT INTO OUTFILE FIELDS ENCLOSED BY '0' statement doesn't escape a zero characters in the text representation of numeric constants. So, a data file will be corrupted.

Example:

  SELECT 10, 9 INTO OUTFILE 't.txt' FIELDS ENCLOSED BY '0'

produces the 't.txt' file containing:

  0100 090

Next call to the LOAD DATA INFILE FIELDS ENCLOSED BY '0' statement interprets this 't.txt' file byte by byte as:

1. 0 --> ENCLOSED BY symbol, opening the field;
2. 1 --> '1' digit;
3. 00 --> doubled ENCLOSED BY character, will be replaced with single '0' digit;
4. tabulation --> trailing space of the field;
5. 0 --> ENCLOSED BY character, end of the field;
6. 9 --> unexpected character (tabulation is omitted);
7. 0 --> ENCLOSED BY character, opening the field;

Thus, second field of value 9 will be lost: loaded record will be equal to (10, NULL).

How to repeat:
mysql> SELECT VERSION();
+--------------+
| VERSION()    |
+--------------+
| 5.0.46-debug | 
+--------------+
1 row in set (0.08 sec)

mysql> CREATE TABLE t1 (c1 INT, c2 INT);
Query OK, 0 rows affected (0.04 sec)

mysql> CREATE TABLE t2 (id INT, c2 INT);
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t1 (c1, c2) VALUES (10, 9);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM t1;
+------+------+
| c1   | c2   |
+------+------+
|   10 |    9 | 
+------+------+
1 row in set (0.00 sec)

mysql> SELECT * INTO OUTFILE 't1.txt' FIELDS ENCLOSED BY '0' FROM t1;
Query OK, 1 row affected (0.00 sec)

mysql> LOAD DATA INFILE 't1.txt' INTO TABLE t2 FIELDS ENCLOSED BY '0';
Query OK, 1 row affected, 2 warnings (0.00 sec)
Records: 1  Deleted: 0  Skipped: 0  Warnings: 2

mysql> SHOW WARNINGS;
+---------+------+--------------------------------------------+
| Level   | Code | Message                                    |
+---------+------+--------------------------------------------+
| Warning | 1265 | Data truncated for column 'id' at row 1    | 
| Warning | 1261 | Row 1 doesn't contain data for all columns | 
+---------+------+--------------------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM t2;
+------+------+
| id   | c2   |
+------+------+
|   10 | NULL | 
+------+------+
1 row in set (0.00 sec)

Suggested fix:
In the presence of the FIELDS ENCLOSED BY '0' clause text representation of numeric, data etc values should be encoded like string values.
[29 Jun 2007 10:40] Gleb Shchepa
test case

Attachment: 29442.test (application/octet-stream, text), 638 bytes.

[29 Jun 2007 13:28] Sveta Smirnova
Thank you for the report.

Verified as described.
[5 Jul 2007 16:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30395

ChangeSet@1.2518, 2007-07-05 21:31:57+05:00, gshchepa@gleb.loc +3 -0
  Fixed bug #29442.
  The SELECT INTO OUTFILE FIELDS ENCLOSED BY digit or minus sign,
  followed by the same LOAD DATA INFILE statement, corrupted non-string
  fields contained the enclosed character in their text representation.
[5 Jul 2007 22:10] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30414

ChangeSet@1.2518, 2007-07-06 03:14:32+05:00, gshchepa@gleb.loc +4 -0
  Fixed bug #29442.
  The SELECT INTO OUTFILE FIELDS ENCLOSED BY digit or minus sign,
  followed by the same LOAD DATA INFILE statement, used wrond encoding
  of non-string fields contained the enclosed character in their text
  representation.
  
  Example:
    SELECT 15, 9 INTO OUTFILE 'text' FIELDS ENCLOSED BY '5';
  
  Old encoding result in the text file:
    5155 595
           ^ was decoded as the 1st enclosing character of the 2nd field;
          ^ was skipped as a garbage;
    ^    ^ was decoded as a pair of englosing characters of the 1st field;
        ^   was decoded as traling space of the first field;
      ^^ was decoded as a doubled enclosed character.
  
  New encoding result in the text file:
    51\55 595
    ^   ^ pair of enclosing characters of the 1st field;
      ^^ escaped enclosed character.
[5 Jul 2007 22:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/30415

ChangeSet@1.2518, 2007-07-06 03:43:23+05:00, gshchepa@gleb.loc +4 -0
  Fixed bug #29442.
  The SELECT INTO OUTFILE FIELDS ENCLOSED BY digit or minus sign,
  followed by the same LOAD DATA INFILE statement, used wrond encoding
  of non-string fields contained the enclosed character in their text
  representation.
  
  Example:
    SELECT 15, 9 INTO OUTFILE 'text' FIELDS ENCLOSED BY '5';
  
  Old encoded result in the text file:
    5155 595
           ^ was decoded as the 1st enclosing character of the 2nd field;
          ^ was skipped as garbage;
    ^    ^ was decoded as a pair of englosing characters of the 1st field;
        ^   was decoded as traling space of the first field;
      ^^ was decoded as a doubled enclosed character.
  
  New encoded result in the text file:
    51\55 595
    ^   ^ pair of enclosing characters of the 1st field;
      ^^ escaped enclosed character.
[8 Jul 2007 17:28] Bugs System
Pushed into 5.1.21-beta
[8 Jul 2007 17:30] Bugs System
Pushed into 5.0.46
[9 Jul 2007 1:10] Paul DuBois
Noted in 5.0.46, 5.1.21 changelogs.

Use of SELECT ... INTO OUTFILE 'file_name' FIELDS ENCLOSED BY 'c',
where c is a digit or minus sign, followed by LOAD DATA INFILE
'file_name FIELDS ENCLOSED BY 'c', resulted in corrupt data.