Bug #14778 NULL character not recognized when at EOL with LOAD DATA
Submitted: 9 Nov 2005 4:18 Modified: 12 Dec 2005 17:30
Reporter: Raymond DeRoo Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.15 OS:Windows (Windows XP)
Assigned to: Sergey Vlasenko CPU Architecture:Any

[9 Nov 2005 4:18] Raymond DeRoo
Description:
When the NULL character, \N, in a text file appears as the last element it is not treated as a NULL. It is loaded as the default data type for the column.

This problem does not appear in either Linux or OS X versions.

How to repeat:
mysql> CREATE TEMPORARY TABLE tmp ( d DATE, e SMALLINT);
Query OK, 0 rows affected (0.06 sec)

mysql> LOAD DATA INFILE 'c:\\test.txt' INTO TABLE tmp FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 LINES;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 0

mysql> SELECT * FROM tmp;
+------------+------+
| d          | e    |
+------------+------+
| 2005-10-31 |    0 |
| NULL       |    2 |
+------------+------+
2 rows in set (0.00 sec)

----- test.txt -----
d,e
2005-10-31,\N
\N,2

Suggested fix:
Process all NULL characters for LOAD DATA accordingly
[14 Nov 2005 15:49] Sergey Vlasenko
Checked suggested scenario on current 5.0 bk tree (5.0.17).
behavior is different:
1. when importing
==
d,e
2005-10-31,
,2
==
result is
+------------+------+
| d          | e    |
+------------+------+
| 2005-10-31 |    0 |
| 0000-00-00 |    2 |
+------------+------+

2. when importing
==
d,e
2005-10-31,\N
\N,2
==

result is:
+------------+------+
| d          | e    |
+------------+------+
| 2005-10-31 | NULL |
| NULL       |    2 |
+------------+------+

Initial table definition is:
CREATE TEMPORARY TABLE tmp ( d DATE default null, e SMALLINT default null);
[23 Nov 2005 4:22] Lachlan Mulcahy
I tested this in 5.0.16 release and had the same result as Sergey; it would seem this has since been fixed in 5.0.16.
[12 Dec 2005 15:48] Sergey Vlasenko
The problem is reproduced only if text file to be imported is in windows format (\r\n delimits lines) in this case the value of field e is treated as NULL symbol plus CR character, which is not equal to NULL, and so is converted to 0 on import.
the same behavior you can see when you specify '\Nsomesymbols' for one of fields in import file.

There are 2 solutions for it:
1. use LINES TERMINATED BY '\r\n' if import file is in windows format
2. convert import file always to Unix format (\n is line delimiter)