Bug #29752 Linefeeds break LOAD XML INFILE
Submitted: 12 Jul 2007 8:32 Modified: 27 Nov 2007 16:23
Reporter: Jon Stephens Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: XML functions Severity:S3 (Non-critical)
Version:5.2.5-bk OS:Linux (32-bit)
Assigned to: Alexander Barkov CPU Architecture:Any
Tags: load infile, XML

[12 Jul 2007 8:32] Jon Stephens
Description:
When running the second test as shown at http://lists.mysql.com/internals/34170:

Content of person2.xml:

<?xml version="1.0"?>

<list>

<person 
person_id='3'><fname>Björn</fname><lname>Bergman</lname></person>

<person 
person_id='4'><fname>Sture</fname><lname>Larsson</lname></person>

</list>

Result:

mysql> load xml local infile '/home/jon/person2.xml' into table test.person rows identified by '<person>';
Query OK, 1 row affected, 2 warnings (0.01 sec)
Records: 2  Deleted: 0  Skipped: 1  Warnings: 2

mysql> show warnings;
+---------+------+------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                            |
+---------+------+------------------------------------------------------------------------------------+
| Warning | 1263 | Column set to default value; NULL supplied to NOT NULL column 'person_id' at row 1 |
| Warning | 1263 | Column set to default value; NULL supplied to NOT NULL column 'person_id' at row 2 |
+---------+------+------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

mysql> select * from test.person;
+-----------+--------+------------+---------------------+
| person_id | fname  | lname      | created             |
+-----------+--------+------------+---------------------+
|         1 | Erik   | Wetterberg | 2007-07-12 10:15:49 |
|         2 | Sven   | Svensson   | 2007-07-12 10:15:49 |
|         0 | Björn | Bergman    | 2007-07-12 10:17:36 |
+-----------+--------+------------+---------------------+
3 rows in set (0.00 sec)

Now, remove the linefeeds, so that the content of person2.xml is as shown here:

<?xml version="1.0"?>

<list>

<person person_id='3'><fname>Björn</fname><lname>Bergman</lname></person>

<person person_id='4'><fname>Sture</fname><lname>Larsson</lname></person>

</list>

Delete the row whose person_id is 0:

mysql> delete from person where person_id=0;
Query OK, 1 row affected (0.00 sec)

mysql> select * from test.person;
+-----------+-------+------------+---------------------+
| person_id | fname | lname      | created             |
+-----------+-------+------------+---------------------+
|         1 | Erik  | Wetterberg | 2007-07-12 10:15:49 |
|         2 | Sven  | Svensson   | 2007-07-12 10:15:49 |
+-----------+-------+------------+---------------------+
2 rows in set (0.00 sec)

Now rerun the LOAD XML INFILE statement; the result is correct, as shown:

mysql> load xml local infile '/home/jon/person2.xml' into table test.person rows identified by '<person>';
Query OK, 2 rows affected (0.00 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 0

mysql> select * from test.person;
+-----------+--------+------------+---------------------+
| person_id | fname  | lname      | created             |
+-----------+--------+------------+---------------------+
|         1 | Erik   | Wetterberg | 2007-07-12 10:15:49 |
|         2 | Sven   | Svensson   | 2007-07-12 10:15:49 |
|         3 | Björn | Bergman    | 2007-07-12 10:19:57 |
|         4 | Sture  | Larsson    | 2007-07-12 10:19:57 |
+-----------+--------+------------+---------------------+
4 rows in set (0.00 sec)

mysql> select version(), now();
+-------------------+---------------------+
| version()         | now()               |
+-------------------+---------------------+
| 5.2.5-alpha-debug | 2007-07-12 10:30:43 |
+-------------------+---------------------+
1 row in set (0.00 sec)

How to repeat:
See description.

Suggested fix:
Linefeeds in the middle of tags should not cause the tags to be treated as closed.
[19 Jul 2007 9:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31155

ChangeSet@1.2537, 2007-07-19 14:09:00+05:00, bar@mysql.com +3 -0
  Bug#29752 Linefeeds break LOAD XML INFILE
  Problem: Linefeed, newline and tab characters where
  not considered as separators. Only space character
  worked as separator.
  Fix: convert linefeed, newline and tab characters to space
  when loading the next character from the input stream.
  This is safe, because these characters are equal from the
  point of view of XML.
[19 Jul 2007 10:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31162

ChangeSet@1.2537, 2007-07-19 15:27:49+05:00, bar@mysql.com +3 -0
  Bug#29752 Linefeeds break LOAD XML INFILE
  Problem: Linefeed, newline and tab characters where
  not considered as separators. Only space character
  worked as separator.
  Fix: convert linefeed, newline and tab characters to space
  when loading the next character from the input stream.
  This is safe, because these characters are equal from the
  point of view of XML.
[19 Jul 2007 11:03] Alexey Botchkov
ok to push
[19 Jul 2007 11:09] Alexander Barkov
Pushed into 5.2.5-rpl
[27 Nov 2007 10:53] Bugs System
Pushed into 6.0.4-alpha
[27 Nov 2007 16:23] Jon Stephens
Documented bugfix in 6.0.4 changelog.