Bug #29752 Linefeeds break LOAD XML INFILE
Submitted: 12 Jul 2007 10:32 Modified: 27 Nov 2007 17:23
Reporter: Jon Stephens
Status: Closed
Category:Server: XML Severity:S3 (Non-critical)
Version:5.2.5-bk OS:Linux (32-bit)
Assigned to: Alexander Barkov Target Version:
Tags: load infile, XML

[12 Jul 2007 10:32] Jon Stephens
Description:
When running the second test as shown at http://lists.mysql.com/internals/34170:

Content of person2.xml:

<?xml version="1.0"?>

<list>

<person 
person_id='3'><fname>Björn</fname><lname>Bergman</lname></person>

<person 
person_id='4'><fname>Sture</fname><lname>Larsson</lname></person>

</list>

Result:

mysql> load xml local infile '/home/jon/person2.xml' into table test.person rows
identified by '<person>';
Query OK, 1 row affected, 2 warnings (0.01 sec)
Records: 2  Deleted: 0  Skipped: 1  Warnings: 2

mysql> show warnings;
+---------+------+------------------------------------------------------------------------------------+
| Level   | Code | Message                                                               
            |
+---------+------+------------------------------------------------------------------------------------+
| Warning | 1263 | Column set to default value; NULL supplied to NOT NULL column
'person_id' at row 1 |
| Warning | 1263 | Column set to default value; NULL supplied to NOT NULL column
'person_id' at row 2 |
+---------+------+------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

mysql> select * from test.person;
+-----------+--------+------------+---------------------+
| person_id | fname  | lname      | created             |
+-----------+--------+------------+---------------------+
|         1 | Erik   | Wetterberg | 2007-07-12 10:15:49 |
|         2 | Sven   | Svensson   | 2007-07-12 10:15:49 |
|         0 | Björn | Bergman    | 2007-07-12 10:17:36 |
+-----------+--------+------------+---------------------+
3 rows in set (0.00 sec)

Now, remove the linefeeds, so that the content of person2.xml is as shown here:

<?xml version="1.0"?>

<list>

<person person_id='3'><fname>Björn</fname><lname>Bergman</lname></person>

<person person_id='4'><fname>Sture</fname><lname>Larsson</lname></person>

</list>

Delete the row whose person_id is 0:

mysql> delete from person where person_id=0;
Query OK, 1 row affected (0.00 sec)

mysql> select * from test.person;
+-----------+-------+------------+---------------------+
| person_id | fname | lname      | created             |
+-----------+-------+------------+---------------------+
|         1 | Erik  | Wetterberg | 2007-07-12 10:15:49 |
|         2 | Sven  | Svensson   | 2007-07-12 10:15:49 |
+-----------+-------+------------+---------------------+
2 rows in set (0.00 sec)

Now rerun the LOAD XML INFILE statement; the result is correct, as shown:

mysql> load xml local infile '/home/jon/person2.xml' into table test.person rows
identified by '<person>';
Query OK, 2 rows affected (0.00 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 0

mysql> select * from test.person;
+-----------+--------+------------+---------------------+
| person_id | fname  | lname      | created             |
+-----------+--------+------------+---------------------+
|         1 | Erik   | Wetterberg | 2007-07-12 10:15:49 |
|         2 | Sven   | Svensson   | 2007-07-12 10:15:49 |
|         3 | Björn | Bergman    | 2007-07-12 10:19:57 |
|         4 | Sture  | Larsson    | 2007-07-12 10:19:57 |
+-----------+--------+------------+---------------------+
4 rows in set (0.00 sec)

mysql> select version(), now();
+-------------------+---------------------+
| version()         | now()               |
+-------------------+---------------------+
| 5.2.5-alpha-debug | 2007-07-12 10:30:43 |
+-------------------+---------------------+
1 row in set (0.00 sec)

How to repeat:
See description.

Suggested fix:
Linefeeds in the middle of tags should not cause the tags to be treated as closed.
[19 Jul 2007 11:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31155

ChangeSet@1.2537, 2007-07-19 14:09:00+05:00, bar@mysql.com +3 -0
  Bug#29752 Linefeeds break LOAD XML INFILE
  Problem: Linefeed, newline and tab characters where
  not considered as separators. Only space character
  worked as separator.
  Fix: convert linefeed, newline and tab characters to space
  when loading the next character from the input stream.
  This is safe, because these characters are equal from the
  point of view of XML.
[19 Jul 2007 12:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31162

ChangeSet@1.2537, 2007-07-19 15:27:49+05:00, bar@mysql.com +3 -0
  Bug#29752 Linefeeds break LOAD XML INFILE
  Problem: Linefeed, newline and tab characters where
  not considered as separators. Only space character
  worked as separator.
  Fix: convert linefeed, newline and tab characters to space
  when loading the next character from the input stream.
  This is safe, because these characters are equal from the
  point of view of XML.
[19 Jul 2007 13:03] Alexey Botchkov
ok to push
[19 Jul 2007 13:09] Alexander Barkov
Pushed into 5.2.5-rpl
[27 Nov 2007 11:53] Bugs System
Pushed into 6.0.4-alpha
[27 Nov 2007 17:23] Jon Stephens
Documented bugfix in 6.0.4 changelog.