Bug #67541 LOAD XML cannot load data from mysqldump --xml
Submitted: 9 Nov 2012 17:30 Modified: 9 Nov 2012 18:04
Reporter: Kolbe Kegel Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:5.5.28 OS:Any
Assigned to: CPU Architecture:Any
Tags: mysqldump, XML

[9 Nov 2012 17:30] Kolbe Kegel
Description:
The LOAD XML statement (https://dev.mysql.com/doc/refman/5.5/en/load-xml.html) is described as "the complement of running the mysql client in XML output mode". Unfortunately, it fails miserable for empty strings and NULL columns. It is impossible to use LOAD XML to reliably re-load the output of mysql --xml *or* mysqldump --xml

How to repeat:
DROP TABLE IF EXISTS xml;
CREATE TABLE `xml` (
  `col1` char(3) DEFAULT NULL,
  `col2` char(3) DEFAULT NULL,
  `col3` char(3) DEFAULT NULL
) ENGINE=InnoDB;

INSERT INTO `xml` VALUES ('abc','def','ghi'),('jkl','','mno'),('pqr',NULL,'stu');
SELECT * FROM xml;

#shell mysql --xml -e 'select * from xml' | tee /tmp/dump.xml

truncate xml;
load xml infile '/tmp/dump.xml' into table xml;
select * from xml;

mysql 5.5.28-log (root) [test]> CREATE TABLE `xml` (
    ->   `col1` char(3) DEFAULT NULL,
    ->   `col2` char(3) DEFAULT NULL,
    ->   `col3` char(3) DEFAULT NULL
    -> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.10 sec)

mysql 5.5.28-log (root) [test]> INSERT INTO `xml` VALUES ('abc','def','ghi'),('jkl','','mno'),('pqr',NULL,'stu');
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql 5.5.28-log (root) [test]> SELECT * FROM xml;
+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
| abc  | def  | ghi  |
| jkl  |      | mno  |
| pqr  | NULL | stu  |
+------+------+------+
3 rows in set (0.00 sec)

kolbe@prosimmon 5.5 $ mysql --xml -e 'select * from xml' | tee /tmp/dump.xml 
<?xml version="1.0"?>

<resultset statement="select * from xml
" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <row>
        <field name="col1">abc</field>
        <field name="col2">def</field>
        <field name="col3">ghi</field>
  </row>

  <row>
        <field name="col1">jkl</field>
        <field name="col2"></field>
        <field name="col3">mno</field>
  </row>

  <row>
        <field name="col1">pqr</field>
        <field name="col2" xsi:nil="true" />
        <field name="col3">stu</field>
  </row>
</resultset>

mysql 5.5.28-log (root) [test]> truncate xml;
Query OK, 0 rows affected (0.01 sec)

mysql 5.5.28-log (root) [test]> load xml infile '/tmp/dump.xml' into table xml;
Query OK, 3 rows affected (0.00 sec)
Records: 3  Deleted: 0  Skipped: 0  Warnings: 0

mysql 5.5.28-log (root) [test]> select * from xml;
+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
| abc  | def  | ghi  |
| jkl  | NULL | mno  |
| pqr  | stu  | NULL |
+------+------+------+
3 rows in set (0.00 sec)

Suggested fix:
The LOAD XML statement must be fixed to correctly handle these things:

o Empty strings
o NULL values
o All permutations of output from mysql --xml and mysqldump --xml
[9 Nov 2012 17:40] Kolbe Kegel
General failure of LOAD XML to properly handle empty elements (<element/>) filed separately as bug #67542.
[9 Nov 2012 18:04] Sveta Smirnova
Thank you for the report.

Verified as described.
[9 Jan 2013 17:37] John Thur
Issue with Empty Tags

LOAD  XML CONCURRENT LOCAL INFILE 'filename.xml' REPLACE  INTO TABLE TableName ROWS IDENTIFIED BY '</RecordTagName>'

<FiledTagName></FieldTagName>

If a tag is empty will load tag from a previous record (not the prior with data) I believe this is a buffer issue because other record tags with a different value for fieldname exist after the bad data that is being inserted and the record for which it is inserted.